How to simulate data to be statistically significant?
I am in 10th grade and I want to simulate data for a machine learning science fair project. The final model will be used on patient data and will predict how the time of week affects medication adherence within the data of a single patient. Adherence values will be binary (0 means they did not take the medicine, 1 means they did). I want to build a machine learning model that can learn the relationship between time of week and adherence. I have separated the week into 21 time slots, three for each day of the week (1 is Monday morning, 2 is Monday afternoon, etc.). I want to simulate 1,000 patients' worth of data. Each patient will have 30 weeks' worth of data. I want to insert certain trends associated with a time of week and adheren