SYNTHETIC SAMPLE GENERATION OF THE 4S STUDY PLACEBO POPULATION USING A STOCHASTIC RESAMPLING TECHNIQUE
Author(s)
Martin C¹, Springate C²
¹Crystallise, East Tilbury, UK; ²Crystallise, East Tilbury, ESS, UK
OBJECTIVES: To generate a synthetic sample of individuals whose mean characteristics reflect those of a population in clinical research. METHODS: We developed a stochastic resampling technique in R to generate a semi-random sample of people with characteristics that match those of the control group in the Scandinavian Simvastatin Survival Study (4S). The sample was matched on binary variables (gender, age, smoking status, diabetes) and on continuous variables (BMI, systolic blood pressure, ratio of total cholesterol (TC) to HDL cholesterol, number of cigarettes smoked per day and units of alcohol consumed per week). The descriptive statistics generated for the synthetic sample matched those of the target sample to 2 decimal places. RESULTS: The algorithm successfully generated a sample of 2,222 individuals with characteristics closely matching those of the 4S study control group. The only notable difference in the data summary was the range of TC, which was 5.01 to 25 in the 4S study control group but 5.01 to 12 in the synthetic sample. The samples were well matched for all continuous variables. The average values reported in the 4S study for BMI, systolic BP, TC, and HDL were 26.0 (SD = 3.3), 139.1 (SD = 19.6), 6.7 (SD = 0.7), and 1.4 (SD = 0.3), respectively. In the synthetic sample, the corresponding values were 26.0 (SD = 4.2, 95% CI [17.84, 34.18]), 139.1 (SD = 20.1, 95% CI [99.62, 178.52]), 6.7 (SD = 1.1, 95% CI [4.54, 8.94]), and 1.4 (SD = 0.3, 95% CI [0.75, 2.06]), respectively. CONCLUSIONS: This new method successfully generated synthetic samples that are comparable to the originals in aggregate. These synthetic samples can be used to model the likely impact of new therapies or to predict mortality for various subgroups, and will be a useful tool in the planning and preparation of clinical trials.
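The abstract does not describe the resampling algorithm itself, so the following is only a minimal sketch of one way such a technique could work for a single continuous variable. It is written in Python rather than the authors' R. The function name `synth_variable`, the truncated normal proposal distribution, the greedy acceptance rule and the tolerance are all assumptions made for illustration. Only the target values come from the abstract: the TC mean and SD reported for 4S (6.7 and 0.7), the synthetic TC range (5.01 to 12) and the sample size (2,222).

```python
import math
import random
import statistics


def synth_variable(n, target_mean, target_sd, lo, hi,
                   tol=0.005, max_iter=200_000, seed=0):
    """Illustrative hill-climbing resampler (an assumption, not the authors' method).

    Draws n bounded values, then repeatedly proposes replacing one randomly
    chosen value with a fresh draw, keeping the replacement only when it moves
    the sample mean and SD closer to their targets.
    """
    rng = random.Random(seed)

    def draw():
        # Rejection-sample a normal draw restricted to [lo, hi].
        while True:
            x = rng.gauss(target_mean, target_sd)
            if lo <= x <= hi:
                return x

    def stats(s, ss):
        # Mean and sample SD (n - 1 denominator) from running sums.
        mean = s / n
        var = max((ss - n * mean * mean) / (n - 1), 0.0)
        return mean, math.sqrt(var)

    def error(s, ss):
        mean, sd = stats(s, ss)
        return abs(mean - target_mean) + abs(sd - target_sd)

    values = [draw() for _ in range(n)]
    s = sum(values)                    # running sum of values
    ss = sum(v * v for v in values)    # running sum of squares
    err = error(s, ss)

    for _ in range(max_iter):
        mean, sd = stats(s, ss)
        # Stop once both statistics are within the tolerance of their targets.
        if abs(mean - target_mean) < tol and abs(sd - target_sd) < tol:
            break
        i = rng.randrange(n)
        candidate = draw()
        new_s = s - values[i] + candidate
        new_ss = ss - values[i] ** 2 + candidate ** 2
        new_err = error(new_s, new_ss)
        if new_err < err:              # greedy acceptance: keep improvements only
            values[i], s, ss, err = candidate, new_s, new_ss, new_err
    return values


# Target values taken from the abstract (TC in the 4S control group).
tc = synth_variable(n=2222, target_mean=6.7, target_sd=0.7, lo=5.01, hi=12.0)
print(round(statistics.mean(tc), 2), round(statistics.stdev(tc), 2))
```

A tolerance of 0.005 corresponds to agreement at 2 decimal places, the level of matching the abstract reports. A fuller version would need to handle the binary variables and any dependence between variables, neither of which this sketch attempts.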
Conference/Value in Health Info
2018-11, ISPOR Europe 2018, Barcelona, Spain
Value in Health, Vol. 21, S3 (October 2018)
Code
PRM114
Topic
Methodological & Statistical Research
Topic Subcategory
Modeling and simulation
Disease
Multiple Diseases