Graduation Year


Document Type




Degree Name

Doctor of Public Health (Dr.PH.)

Degree Granting Department

Public Health

Major Professor

Henian Chen, Ph.D.

Committee Member

Wei Wang, Ph.D.

Committee Member

Yangxin Huang, Ph.D.

Committee Member

Feng Cheng, Ph.D.

Committee Member

Ellen M. Daley, Ph.D.


inference, non-random trial, propensity score, trial simulation


Although randomized controlled trials (RCTs) are widely regarded as the gold standard in clinical research and public health, they are criticized for a potential lack of generalizability, because trial patients may be unrepresentative of the target patient population. Little research addresses how to assess the generalizability of RCTs. Patients are rarely selected at random from a well-defined patient population of interest into a clinical trial, and generalizing findings from RCT samples to the patient population has begun to receive increasing attention. We simulate a patient population with a treatment effect size of 0.5 (Cohen’s d) and seven covariates: gender, health insurance, race, baseline symptoms, comorbidity, age, and motivation. We then compare six generalizability indexes (SMD: standardized mean difference; C-statistic; β-index; OVL: overlapping coefficient; KSD: Kolmogorov-Smirnov distance; LD: Lévy distance) on nonrandomly selected trials and randomly selected trials drawn from the patient population. Based on our results, we conclude that the β-index could serve as a reliable generalizability metric, and that the C-statistic and LD are also acceptable. When a trial is known to be biased, the question is how to make inferences from the biased sample to the patient population. We use four statistical approaches (IPSW: inverse probability-of-selection weights; EB: entropy balance; EVB: external validity bias; SC: subclassification estimator) to adjust for trial bias. Based on our results, EB should not be considered as a bias adjustment method, SC over-adjusts the bias, IPSW could be an option, and EVB appears more reliable than the other approaches when all the covariates are observed.
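To make the ideas of nonrandom trial selection, a generalizability index, and IPSW adjustment concrete, the following is a minimal Python sketch and not the dissertation's simulation. Everything in it is an illustrative assumption: the function name run_sketch, the use of only two covariates, the selection model, the way the individual effect varies with motivation, the use of the true selection probability in place of an estimated one, and the particular form of the SMD (difference in mean selection probability divided by the population standard deviation). The outcome is left on its raw scale, so the 0.5 here is an average effect rather than a calibrated Cohen's d.

```python
import numpy as np


def run_sketch(n=200_000, seed=0):
    """Illustrative only: nonrandom selection into a trial, one index, IPSW."""
    rng = np.random.default_rng(seed)

    # Hypothetical standardized covariates (the abstract lists seven).
    motivation = rng.normal(size=n)
    age = rng.normal(size=n)

    # Assumed individual treatment effect: population average is 0.5,
    # and it grows with motivation (this heterogeneity is an assumption).
    effect = 0.5 + 0.3 * motivation

    # Assumed selection model: more motivated patients enroll more often.
    p_select = 1.0 / (1.0 + np.exp(-(-3.0 + motivation)))
    in_trial = rng.random(n) < p_select

    # Treatment is randomized; it is drawn for everyone for simplicity,
    # but only trial members are used in the estimates below.
    treat = rng.integers(0, 2, size=n)
    y = 0.2 * age + effect * treat + rng.normal(size=n)

    # One possible SMD-style index on the selection probability:
    # trial mean minus population mean, scaled by the population SD.
    smd = (p_select[in_trial].mean() - p_select.mean()) / p_select.std()

    treated = in_trial & (treat == 1)
    control = in_trial & (treat == 0)

    # Unadjusted trial estimate: plain difference in arm means.
    naive = y[treated].mean() - y[control].mean()

    # IPSW estimate: weight each trial member by 1 / P(selection),
    # then take the difference in weighted arm means.
    w = 1.0 / p_select
    ipsw = (np.average(y[treated], weights=w[treated])
            - np.average(y[control], weights=w[control]))

    return {"smd": smd, "naive": naive, "ipsw": ipsw}


if __name__ == "__main__":
    for name, value in run_sketch().items():
        print(f"{name}: {value:.3f}")
```

In this sketch the unadjusted trial estimate drifts above the population average effect because enrolled patients are more motivated, while the weighted estimate moves back toward 0.5. In practice the selection probability is unknown and would have to be estimated from observed covariates, which is where unobserved covariates become a concern.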

Included in

Biostatistics Commons