**Figure S1.**
Systematically generated evidence from observational data.
Each dot represents a calibrated hazard ratio and confidence interval for a comparison of two
depression treatments with respect to an outcome of interest in one of the four databases. Use
the controls on the left to filter the result set. After selecting an estimate, details will be shown below.

**Table S1.1.**
Counts of subjects, person-days and outcomes in the target and comparator populations.

**Figure S1.1.**
Hazard ratios and confidence intervals (CI) across the databases, both
calibrated (top) and uncalibrated (bottom). Blue indicates that the CI includes one; orange indicates
that the CI does not include one.

**Figure S1.2.**
Preference score distribution. The preference score is a transformation of the propensity score
that adjusts for differences in the sizes of the two treatment groups. A higher overlap indicates subjects in the
two groups were more similar in terms of their predicted probability of receiving one treatment over the other.
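
The transformation described above can be sketched in a few lines of Python. This sketch assumes the commonly used definition of the preference score, in which the logit of the preference score equals the logit of the propensity score minus the logit of the overall proportion of subjects in the target group. The caption does not spell out the formula, so this is an assumption, and the function and parameter names are hypothetical.

```python
import math


def preference_score(propensity: float, target_proportion: float) -> float:
    """Convert a propensity score to a preference score.

    Assumed definition (not stated in the caption):
        logit(F) = logit(S) - logit(P)
    where S is the propensity score and P is the proportion of all
    subjects who received the target treatment.
    """
    for value in (propensity, target_proportion):
        if not 0.0 < value < 1.0:
            raise ValueError("inputs must lie strictly between 0 and 1")

    def logit(p: float) -> float:
        return math.log(p / (1.0 - p))

    # Shift the propensity score on the logit scale, then map back to (0, 1).
    shifted = logit(propensity) - logit(target_proportion)
    return 1.0 / (1.0 + math.exp(-shifted))
```

Under this definition, a subject whose propensity score equals the overall proportion treated with the target drug gets a preference score of 0.5, regardless of how unequal the two group sizes are.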

**Figure S1.3.**
Covariate balance before and after stratification.
Each dot represents the standardized difference in means for a single covariate before and after stratifying on the propensity score.
Hover over a dot for more details.
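
For readers unfamiliar with the metric, the following Python sketch shows one common way to compute a standardized difference in means for a single covariate. It assumes the usual definition (difference in group means divided by the square root of the average of the two group variances); the caption does not state which variant was used, and the function name is hypothetical.

```python
import math
from statistics import mean, variance


def standardized_difference(target: list[float], comparator: list[float]) -> float:
    """Standardized difference in means for one covariate.

    Assumed definition (not stated in the caption):
        (mean_target - mean_comparator) / sqrt((var_target + var_comparator) / 2)
    Each group needs at least two values so a sample variance exists.
    """
    pooled_sd = math.sqrt((variance(target) + variance(comparator)) / 2.0)
    if pooled_sd == 0.0:
        # No variation in either group: report zero when the means agree,
        # otherwise the difference is unbounded on this scale.
        return 0.0 if mean(target) == mean(comparator) else math.copysign(math.inf, mean(target) - mean(comparator))
    return (mean(target) - mean(comparator)) / pooled_sd
```

Values close to zero indicate that the covariate is distributed similarly in the two groups.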

**Figure S1.4.**
Hazard ratios and corresponding standard errors for our negative and positive controls. The estimates are stratified by the true hazard ratio.

**Figure S1.5.**
Hazard ratios and corresponding standard errors after empirical calibration for our negative and positive controls. The estimates are stratified by the true hazard ratio.

**Table S1.2.**
Counts of subjects, person-days and outcomes in the target and comparator populations.

**Figure S1.6.**
Hazard ratios and confidence intervals (CI) across the databases. Blue indicates that the CI includes one; orange indicates
that the CI does not include one.

**Figure S2.**
Evidence in literature.
Each dot represents an effect size and confidence interval or p-value as extracted from the scientific
literature. Use the controls on the left to filter the result set. After selecting an estimate, the
abstract will be shown below with the location of the estimate highlighted.

Supplementary data for:

Schuemie MJ, Ryan PB, Hripcsak G, Madigan D, Suchard MA.
*Improving reproducibility using high-throughput observational studies with empirical calibration.*
Phil. Trans. R. Soc. A, 2018.