
External Validity: Conclusion

Conclusion

In these blog posts, following Erin Hartman’s chapter “Generalizing Experimental Results” in Advances in Experimental Political Science, I have investigated how one might go about generalizing an experimentally identified causal effect to a target population. Throughout, I have adopted Hartman’s case study of a field experiment in Liberia, in which researchers run an RCT in villages in the north but are interested in estimating an effect for the entire country.

In terms of identification, the important takeaway for me was that it requires one to have a “valid adjustment set” of covariates, i.e. covariates which 1. separate the sampling indicator from the effect modifier in the DAG, 2. are measured among the individuals in the experimental sample, and 3. have a known distribution (or, better yet, are measured) among individuals in the target population.
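To make those three conditions concrete, here is a sketch of the identification result they buy you for a discrete adjustment set. The notation is my own shorthand rather than a quotation from the chapter: X is the adjustment set, S = 1 marks membership in the experimental sample, Y(1) and Y(0) are the potential outcomes, and P_target is the covariate distribution in the target population.

$$
\tau_{\text{target}} \;=\; \sum_{x} \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x,\ S = 1 \,\right] \, P_{\text{target}}(X = x)
$$

Condition 1 is what lets the stratum-specific effects in the sample stand in for those in the population, condition 2 lets you estimate the conditional expectation, and condition 3 supplies the weights P_target(X = x).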

In terms of estimation, I have simulated hypothetical data from the field study with discrete adjustment covariates and evaluated how an estimator based on direct standardization performs empirically. A more general weighted Horvitz-Thompson estimator is equivalent to the direct standardization estimator when the adjustment covariates are discrete but, unlike the latter, can also be implemented when the adjustment covariates are continuous. The weights for this estimator are estimated with a binary regression of the sampling indicator on the adjustment set, analogous to the estimation of propensity scores when you’re doing internal causal identification.
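As a reminder of what that equivalence looks like in practice, here is a minimal sketch with one binary adjustment covariate. It is not the simulation from the earlier posts: the population size, sampling probabilities, and effect sizes are made-up assumptions, and the variable names are hypothetical. Because the covariate is discrete, I compute the sampling weights from empirical stratum frequencies, which is what a saturated binary regression of the sampling indicator would give up to a constant.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical target population with one binary adjustment covariate
# (say, 1 = northern village) that also modifies the treatment effect.
N = 200_000
x_pop = rng.binomial(1, 0.3, size=N)

# Assumed sampling mechanism: northern villages are heavily oversampled.
p_sample = np.where(x_pop == 1, 0.05, 0.005)
s = rng.binomial(1, p_sample).astype(bool)

# Experimental sample: treatment is randomized, effects differ by stratum.
x = x_pop[s]
n = x.size
t = rng.binomial(1, 0.5, size=n)
tau_x = np.where(x == 1, 2.0, 0.5)  # assumed stratum-specific effects
y = 1.0 + 0.5 * x + tau_x * t + rng.normal(0.0, 1.0, size=n)

# Benchmarks: the true population effect and the unadjusted sample estimate.
true_pate = np.mean(np.where(x_pop == 1, 2.0, 0.5))
sate_naive = y[t == 1].mean() - y[t == 0].mean()

# Direct standardization: stratum-specific differences in means,
# averaged with the population shares of each stratum.
pop_share = np.array([np.mean(x_pop == k) for k in (0, 1)])
ds = 0.0
for k in (0, 1):
    in_k = x == k
    diff_k = y[in_k & (t == 1)].mean() - y[in_k & (t == 0)].mean()
    ds += pop_share[k] * diff_k

# Weighted Horvitz-Thompson style estimator: sampling weights are the ratio
# of population share to sample share, and treatment probabilities are the
# empirical treated fractions within each stratum.
samp_share = np.array([np.mean(x == k) for k in (0, 1)])
w = pop_share[x] / samp_share[x]
p_treat = np.array([t[x == k].mean() for k in (0, 1)])[x]
ht = np.mean(w * (t * y / p_treat - (1 - t) * y / (1 - p_treat)))

print(f"true PATE:              {true_pate:.3f}")
print(f"naive sample estimate:  {sate_naive:.3f}")
print(f"direct standardization: {ds:.3f}")
print(f"weighted estimator:     {ht:.3f}")
```

With empirical frequencies plugged in for both the sampling weights and the within-stratum treatment probabilities, the two adjusted estimates coincide up to floating-point error, while the unadjusted estimate is pulled toward the effect in the oversampled stratum. With continuous covariates the stratum loop is no longer available, and the weights would have to come from a fitted model instead.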

Finally, in terms of sensitivity analysis, I have referred to a framework which, in the presence of an unmeasured variable that both moderates the treatment effect and influences who’s included in the sample, allows one to estimate an upper bound on the bias of a Horvitz-Thompson weighted estimator that ignores this unmeasured variable when estimating the sampling weights. Sensitivity analysis will be crucial in any generalization effort since, except in special cases, it’s unlikely that researchers will be able to exhaustively list every variable that both moderates the effect and influences sampling, let alone know how those variables are distributed in the target population.