This paper discusses the estimation of treatment effects in observational studies. This issue, which is of great practical importance because randomized experiments cannot always be implemented, has been addressed previously by Lalonde (1986), whose data we use in this paper. Lalonde estimates the impact of the National Supported Work (NSW) Demonstration, a labor training program, on post-intervention income levels, using data from a randomized evaluation of the program. He then examines the extent to which non-experimental estimators can replicate the unbiased experimental estimate of the treatment impact, when applied to a composite data set of experimental treatment units and non-experimental comparison units. He concludes that standard non-experimental estimators, such as regression, fixed-effect, and latent-variable-selection models, are either inaccurate (relative to the experimental benchmark), or sensitive to the specification used in the regression. Lalonde’s results have been influential in renewing the debate on experimental versus non-experimental evaluations (see Manski and Garfinkel 1992) and in spurring a search for alternative estimators and specification tests (e.g., Heckman and Hotz 1989; and Manski, Sandefur, McLanahan, and Powers 1992).
In this paper, we apply propensity score methods (Rosenbaum and Rubin 1983) to Lalonde’s data set. Propensity score methods focus on the comparability of the treatment and non-experimental comparison groups in terms of pre-intervention variables. Controlling for differences in pre-intervention variables is difficult when the treatment and comparison groups are dissimilar and when there are many pre-intervention variables.