Let $Y_{i1}$ represent the value of the outcome when unit $i$ is subject to regime 1 (called treatment), and $Y_{i0}$ the value of the outcome when unit $i$ is exposed to regime 0 (called control). Only one of $Y_{i0}$ or $Y_{i1}$ can be observed for any unit, since we cannot observe the same unit under both treatment and control. Let $T_i$ be a treatment indicator (=1 if exposed to treatment, =0 otherwise). Then the observed outcome for unit $i$ is $Y_i = T_i Y_{i1} + (1 - T_i) Y_{i0}$. The treatment effect for unit $i$ is

$$\tau_i = Y_{i1} - Y_{i0}.$$
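The notation above can be made concrete with a small simulation. The following is a minimal sketch, not part of the original analysis: the function name `simulate_units`, the constant unit-level effect of 2.0, and the coin-flip assignment are all hypothetical choices made for illustration. It shows that the observed outcome reveals only one of the two potential outcomes for each unit, so the unit-level effect can be computed here only because the simulation keeps both.

```python
import random


def simulate_units(n, seed=0):
    """Draw n hypothetical units with both potential outcomes recorded."""
    rng = random.Random(seed)
    units = []
    for _ in range(n):
        y0 = rng.gauss(0.0, 1.0)      # outcome under control, Y_i0
        y1 = y0 + 2.0                 # outcome under treatment, Y_i1 (assumed effect)
        t = rng.randint(0, 1)         # treatment indicator T_i
        y = t * y1 + (1 - t) * y0     # observed outcome Y_i = T_i*Y_i1 + (1-T_i)*Y_i0
        units.append({"y0": y0, "y1": y1, "t": t, "y": y})
    return units


units = simulate_units(5)
for u in units:
    tau_i = u["y1"] - u["y0"]         # unit-level effect, never observable in real data
    print(u["t"], round(u["y"], 3), round(tau_i, 3))
```

In real data only the `t` and `y` fields would be available, which is why the treatment effect has to be identified at the population level rather than unit by unit.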

In an observational study, the treatment and comparison groups are often drawn from different populations. In our application the group exposed to the treatment is drawn from the population of interest (welfare recipients eligible for the program). The comparison group is drawn from a different population (in our application both the CPS and PSID are more representative of the general US population). The treatment effect we are trying to identify is therefore the treatment effect for the treated population:

$$\tau|_{T=1} = E(\tau_i \mid T_i = 1) = E(Y_{i1} \mid T_i = 1) - E(Y_{i0} \mid T_i = 1).$$

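This cannot be estimated directly since $Y_{i0}$ is not observed for the treated units. Assuming selection on observables (Rubin 1974, 1977), namely $\{Y_{i1}, Y_{i0} \perp\!\!\!\perp T_i\} \mid X_i$ (using Dawid's notation, $\perp\!\!\!\perp$ denotes independence), we obtain:

$$E(Y_{ij} \mid X_i, T_i = 1) = E(Y_{ij} \mid X_i, T_i = 0) = E(Y_i \mid X_i, T_i = j),$$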

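for $j = 0, 1$. Conditional on the observables, $X_i$, there is no systematic pre-treatment difference between the groups assigned to treatment and control. This allows us to identify the treatment effect for the treated:

$$\tau|_{T=1} = E\{E(Y_i \mid X_i, T_i = 1) - E(Y_i \mid X_i, T_i = 0) \mid T_i = 1\}, \qquad (1)$$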

where the outer expectation is over the distribution of $X_i \mid T_i = 1$, the distribution of pre-intervention variables in the treated population.
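To illustrate how conditioning on observables recovers the treatment effect for the treated, the sketch below simulates a single discrete covariate and compares a naive difference in means with an estimator that takes within-cell differences and averages them over the covariate distribution of the treated. Everything in it is an assumption made for illustration: the data-generating process, the function names `simulate` and `att_by_conditioning`, and the sample size. It also assumes every covariate cell contains both treated and control units.

```python
import random
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def simulate(n=20000, seed=0):
    """Hypothetical data: treatment is more likely at higher x, and x raises Y_i0."""
    rng = random.Random(seed)
    p_treat = {0: 0.2, 1: 0.5, 2: 0.8}            # assumed Pr(T=1 | X=x)
    rows = []
    for _ in range(n):
        x = rng.randrange(3)
        t = 1 if rng.random() < p_treat[x] else 0
        y0 = 1.0 * x + rng.gauss(0.0, 1.0)
        y1 = y0 + 1.0 + 0.5 * x                   # assumed heterogeneous effect
        rows.append((x, t, y0, y1))
    return rows


def att_by_conditioning(rows):
    """Within-cell treated-minus-control means, weighted by the treated share of each cell."""
    treated, control = defaultdict(list), defaultdict(list)
    for x, t, y0, y1 in rows:
        y = t * y1 + (1 - t) * y0                 # only the observed outcome is used
        (treated if t else control)[x].append(y)
    n_treated = sum(len(ys) for ys in treated.values())
    att = 0.0
    for x, ys in treated.items():
        # requires at least one control unit with the same x (overlap)
        att += (len(ys) / n_treated) * (mean(ys) - mean(control[x]))
    return att


rows = simulate()
att_true = mean([y1 - y0 for x, t, y0, y1 in rows if t == 1])
naive = mean([y1 for x, t, y0, y1 in rows if t == 1]) - mean(
    [y0 for x, t, y0, y1 in rows if t == 0]
)
att_hat = att_by_conditioning(rows)
print("true ATT in sample:", round(att_true, 3))
print("naive difference:  ", round(naive, 3))
print("conditioning on X: ", round(att_hat, 3))
```

The naive comparison mixes the treatment effect with the pre-treatment difference in the covariate between groups, while the cell-by-cell estimator removes it. With many or continuous covariates the cells become sparse, which is the practical difficulty that motivates a lower-dimensional summary.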

One method for estimating the treatment effect stems from (1): estimating $E(Y_i \mid X_i, T_i = 1)$ and $E(Y_i \mid X_i, T_i = 0)$ as two non-parametric equations. This estimation strategy becomes difficult, however, if the covariates, $X_i$, are high dimensional. The propensity score theorem provides an intermediate step:

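Proposition 1 (Rosenbaum and Rubin 1983): Let $p(X_i)$ be the probability of unit $i$ having been assigned to treatment, defined as $p(X_i) \equiv \Pr(T_i = 1 \mid X_i) = E(T_i \mid X_i)$, where $0 < p(X_i) < 1$, $\forall i$. Then:

$$(Y_{i1}, Y_{i0}) \perp\!\!\!\perp T_i \mid X_i \;\Rightarrow\; (Y_{i1}, Y_{i0}) \perp\!\!\!\perp T_i \mid p(X_i),$$

so that

$$\tau|_{T=1} = E\{E(Y_i \mid T_i = 1, p(X_i)) - E(Y_i \mid T_i = 0, p(X_i)) \mid T_i = 1\}. \qquad (2)$$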

One intuition for the propensity score is that, whereas in equation (1) we are trying to condition on X (intuitively, to find observations with similar covariates), in equation (2) we are trying to condition just on the propensity score, because the proposition implies that observations with the same propensity score have the same distribution of the full vector of covariates X.
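This intuition can be checked in a small simulation. The sketch below is an illustration under stated assumptions, not the estimation procedure of the paper: it uses two hypothetical binary covariates `a` and `b`, estimates the propensity score as the treated share in each covariate cell, and groups units into score strata of an arbitrarily chosen width of 0.2. Two different covariate cells share the same true score, so they fall into one stratum; within that stratum the covariate `a` should be distributed similarly among treated and control units, and the stratified difference in means should be close to the treatment effect for the treated.

```python
import random
from collections import defaultdict


def mean(values):
    return sum(values) / len(values)


def simulate(n=20000, seed=1):
    """Hypothetical data in which cells (a=0,b=1) and (a=1,b=0) share one score."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        a, b = rng.randrange(2), rng.randrange(2)
        p = 0.2 + 0.3 * (a + b)                   # assumed true Pr(T=1 | a, b)
        t = 1 if rng.random() < p else 0
        y0 = a + 2.0 * b + rng.gauss(0.0, 1.0)
        y1 = y0 + 1.0 + a                         # assumed heterogeneous effect
        rows.append({"a": a, "b": b, "t": t,
                     "y": t * y1 + (1 - t) * y0, "effect": y1 - y0})
    return rows


def estimate_scores(rows):
    """Estimate p(X) by the treated share in each covariate cell."""
    cells = defaultdict(list)
    for r in rows:
        cells[(r["a"], r["b"])].append(r["t"])
    return {cell: mean(ts) for cell, ts in cells.items()}


def att_by_score(rows, scores, width=0.2):
    """Stratify on the estimated score; return the ATT estimate and balance in `a`."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for r in rows:
        stratum = int(scores[(r["a"], r["b"])] / width)
        strata[stratum][r["t"]].append(r)
    n_treated = sum(len(g[1]) for g in strata.values())
    att, balance = 0.0, {}
    for stratum, g in strata.items():
        if not g[1]:
            continue                              # no treated units: zero weight
        # requires control units in the same stratum (overlap)
        diff = mean([r["y"] for r in g[1]]) - mean([r["y"] for r in g[0]])
        att += (len(g[1]) / n_treated) * diff
        balance[stratum] = mean([r["a"] for r in g[1]]) - mean([r["a"] for r in g[0]])
    return att, balance


rows = simulate()
scores = estimate_scores(rows)
att_hat, balance = att_by_score(rows, scores)
att_true = mean([r["effect"] for r in rows if r["t"] == 1])
print("true ATT in sample:     ", round(att_true, 3))
print("stratified on the score:", round(att_hat, 3))
print("treated-control gap in a, by stratum:", balance)
```

The stratification uses one scalar per unit rather than the full covariate vector, which is what makes the approach workable when the covariates are high dimensional. In practice the score would be estimated with a model such as a logit rather than cell frequencies; that substitution is outside what this sketch shows.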