Propensity score matching implicitly weights the matched untreated observations to compute counterfactual outcomes. The Stata command -psmatch2- stores these weights in a variable called _weight.
Someone pointed me to an old blog post somewhere on the Internet, which shows that there may be some confusion about what these weights are and where they come from.
K-neighbor matching estimates the counterfactual outcome for a treated observation by averaging the outcomes of its K nearest untreated neighbors.
This means that every time an untreated observation is matched to a treated observation (and this can happen more than once when matching with replacement), it is used with “weight” 1/K, since one divides by K when averaging.
If one uses a caliper (i.e. excludes matches that are farther away than a maximum distance called a “caliper”), it can happen that some treated observations end up with fewer than K matches.
So more generally the weight is not 1/K but rather 1/_nn (-psmatch2- saves the number of matches for a given treated observation in the variable _nn).
The variable _weight sums these weights over every time a control observation is used to construct a counterfactual outcome.
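The definition above can be written compactly. This is a sketch in notation that is not part of -psmatch2-: M(i) denotes the set of untreated observations matched to treated observation i (so |M(i)| is what -psmatch2- stores in _nn), Y denotes the outcome, and N_T is the number of treated observations with at least one match. The first expression is the weight of untreated observation j; the second shows why these weights are all that is needed for the average counterfactual outcome:

```latex
\[
w_j = \sum_{i \,:\, j \in M(i)} \frac{1}{|M(i)|},
\qquad
\frac{1}{N_T} \sum_{i} \frac{1}{|M(i)|} \sum_{j \in M(i)} Y_j
  \;=\; \frac{1}{N_T} \sum_{j} w_j \, Y_j .
\]
```

The equality follows from swapping the order of summation: each Y_j is collected once for every treated observation i that uses j as a match, each time with factor 1/|M(i)|.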
So let’s say that we are matching each of two treated observations to its two nearest neighbors (K = 2) with a caliper.
Then we may have that the first treated has two matches and the second treated only one match as in the following:
_id _treated _n1 _n2 _nn
1 1 3 4 2
2 1 3 . 1
3 0 . . .
4 0 . . .
The matched outcome for the first treated observation will be averaged across observations 3 and 4, each of which thus has weight 1/2 here.
The matched outcome for the second treated observation is simply the outcome of observation 3, which thus has weight 1.
Note that in each case the weights equal 1/_nn.
Putting this together, we can compute how much each matched untreated observation contributes to the overall average counterfactual outcome by summing its weights: observation 3 receives 1/2 + 1 = 1.5 and observation 4 receives 1/2.
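The arithmetic for this toy example can be checked in Stata. The following is a minimal sketch that types in the four observations by hand rather than calling -psmatch2-; the variable names merely mimic its output, and matchnr and altweight are illustrative names:

```stata
* Toy data mimicking the variables -psmatch2- would create (entered by hand)
clear
input _id _treated _n1 _n2 _nn
1 1 3 4 2
2 1 3 . 1
3 0 . . .
4 0 . . .
end

rename _n* N*                       // _n1 _n2 _nn -> N1 N2 Nn
reshape long N, i(_id) j(matchnr)   // one row per (observation, match slot)
drop if missing(N)                  // keep only slots that hold a match
generate altweight = 1 / Nn         // each match counts 1/_nn
collapse (sum) altweight, by(N)     // total weight per control observation

list, noobs
assert altweight == 1.5 if N == 3   // used by both treated: 1/2 + 1
assert altweight == 0.5 if N == 4   // used once, with weight 1/2
```

Dropping the empty match slots before collapsing avoids a spurious group with missing N that would otherwise collect the weights of unused slots.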
For the example in the blog post mentioned above, the following code shows that this indeed gives the weights in the variable _weight:
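webuse cattaneo2, clear
set seed 795
* 5-nearest-neighbor matching on the propensity score (logit) within a caliper
psmatch2 mbsmoke prenatal1 fbaby mmarried medu fedu mage fage mrace frace, out(bweight) neighbor(5) caliper(.0295236) logit
rename _n* N* // otherwise reshape complains
* one row per observation and match slot; N holds the _id of the matched control
reshape long N, i(_id) j(matchnr)
* each match enters the average with weight 1 / _nn
generate altweight = 1 / Nn
* drop unused match slots, then total the weights for each control observation
drop if missing(N)
collapse (sum) altweight, by(N)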
The weights in _weight are therefore not specific to -psmatch2-, but they follow directly from the definition of a K-neighbor matching estimator (independently of whether one matches on the propensity score or something else).