matchit() is the main function of MatchIt and performs pairing, subset selection, and subclassification with the aim of creating treatment and control groups balanced on included covariates. MatchIt implements the suggestions of Ho, Imai, King, and Stuart (2007) for improving parametric statistical models by preprocessing data with nonparametric matching methods. MatchIt implements a wide range of sophisticated matching methods, making it possible to greatly reduce the dependence of causal inferences on hard-to-justify, but commonly made, statistical modeling assumptions. The software also easily fits into existing research practices since, after preprocessing with MatchIt, researchers can use whatever parametric model they would have used without MatchIt, but produce inferences with substantially more robustness and less sensitivity to modeling assumptions.

This page documents the overall use of matchit(), but for specifics of how matchit() works with individual matching methods, see the individual pages linked in the Details section below.

matchit(formula,
        data = NULL,
        method = "nearest",
        distance = "glm",
        link = "logit",
        distance.options = list(),
        estimand = "ATT",
        exact = NULL,
        mahvars = NULL,
        antiexact = NULL,
        discard = "none",
        reestimate = FALSE,
        s.weights = NULL,
        replace = FALSE,
        m.order = NULL,
        caliper = NULL,
        std.caliper = TRUE,
        ratio = 1,
        verbose = FALSE,
        ...)

# S3 method for matchit
print(x, ...)

## Arguments

formula

a two-sided formula() object containing the treatment and covariates to be used in creating the distance measure used in the matching. This formula will be supplied to the functions that estimate the distance measure. The formula should be specified as A ~ X1 + X2 + ... where A represents the treatment variable and X1 and X2 are covariates.

data

a data frame containing the variables named in formula and possibly other arguments. If not found in data, the variables will be sought in the environment.

method

the matching method to be used. The allowed methods are "nearest" for nearest neighbor matching (on the propensity score by default), "optimal" for optimal pair matching, "full" for optimal full matching, "genetic" for genetic matching, "cem" for coarsened exact matching, "exact" for exact matching, and "subclass" for subclassification. When set to NULL, no matching will occur, but propensity score estimation and common support restrictions will still occur if requested. See the linked pages for each method for more details on what these methods do, how the arguments below are used by each one, and what additional arguments are allowed.

distance

the distance measure to be used. Can be either a string containing the name of a distance measure, a vector of already-computed distance measures, or a matrix of pairwise distances. When supplied as a vector, the distance measures should be values whose pairwise difference is the distance between two units, e.g., propensity scores for propensity score matching. When supplied as a matrix, each value should represent the pairwise distance between units. See distance for allowable options. The default is "glm" for propensity scores estimated with logistic regression using glm(). Ignored for some methods; see individual methods pages for information on whether and how the distance measure is used.

link

when distance is specified as a string, an additional argument controlling the link function used in estimating the distance measure. Allowable options depend on the specific distance value specified. See distance for allowable options with each option. The default is "logit", which, along with distance = "glm", identifies the default measure as logistic regression propensity scores.

distance.options

a named list containing additional arguments supplied to the function that estimates the distance measure as determined by the argument to distance. See distance for an example of its use.

estimand

a string containing the name of the target estimand desired. Can be one of "ATT" or "ATC". Some methods accept "ATE" as well. Default is "ATT". See Details and the individual methods pages for information on how this argument is used.

exact

for methods that allow it, for which variables exact matching should take place. Can be specified as a string containing the names of variables in data to be used or a one-sided formula with the desired variables on the right-hand side (e.g., ~ X3 + X4). See the individual methods pages for information on whether and how this argument is used.

mahvars

for methods that allow it, on which variables Mahalanobis distance matching should take place when a distance measure other than "mahalanobis" is used. Usually used to perform Mahalanobis distance matching within propensity score calipers, where the propensity scores are computed using formula and distance. Can be specified as a string containing the names of variables in data to be used or a one-sided formula with the desired variables on the right-hand side (e.g., ~ X3 + X4). See the individual methods pages for information on whether and how this argument is used.

antiexact

for methods that allow it, for which variables anti-exact matching should take place. Anti-exact matching ensures paired individuals do not have the same value of the anti-exact matching variable(s). Can be specified as a string containing the names of variables in data to be used or a one-sided formula with the desired variables on the right-hand side (e.g., ~ X3 + X4). See the individual methods pages for information on whether and how this argument is used.

discard

a string containing a method for discarding units outside a region of common support. When a propensity score is estimated or supplied to distance as a vector, the options are "none", "treated", "control", or "both". For "none", no units are discarded for common support. Otherwise, units whose propensity scores fall outside the corresponding region are discarded. Can also be a logical vector where TRUE indicates the unit is to be discarded. Default is "none" for no common support restriction. See Details.

reestimate

if discard is not "none" and propensity scores are estimated, whether to re-estimate the propensity scores in the remaining sample. Default is FALSE to use the propensity scores estimated in the original sample.

s.weights

an optional numeric vector of sampling weights to be incorporated into propensity score models and balance statistics. Can also be specified as a string containing the name of a variable in data to be used or a one-sided formula with the variable on the right-hand side (e.g., ~ SW). Not all propensity score models accept sampling weights; see distance for information on which do and do not, and see vignette("sampling-weights") for details on how to use sampling weights in a matching analysis.

replace

for methods that allow it, whether matching should be done with replacement (TRUE), where control units are allowed to be matched to several treated units, or without replacement (FALSE), where control units can only be matched to one treated unit each. See the individual methods pages for information on whether and how this argument is used. Default is FALSE for matching without replacement.

m.order

for methods that allow it, the order that the matching takes place. Allowable options depend on the matching method but include "largest", where matching takes place in descending order of distance measures; "smallest", where matching takes place in ascending order of distance measures; "random", where matching takes place in a random order; and "data", where matching takes place based on the order of units in the data. When m.order = "random", results may differ across different runs of the same code unless a seed is set and specified with set.seed(). See the individual methods pages for information on whether and how this argument is used. The default of NULL corresponds to "largest" when a propensity score is estimated or supplied as a vector and "data" otherwise.

caliper

for methods that allow it, the width(s) of the caliper(s) to use in matching. Should be a numeric vector with each value named according to the variable to which the caliper applies. To apply to the distance measure, the value should be unnamed. See the individual methods pages for information on whether and how this argument is used. The default is NULL for no caliper.

std.caliper

logical; when a caliper is specified, whether the caliper is in standard deviation units (TRUE) or raw units (FALSE). Can either be of length 1, applying to all calipers, or of length equal to the length of caliper. Default is TRUE.

ratio

for methods that allow it, how many control units should be matched to each treated unit in k:1 matching. Should be a single integer value. See the individual methods pages for information on whether and how this argument is used. The default is 1 for 1:1 matching.

verbose

logical; whether information about the matching process should be printed to the console. What is printed depends on the matching method. Default is FALSE for no printing other than warnings.

...

additional arguments passed to the functions used in the matching process. See the individual methods pages for information on what additional arguments are allowed for each method. Ignored for print.

x

a matchit object.

## Details

Details for the various matching methods can be found at the following help pages:

• method_nearest for nearest neighbor matching

• method_optimal for optimal pair matching

• method_full for optimal full matching

• method_genetic for genetic matching

• method_cem for coarsened exact matching

• method_exact for exact matching

• method_subclass for subclassification

Each page describes what the method does, which of the arguments above it accepts and how they are interpreted, and what additional arguments can be supplied to further tune the method. Note that the default method with no arguments supplied other than formula and data is 1:1 nearest neighbor matching without replacement on a propensity score estimated using a logistic regression of the treatment on the covariates. This is not the same default offered by other matching programs, such as those in Matching, teffects in Stata, or PROC PSMATCH in SAS, so care should be taken if trying to replicate the results of those programs.

When method = NULL, no matching will occur, but any propensity score estimation and common support restriction will. This can be a simple way to estimate the propensity score for use in future matching specifications without having to reestimate it each time. The matchit() output with no matching can be supplied to summary to examine balance prior to matching on any of the included covariates and on the propensity score if specified. All arguments other than distance, discard, and reestimate will be ignored.
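As a brief illustration of the paragraph above, the following sketch estimates propensity scores without matching and inspects balance before matching. It assumes the MatchIt package is installed and uses the lalonde dataset from the Examples section; the object name m.out0 is arbitrary.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

# Estimate the propensity score only; no matching is performed
m.out0 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75,
                  data = lalonde, method = NULL, distance = "glm")

# Balance prior to matching, including on the propensity score
summary(m.out0)

# The estimated propensity scores are stored in the distance component
head(m.out0$distance)
```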

See the distance argument for details on the several ways to specify the distance and link arguments to estimate propensity scores and create distance measures.

When the treatment variable is not a 0/1 variable, it will be coerced to one and returned as such in the matchit() output (see section Value, below). The following rules are used: 1) if 0 is one of the values, it will be considered the control and the other value the treated; otherwise, 2) if the variable is a factor, levels(treat)[1] will be considered control and the other level the treated; otherwise, 3) sort(unique(treat))[1] will be considered control and the other value the treated. It is safest to ensure the treatment variable is a 0/1 variable.
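The three rules can be sketched in base R as follows. The helper coerce_treat() is hypothetical and written only to mirror the description above; it is not MatchIt's internal code and assumes a treatment vector with exactly two distinct non-missing values.

```r
# Hypothetical helper mirroring the three coercion rules described above
coerce_treat <- function(treat) {
  vals <- unique(treat[!is.na(treat)])
  stopifnot(length(vals) == 2)
  if (0 %in% vals) {
    control <- 0                  # rule 1: 0 is the control value
  } else if (is.factor(treat)) {
    control <- levels(treat)[1]   # rule 2: first factor level is control
  } else {
    control <- sort(vals)[1]      # rule 3: smallest sorted value is control
  }
  as.integer(treat != control)    # 1 = treated, 0 = control
}

coerce_treat(c(0, 1, 1, 0))
coerce_treat(c(2, 1, 2))
coerce_treat(factor(c("b", "a", "b"), levels = c("b", "a")))
```

Recoding the treatment to 0/1 before calling matchit() avoids relying on these rules at all.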

The discard option implements a common support restriction. It can only be used when a distance measure is estimated or supplied as a vector, i.e., when distance is specified as something other than "mahalanobis" and is not a matrix, and is ignored for some matching methods. When specified as "treated", treated units whose distance measure is outside the range of distance measures of the control units will be discarded. When specified as "control", control units whose distance measure is outside the range of distance measures of the treated units will be discarded. When specified as "both", treated and control units whose distance measure is outside the intersection of the range of distance measures of the treated units and the range of distance measures of the control units will be discarded. When reestimate = TRUE and distance corresponds to a propensity score-estimating function, the propensity scores are re-estimated in the remaining units prior to being used for matching or calipers.
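The common support rules above can be illustrated with a small base R sketch. The helper flag_discard() and the toy vectors ps and treat are hypothetical; the sketch assumes a numeric vector of distance values and a 0/1 treatment vector, and it is not MatchIt's internal implementation.

```r
# Hypothetical helper: TRUE marks units that would be discarded
flag_discard <- function(ps, treat, discard = "none") {
  r1 <- range(ps[treat == 1])  # range of distances among treated units
  r0 <- range(ps[treat == 0])  # range of distances among control units
  outside <- function(x, r) x < r[1] | x > r[2]
  switch(discard,
    none    = rep(FALSE, length(ps)),
    treated = treat == 1 & outside(ps, r0),
    control = treat == 0 & outside(ps, r1),
    # intersection of the two ranges
    both    = outside(ps, c(max(r0[1], r1[1]), min(r0[2], r1[2]))),
    stop("unknown discard option"))
}

ps    <- c(0.10, 0.30, 0.50, 0.20, 0.40, 0.90)
treat <- c(0,    0,    0,    1,    1,    1)

flag_discard(ps, treat, "treated")  # only the treated unit at 0.90
flag_discard(ps, treat, "control")  # only the control unit at 0.10
flag_discard(ps, treat, "both")     # the units at 0.10 and 0.90
```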

Caution should be used when interpreting effects estimated with various values of estimand. Setting estimand = "ATT" doesn't necessarily mean the average treatment effect in the treated is being estimated; it just means that for matching methods, treated units will be untouched and given weights of 1 and control units will be matched to them (and the opposite for estimand = "ATC"). If a caliper is supplied or treated units are removed for common support or some other reason (e.g., lacking matches when using exact matching), the actual estimand targeted is not the ATT but the treatment effect in the matched sample. The argument to estimand simply triggers which units are matched to which, and for stratification-based methods (exact matching, CEM, full matching, and subclassification), determines the formula used to compute the stratification weights.

### How Matching Weights Are Computed

Matching weights are computed in one of two ways depending on whether matching was done with replacement or not.

For matching without replacement, each unit is assigned to a subclass, which represents the pair they are a part of (in the case of k:1 matching) or the stratum they belong to (in the case of exact matching, coarsened exact matching, full matching, or subclassification). The formula for computing the weights depends on the argument supplied to estimand. A new stratum "propensity score" (p) is computed as the proportion of units in each stratum that are in the treated group, and all units in that stratum are assigned that propensity score. Weights are then computed using the standard formulas for inverse probability weights: for the ATT, weights are 1 for the treated units and p/(1-p) for the control units; for the ATC, weights are (1-p)/p for the treated units and 1 for the control units; for the ATE, weights are 1/p for the treated units and 1/(1-p) for the control units.
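The stratum-based formulas above can be written out in a few lines of base R. The helper stratum_weights() and the toy data are hypothetical; the sketch assumes every unit belongs to a stratum containing both treated and control units, and it returns the weights before any rescaling.

```r
# Hypothetical helper: stratification weights from stratum membership
stratum_weights <- function(treat, subclass, estimand = "ATT") {
  # Stratum "propensity score": proportion of treated units in each stratum
  p <- ave(treat, subclass, FUN = mean)
  switch(estimand,
    ATT = ifelse(treat == 1, 1, p / (1 - p)),
    ATC = ifelse(treat == 1, (1 - p) / p, 1),
    ATE = ifelse(treat == 1, 1 / p, 1 / (1 - p)),
    stop("unknown estimand"))
}

# Stratum 1: one treated and three controls (p = 1/4)
# Stratum 2: two treated and one control   (p = 2/3)
treat    <- c(1, 0, 0, 0, 1, 1, 0)
subclass <- c(1, 1, 1, 1, 2, 2, 2)

stratum_weights(treat, subclass, "ATT")  # controls: 1/3 in stratum 1, 2 in stratum 2
stratum_weights(treat, subclass, "ATE")
```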

For matching with replacement, units are not assigned to unique strata. For the ATT, each treated unit gets a weight of 1. Each control unit is weighted as the sum of the inverse of the number of control units matched to the same treated unit across its matches. For example, if a control unit was matched to a treated unit that had two other control units matched to it, and that same control was matched to a treated unit that had one other control unit matched to it, the control unit in question would get a weight of 1/3 + 1/2 = 5/6. For the ATC, the same is true with the treated and control labels switched. The weights are computed using the match.matrix component of the matchit() output object.
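The worked example above (1/3 + 1/2 = 5/6) can be reproduced from a small match matrix in base R. The matrix mm and its unit names are hypothetical; the sketch targets the ATT and is only an illustration of the rule, not MatchIt's internal code.

```r
# Hypothetical match.matrix: rows are treated units, entries are the names
# of their matched controls (NA where fewer matches were found)
mm <- rbind(t1 = c("c1", "c2", "c3"),
            t2 = c("c1", "c4", NA))

# Each treated unit contributes 1 / (number of its matched controls)
# to every control it was matched to
n_matched <- rowSums(!is.na(mm))
contrib   <- 1 / n_matched[row(mm)]

# A control's weight is the sum of these contributions across its matches
keep      <- !is.na(mm)
w_control <- tapply(contrib[keep], mm[keep], sum)
w_control
```

Here c1 is matched to t1 (alongside two other controls) and to t2 (alongside one other control), so its weight is 1/3 + 1/2 = 5/6.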

In each treatment group, weights are divided by the mean of the nonzero weights in that treatment group, so that the weights sum to the number of units with nonzero weights in that treatment group. If sampling weights are included through the s.weights argument, they will be included in the matchit() output object but not incorporated into the matching weights. match.data(), which extracts the matched set from a matchit object, combines the matching weights and sampling weights.
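The rescaling step can be sketched in base R as below. The vectors treat and w and the helper scale_group() are hypothetical; a weight of zero stands for an unmatched or discarded unit.

```r
# Hypothetical unscaled weights for seven units
treat <- c(1, 1, 1, 0,   0,   0, 0)
w     <- c(1, 1, 1, 1/3, 1/3, 2, 0)

# Within each treatment group, divide by the mean of the nonzero weights
scale_group <- function(x) ifelse(x > 0, x / mean(x[x > 0]), 0)
w_scaled <- ave(w, treat, FUN = scale_group)

# After scaling, each group's weights sum to its count of nonzero weights
tapply(w_scaled, treat, sum)
```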

## Value

When method is something other than "subclass", a matchit object with the following components:

match.matrix

a matrix containing the matches. The rownames correspond to the treated units and the values in each row are the names (or indices) of the control units matched to each treated unit. When treated units are matched to different numbers of control units (e.g., with exact matching or matching with a caliper), empty spaces will be filled with NA. Not included when method is "full", "cem", or "exact".

subclass

a factor containing matching pair/stratum membership for each unit. Unmatched units will have a value of NA. Not included when replace = TRUE.

weights

a numeric vector of estimated matching weights. Unmatched and discarded units will have a weight of zero.

model

the fit object of the model used to estimate propensity scores when distance is specified and not "mahalanobis" or a numeric vector. When reestimate = TRUE, this is the model estimated after discarding units.

X

a data frame of covariates mentioned in formula, exact, mahvars, and antiexact.

call

the matchit() call.

info

information on the matching method and distance measures used.

estimand

the argument supplied to estimand.

formula

the formula supplied.

treat

a vector of treatment status converted to zeros (0) and ones (1) if not already in that format.

distance

a vector of distance values (i.e., propensity scores) when distance is specified and not "mahalanobis" or a matrix.

discarded

a logical vector denoting whether each observation was discarded (TRUE) or not (FALSE) by the argument to discard.

s.weights

the vector of sampling weights supplied to the s.weights argument, if any.

exact

a one-sided formula containing the variables, if any, supplied to exact.

mahvars

a one-sided formula containing the variables, if any, supplied to mahvars.

nn

a matrix of the sample sizes of the treated and control groups before and after matching. See summary.matchit() for details.

When method = "subclass", a matchit.subclass object with the same components as above except that match.matrix is excluded and two additional components, q.cut and qn, are included, containing a vector of the distance measure cutpoints used to define the subclasses and a matrix of the subclass sample sizes, respectively. See method_subclass for details.
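As a short, hedged illustration of the components listed above, the following sketch fits a default nearest neighbor match and inspects a few elements of the result. It assumes the MatchIt package is installed; the covariate choice and the object name m.out are arbitrary.

```r
library(MatchIt)
data("lalonde", package = "MatchIt")

m.out <- matchit(treat ~ age + educ + re74, data = lalonde)

head(m.out$match.matrix)   # matched control unit for each treated unit
table(m.out$weights > 0)   # units with nonzero vs. zero matching weights
head(m.out$distance)       # estimated propensity scores
m.out$nn                   # sample sizes before and after matching
```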

## References

Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2007). Matching as Nonparametric Preprocessing for Reducing Model Dependence in Parametric Causal Inference. Political Analysis, 15(3), 199–236. doi: 10.1093/pan/mpl013

Ho, D. E., Imai, K., King, G., & Stuart, E. A. (2011). MatchIt: Nonparametric Preprocessing for Parametric Causal Inference. Journal of Statistical Software, 42(8). doi: 10.18637/jss.v042.i08

## Author

Daniel Ho dho@law.stanford.edu; Kosuke Imai imai@harvard.edu; Gary King king@harvard.edu; Elizabeth Stuart estuart@jhsph.edu

Version 4.0.0 update by Noah Greifer noah.greifer@gmail.com

## See also

summary.matchit() for balance assessment after matching. plot.matchit() for plots of covariate balance and propensity score overlap after matching.

vignette("MatchIt") for an introduction to matching with MatchIt; vignette("matching-methods") for descriptions of the variety of matching methods and options available; vignette("assessing-balance") for information on assessing the quality of a matching specification; vignette("estimating-effects") for instructions on how to estimate treatment effects after matching; and vignette("sampling-weights") for a guide to using MatchIt with sampling weights.

## Examples

data("lalonde")

# Default: 1:1 NN PS matching w/o replacement

m.out1 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde)
m.out1
#> A matchit object
#>  - method: 1:1 nearest neighbor matching without replacement
#>  - distance: Propensity score
#>              - estimated with logistic regression
#>  - number of obs.: 614 (original), 370 (matched)
#>  - target estimand: ATT
#>  - covariates: age, educ, race, nodegree, married, re74, re75
summary(m.out1)
#>
#> Call:
#> matchit(formula = treat ~ age + educ + race + nodegree + married +
#>     re74 + re75, data = lalonde)
#>
#> Summary of Balance for All Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5774        0.1822          1.7941     0.9211    0.3774
#> age              25.8162       28.0303         -0.3094     0.4400    0.0813
#> educ             10.3459       10.2354          0.0550     0.4959    0.0347
#> raceblack         0.8432        0.2028          1.7615          .    0.6404
#> racehispan        0.0595        0.1422         -0.3498          .    0.0827
#> racewhite         0.0973        0.6550         -1.8819          .    0.5577
#> nodegree          0.7081        0.5967          0.2450          .    0.1114
#> married           0.1892        0.5128         -0.8263          .    0.3236
#> re74           2095.5737     5619.2365         -0.7211     0.5181    0.2248
#> re75           1532.0553     2466.4844         -0.2903     0.9563    0.1342
#>            eCDF Max
#> distance     0.6444
#> age          0.1577
#> educ         0.1114
#> raceblack    0.6404
#> racehispan   0.0827
#> racewhite    0.5577
#> nodegree     0.1114
#> married      0.3236
#> re74         0.4470
#> re75         0.2876
#>
#>
#> Summary of Balance for Matched Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5774        0.3629          0.9739     0.7566    0.1321
#> age              25.8162       25.3027          0.0718     0.4568    0.0847
#> educ             10.3459       10.6054         -0.1290     0.5721    0.0239
#> raceblack         0.8432        0.4703          1.0259          .    0.3730
#> racehispan        0.0595        0.2162         -0.6629          .    0.1568
#> racewhite         0.0973        0.3135         -0.7296          .    0.2162
#> nodegree          0.7081        0.6378          0.1546          .    0.0703
#> married           0.1892        0.2108         -0.0552          .    0.0216
#> re74           2095.5737     2342.1076         -0.0505     1.3289    0.0469
#> re75           1532.0553     1614.7451         -0.0257     1.4956    0.0452
#>            eCDF Max Std. Pair Dist.
#> distance     0.4216          0.9740
#> age          0.2541          1.3938
#> educ         0.0757          1.2474
#> raceblack    0.3730          1.0259
#> racehispan   0.1568          1.0743
#> racewhite    0.2162          0.8390
#> nodegree     0.0703          1.0106
#> married      0.0216          0.8281
#> re74         0.2757          0.7965
#> re75         0.2054          0.7381
#>
#> Percent Balance Improvement:
#>            Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max
#> distance              45.7     -239.6      65.0     34.6
#> age                   76.8        4.6      -4.2    -61.1
#> educ                -134.8       20.4      31.2     32.1
#> raceblack             41.8          .      41.8     41.8
#> racehispan           -89.5          .     -89.5    -89.5
#> racewhite             61.2          .      61.2     61.2
#> nodegree              36.9          .      36.9     36.9
#> married               93.3          .      93.3     93.3
#> re74                  93.0       56.8      79.1     38.3
#> re75                  91.2     -800.7      66.3     28.6
#>
#> Sample Sizes:
#>           Control Treated
#> All           429     185
#> Matched       185     185
#> Unmatched     244       0
#>
# 1:1 NN Mahalanobis distance matching w/ replacement and
# exact matching on married and race

m.out2 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde,
                  distance = "mahalanobis", replace = TRUE,
                  exact = ~ married + race)
m.out2
#> A matchit object
#>  - method: 1:1 nearest neighbor matching with replacement
#>  - distance: Mahalanobis
#>  - number of obs.: 614 (original), 263 (matched)
#>  - target estimand: ATT
#>  - covariates: age, educ, race, nodegree, married, re74, re75
summary(m.out2)
#>
#> Call:
#> matchit(formula = treat ~ age + educ + race + nodegree + married +
#>     re74 + re75, data = lalonde, distance = "mahalanobis", exact = ~married +
#>     race, replace = TRUE)
#>
#> Summary of Balance for All Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> age              25.8162       28.0303         -0.3094     0.4400    0.0813
#> educ             10.3459       10.2354          0.0550     0.4959    0.0347
#> raceblack         0.8432        0.2028          1.7615          .    0.6404
#> racehispan        0.0595        0.1422         -0.3498          .    0.0827
#> racewhite         0.0973        0.6550         -1.8819          .    0.5577
#> nodegree          0.7081        0.5967          0.2450          .    0.1114
#> married           0.1892        0.5128         -0.8263          .    0.3236
#> re74           2095.5737     5619.2365         -0.7211     0.5181    0.2248
#> re75           1532.0553     2466.4844         -0.2903     0.9563    0.1342
#>            eCDF Max
#> age          0.1577
#> educ         0.1114
#> raceblack    0.6404
#> racehispan   0.0827
#> racewhite    0.5577
#> nodegree     0.1114
#> married      0.3236
#> re74         0.4470
#> re75         0.2876
#>
#>
#> Summary of Balance for Matched Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> age              25.8162       24.8973          0.1284     0.8265    0.0330
#> educ             10.3459       10.5784         -0.1156     0.9046    0.0179
#> raceblack         0.8432        0.8432          0.0000          .    0.0000
#> racehispan        0.0595        0.0595         -0.0000          .    0.0000
#> racewhite         0.0973        0.0973          0.0000          .    0.0000
#> nodegree          0.7081        0.5892          0.2616          .    0.1189
#> married           0.1892        0.1892          0.0000          .    0.0000
#> re74           2095.5737     1780.4016          0.0645     1.8106    0.0429
#> re75           1532.0553     1161.2419          0.1152     2.0101    0.0258
#>            eCDF Max Std. Pair Dist.
#> age          0.1676          0.3400
#> educ         0.1189          0.3038
#> raceblack    0.0000          0.0000
#> racehispan   0.0000          0.0000
#> racewhite    0.0000          0.0000
#> nodegree     0.1189          0.3329
#> married      0.0000          0.0000
#> re74         0.2000          0.2483
#> re75         0.0811          0.2086
#>
#> Percent Balance Improvement:
#>            Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max
#> age                   58.5       76.8      59.5     -6.2
#> educ                -110.3       85.7      48.4     -6.8
#> raceblack            100.0          .     100.0    100.0
#> racehispan           100.0          .     100.0    100.0
#> racewhite            100.0          .     100.0    100.0
#> nodegree              -6.8          .      -6.8     -6.8
#> married              100.0          .     100.0    100.0
#> re74                  91.1        9.7      80.9     55.3
#> re75                  60.3    -1462.3      80.8     71.8
#>
#> Sample Sizes:
#>               Control Treated
#> All            429.       185
#> Matched (ESS)   34.96     185
#> Matched         78.       185
#> Unmatched      351.         0
#>
# 2:1 NN Mahalanobis distance matching within caliper defined
# by a probit regression PS

m.out3 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde,
                  distance = "glm", link = "probit",
                  mahvars = ~ age + educ + re74 + re75,
                  caliper = .1, ratio = 2)
m.out3
#> A matchit object
#>  - method: 2:1 nearest neighbor matching without replacement
#>  - distance: Mahalanobis [matching]
#>              Propensity score [caliper]
#>              - estimated with probit regression
#>  - caliper: <distance> (0.029)
#>  - number of obs.: 614 (original), 257 (matched)
#>  - target estimand: ATT
#>  - covariates: age, educ, race, nodegree, married, re74, re75
summary(m.out3)
#>
#> Call:
#> matchit(formula = treat ~ age + educ + race + nodegree + married +
#>     re74 + re75, data = lalonde, distance = "glm", link = "probit",
#>     mahvars = ~age + educ + re74 + re75, caliper = 0.1, ratio = 2)
#>
#> Summary of Balance for All Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5773        0.1817          1.8276     0.8777    0.3774
#> age              25.8162       28.0303         -0.3094     0.4400    0.0813
#> educ             10.3459       10.2354          0.0550     0.4959    0.0347
#> raceblack         0.8432        0.2028          1.7615          .    0.6404
#> racehispan        0.0595        0.1422         -0.3498          .    0.0827
#> racewhite         0.0973        0.6550         -1.8819          .    0.5577
#> nodegree          0.7081        0.5967          0.2450          .    0.1114
#> married           0.1892        0.5128         -0.8263          .    0.3236
#> re74           2095.5737     5619.2365         -0.7211     0.5181    0.2248
#> re75           1532.0553     2466.4844         -0.2903     0.9563    0.1342
#>            eCDF Max
#> distance     0.6413
#> age          0.1577
#> educ         0.1114
#> raceblack    0.6404
#> racehispan   0.0827
#> racewhite    0.5577
#> nodegree     0.1114
#> married      0.3236
#> re74         0.4470
#> re75         0.2876
#>
#>
#> Summary of Balance for Matched Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5113        0.4932          0.0835     1.0779    0.0253
#> age              26.0721       24.9324          0.1593     0.4302    0.0931
#> educ             10.4144       10.3514          0.0314     0.6279    0.0171
#> raceblack         0.7387        0.7252          0.0372          .    0.0135
#> racehispan        0.0991        0.0946          0.0190          .    0.0045
#> racewhite         0.1622        0.1802         -0.0608          .    0.0180
#> nodegree          0.6667        0.6396          0.0594          .    0.0270
#> married           0.1892        0.2297         -0.1035          .    0.0405
#> re74           3016.7936     2277.2506          0.1513     1.8730    0.0565
#> re75           2023.1731     1525.9838          0.1544     2.0215    0.0434
#>            eCDF Max Std. Pair Dist.
#> distance     0.1441          0.0862
#> age          0.3198          0.9477
#> educ         0.0586          0.7324
#> raceblack    0.0135          0.0565
#> racehispan   0.0045          0.4924
#> racewhite    0.0180          0.3236
#> nodegree     0.0270          0.5725
#> married      0.0405          0.4722
#> re74         0.2117          0.5512
#> re75         0.1081          0.5885
#>
#> Percent Balance Improvement:
#>            Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max
#> distance              95.4       42.5      93.3     77.5
#> age                   48.5       -2.7     -14.5   -102.8
#> educ                  42.9       33.7      50.8     47.4
#> raceblack             97.9          .      97.9     97.9
#> racehispan            94.6          .      94.6     94.6
#> racewhite             96.8          .      96.8     96.8
#> nodegree              75.7          .      75.7     75.7
#> married               87.5          .      87.5     87.5
#> re74                  79.0        4.6      74.9     52.6
#> re75                  46.8    -1474.9      67.6     62.4
#>
#> Sample Sizes:
#>               Control Treated
#> All            429.       185
#> Matched (ESS)  131.78     111
#> Matched        146.       111
#> Unmatched      283.        74
#>
# Optimal full PS matching for the ATE within calipers on
# PS, age, and educ

m.out4 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde,
                  method = "full", estimand = "ATE",
                  caliper = c(.1, age = 2, educ = 1),
                  std.caliper = c(TRUE, FALSE, FALSE))
m.out4
#> A matchit object
#>  - method: Optimal full matching
#>  - distance: Propensity score [caliper]
#>              - estimated with logistic regression
#>  - caliper: <distance> (0.029), age (2), educ (1)
#>  - number of obs.: 614 (original), 314 (matched)
#>  - target estimand: ATE
#>  - covariates: age, educ, race, nodegree, married, re74, re75
summary(m.out4)
#>
#> Call:
#> matchit(formula = treat ~ age + educ + race + nodegree + married +
#>     re74 + re75, data = lalonde, method = "full", estimand = "ATE",
#>     caliper = c(0.1, age = 2, educ = 1), std.caliper = c(TRUE,
#>         FALSE, FALSE))
#>
#> Summary of Balance for All Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5774        0.1822          1.7569     0.9211    0.3774
#> age              25.8162       28.0303         -0.2419     0.4400    0.0813
#> educ             10.3459       10.2354          0.0448     0.4959    0.0347
#> raceblack         0.8432        0.2028          1.6708          .    0.6404
#> racehispan        0.0595        0.1422         -0.2774          .    0.0827
#> racewhite         0.0973        0.6550         -1.4080          .    0.5577
#> nodegree          0.7081        0.5967          0.2355          .    0.1114
#> married           0.1892        0.5128         -0.7208          .    0.3236
#> re74           2095.5737     5619.2365         -0.5958     0.5181    0.2248
#> re75           1532.0553     2466.4844         -0.2870     0.9563    0.1342
#>            eCDF Max
#> distance     0.6444
#> age          0.1577
#> educ         0.1114
#> raceblack    0.6404
#> racehispan   0.0827
#> racewhite    0.5577
#> nodegree     0.1114
#> married      0.3236
#> re74         0.4470
#> re75         0.2876
#>
#>
#> Summary of Balance for Matched Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.3515        0.3475          0.0176     1.0152    0.0145
#> age              22.4217       22.0276          0.0431     0.8438    0.0155
#> educ             10.7009       10.6448          0.0227     1.0746    0.0102
#> raceblack         0.4618        0.4586          0.0083          .    0.0032
#> racehispan        0.1465        0.1079          0.1294          .    0.0386
#> racewhite         0.3917        0.4335         -0.1055          .    0.0418
#> nodegree          0.5792        0.5823         -0.0066          .    0.0031
#> married           0.2862        0.2673          0.0422          .    0.0189
#> re74           2687.3378     2948.8704         -0.0442     1.0844    0.0401
#> re75           1444.8728     1750.7094         -0.0939     1.2107    0.0528
#>            eCDF Max Std. Pair Dist.
#> distance     0.0625          0.0449
#> age          0.1091          0.1233
#> educ         0.0557          0.2067
#> raceblack    0.0032          0.0325
#> racehispan   0.0386          0.5009
#> racewhite    0.0418          0.3666
#> nodegree     0.0031          0.2018
#> married      0.0189          0.5267
#> re74         0.1920          0.5462
#> re75         0.1508          0.6766
#>
#> Percent Balance Improvement:
#>            Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max
#> distance              99.0       81.6      96.2     90.3
#> age                   82.2       79.3      81.0     30.8
#> educ                  49.2       89.7      70.6     50.0
#> raceblack             99.5          .      99.5     99.5
#> racehispan            53.3          .      53.3     53.3
#> racewhite             92.5          .      92.5     92.5
#> nodegree              97.2          .      97.2     97.2
#> married               94.2          .      94.2     94.2
#> re74                  92.6       87.7      82.2     57.1
#> re75                  67.3     -327.8      60.7     47.6
#>
#> Sample Sizes:
#>               Control Treated
#> All            429.    185.
#> Matched (ESS)  137.58   40.02
#> Matched        203.    111.
#> Unmatched      226.     74.
#>
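The Percent Balance Improvement table above can be checked by hand against the two balance tables. The following is a minimal sketch, assuming the improvement is the relative reduction in the absolute value of each statistic, with variance ratios compared on the log scale. `pct_improvement()` is a hypothetical helper written for illustration, not a MatchIt function, and the inputs are the rounded values printed above, so results may differ from the table in the last digit.

```r
# Hypothetical helper (not part of MatchIt): percent reduction in the
# absolute value of a balance statistic from before to after matching.
pct_improvement <- function(before, after, log_scale = FALSE) {
  if (log_scale) {
    # Assumption: variance ratios are compared by their distance from 1
    # on the log scale, so a ratio of 1 corresponds to 0.
    before <- log(before)
    after <- log(after)
  }
  100 * (abs(before) - abs(after)) / abs(before)
}

# Std. Mean Diff. for distance and age in m.out4 (rounded values from above)
smd <- pct_improvement(before = c(distance = 1.7569, age = -0.2419),
                       after  = c(distance = 0.0176, age =  0.0431))
round(smd, 1)

# Var. Ratio for age in m.out4 (rounded values from above)
vr <- pct_improvement(before = 0.4400, after = 0.8438, log_scale = TRUE)
round(vr, 1)
```

A negative value, such as the one for the re75 variance ratio, means the statistic moved further from its ideal value after matching.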
# Subclassification on a logistic PS with 10 subclasses after
# discarding controls outside common support of PS

s.out1 <- matchit(treat ~ age + educ + race + nodegree +
                    married + re74 + re75, data = lalonde,
                  method = "subclass", distance = "glm",
                  discard = "control", subclass = 10)
s.out1
#> A matchit object
#>  - method: Subclassification (10 subclasses)
#>  - distance: Propensity score [common support]
#>              - estimated with logistic regression
#>  - common support: control units dropped
#>  - number of obs.: 614 (original), 557 (matched)
#>  - target estimand: ATT
#>  - covariates: age, educ, race, nodegree, married, re74, re75
summary(s.out1)
#>
#> Call:
#> matchit(formula = treat ~ age + educ + race + nodegree + married +
#>     re74 + re75, data = lalonde, method = "subclass", distance = "glm",
#>     discard = "control", subclass = 10)
#>
#> Summary of Balance for All Data:
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5774        0.1822          1.7941     0.9211    0.3774
#> age              25.8162       28.0303         -0.3094     0.4400    0.0813
#> educ             10.3459       10.2354          0.0550     0.4959    0.0347
#> raceblack         0.8432        0.2028          1.7615          .    0.6404
#> racehispan        0.0595        0.1422         -0.3498          .    0.0827
#> racewhite         0.0973        0.6550         -1.8819          .    0.5577
#> nodegree          0.7081        0.5967          0.2450          .    0.1114
#> married           0.1892        0.5128         -0.8263          .    0.3236
#> re74           2095.5737     5619.2365         -0.7211     0.5181    0.2248
#> re75           1532.0553     2466.4844         -0.2903     0.9563    0.1342
#>            eCDF Max
#> distance     0.6444
#> age          0.1577
#> educ         0.1114
#> raceblack    0.6404
#> racehispan   0.0827
#> racewhite    0.5577
#> nodegree     0.1114
#> married      0.3236
#> re74         0.4470
#> re75         0.2876
#>
#> Summary of Balance Across Subclasses
#>            Means Treated Means Control Std. Mean Diff. Var. Ratio eCDF Mean
#> distance          0.5774        0.5710          0.0293     0.9338    0.0158
#> age              25.8162       25.3714          0.0622     0.4577    0.0866
#> educ             10.3459       10.4094         -0.0316     0.6894    0.0150
#> raceblack         0.8432        0.8262          0.0469          .    0.0170
#> racehispan        0.0595        0.0676         -0.0343          .    0.0081
#> racewhite         0.0973        0.1062         -0.0302          .    0.0089
#> nodegree          0.7081        0.6782          0.0658          .    0.0299
#> married           0.1892        0.1785          0.0274          .    0.0107
#> re74           2095.5737     2232.5096         -0.0280     1.3102    0.0449
#> re75           1532.0553     1643.4179         -0.0346     1.4216    0.0472
#>            eCDF Max
#> distance     0.0541
#> age          0.3043
#> educ         0.0425
#> raceblack    0.0170
#> racehispan   0.0081
#> racewhite    0.0089
#> nodegree     0.0299
#> married      0.0107
#> re74         0.2731
#> re75         0.1841
#>
#> Percent Balance Improvement:
#>            Std. Mean Diff. Var. Ratio eCDF Mean eCDF Max
#> distance              98.4       -1.4      95.8     91.6
#> age                   79.9       -4.0      -6.4    -92.9
#> educ                  42.6      -39.0      56.7     61.9
#> raceblack             97.3          .      97.3     97.3
#> racehispan            90.2          .      90.2     90.2
#> racewhite             98.4          .      98.4     98.4
#> nodegree              73.1          .      73.1     73.1
#> married               96.7          .      96.7     96.7
#> re74                  96.1     -152.9      80.0     38.9
#> re75                  88.1      -48.7      64.8     36.0
#>
#> Sample Sizes:
#>               Control Treated
#> All            429.       185
#> Matched (ESS)   72.59     185
#> Matched        372.       185
#> Unmatched        0.         0
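
The "Matched (ESS)" row reports an effective sample size, which falls below the matched count when the matching weights are unequal. The following is a minimal sketch, assuming the common formula of the squared sum of the weights divided by the sum of the squared weights. `ess()` is a hypothetical helper, and the weights are made up for illustration rather than taken from `s.out1`.

```r
# Hypothetical helper (not part of MatchIt): effective sample size
# of a set of weights, (sum of w)^2 / (sum of w^2).
ess <- function(w) sum(w)^2 / sum(w^2)

# Equal weights: the ESS equals the number of units
w_equal <- rep(1, 10)
ess(w_equal)

# Unequal weights (illustrative values only): the ESS is smaller
# than the number of units
w_unequal <- c(4, 2, 1, 1, 0.5, 0.5, 0.25, 0.25, 0.25, 0.25)
ess(w_unequal)
```

With a fitted object, the same helper could be applied within each treatment group to the `weights` column of the data frame returned by `match.data()`.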