Matching methods for causal inference match similar units together before estimating treatment effects from observational data, in order to reduce the bias introduced by confounding variables. Crucially, a distance measure must be chosen that determines what it means for units to be “similar”. While many matching methods match units based off one-dimensional summaries of their covariates (such as propensity or prognostic scores), Almost Matching Exactly (AME) methods match units that are directly similar in their covariates, with similarity on more important covariates prioritized. This results in matches that are both interpretable and high quality. For more details on Almost Matching Exactly, visit our website here.
This package has implementations for two algorithms in the AME framework that are designed for discrete observational data (that is, with discrete, or categorical, covariates): FLAME (Fast, Large-scale Almost Matching Exactly) and DAME (Dynamic Almost Matching Exactly). FLAME and DAME are efficient algorithms that match units via a learned, weighted Hamming distance that determines which covariates are more important to match on. DAME finds optimal sets of covariates that units can match on, where optimality is determined by how important covariates are. FLAME approximates the DAME solution, greedily searching for good covariate sets. More specifically, for each treated unit t, DAME solves the following optimization problem:
$$
\text{argmax}_{\boldsymbol{\theta} \in \{0,
1\}^p}\;\boldsymbol{\theta}^T\mathbf{w}\quad\text{s.t.}\\\quad \exists
\ell\;\:\text{with}\;\: T_{\ell} = 0 \;\:\text{and}\;\:
\mathbf{x}_{\ell} \circ \boldsymbol{\theta} = \mathbf{x}_{t} \circ
\boldsymbol{\theta}
$$ where ∘ denotes the Hadamard
product, Tℓ denotes
treatment of unit ℓ, and xt ∈ ℝp
denotes the covariates of unit t. That is, given fixed covariate
importances or weights w, DAME finds the highest
weight covariate set for which t can be matched to at least one
control unit. In practice, DAME and FLAME do not iterate over units, but
rather over covariate sets. DAME solves this problem efficiently through
a downward closure property: because matching on an additional covariate
will never be detrimental, DAME only considers matching on a covariate
set if it has already considered matching on all of its supersets. FLAME
approximates the solution to this problem via a greedy, backwards
stepwise selection procedure: it iteratively matches units on a set of
covariates and then drops a covariate from future consideration.
Typically, in practice, the weights w are determined
dynamically; for each set of covariates under consideration, a
regression is run on a held out dataset to predict how well they jointly
predict the outcome. Their error in doing so is known as the
Predictive Error or P
E
. To ensure that
FLAME’s greedy approach also makes a reasonable number of matches, out
of the covariate sets under consideration, FLAME matches on the one
maximizing match quality M
Q
, defined as
M
Q
:= C ⋅ B
F
− P
E
,
where C is a hyperparameter.
The balancing factor, B
F
, at an iteration
is defined as the proportion of control units, plus the proportion of
treated units, that are matched by the update to the next covariate set.
In this way, FLAME encourages making many matches (lowering variance of
treatment effect estimates) and matching on covariates important to the
outcome (lowering bias of treatment effect estimates).
In both algorithms, the main matched group (MMG) of a unit i is defined as the units it first matches to; that is, that it matches to on the highest quality covariate set considered. Both algorithms allow for matching with replacement (in which units can match on multiple different covariate sets); this allows units to belong to multiple matched groups (MGs) – once in their MMG and several times more in the MMGs of other units. For more details, please see (Wang et al. 2020) linked here and (Dieng et al. 2019) linked here.
To illustrate the basic package functionality, we’ll work with some
toy data we can create using the included FLAME::gen_data
function.
set.seed(24)
library(FLAME)
n <- 500 # Observations
p <- 5 # Covariates
# Data to match
data <- gen_data(n, p)
# Data to compute PE and determine which covariate sets to match on
holdout <- gen_data(n, p)
Note that all our covariates are categorical because FLAME and DAME are designed to work with discrete data:
head(data[, 1:p])
#> X1 X2 X3 X4 X5
#> 1 1 2 2 1 4
#> 2 2 3 3 3 1
#> 3 3 2 1 3 1
#> 4 2 1 2 1 2
#> 5 3 3 1 4 2
#> 6 2 2 2 3 1
While the covariates needn’t be coded as factors before being passed
to FLAME
or DAME
, all covariates will be
assumed to be categorical. A user may of course bin continuous variables
prior to passing them to FLAME
or DAME
;
however, we generally caution against this approach as naive coarsenings
of continuous variables can fail to properly adjust for bias when
estimating treatment effects. A coarsening may be acceptable in certain
domain-specific contexts if performed by a domain expert. Otherwise,
there are other algorithms in the AME framework that allow mixed
discrete and continuous covariates. For more details, see (Parikh, Rudin, and Volfovsky 2018) linked here and (Morucci et al. 2020) linked here.
In addition to the covariates to match on, data
contains
an outcome and a treated column:
These can have any names if they are appropriately relayed to the
matching algorithm via treated_column_name
or
outcome_column_name
, but default to 'treated'
and 'outcome'
.
The outcome may be continuous (if numeric with more than two unique values), binary (if a two-level factor or numeric with unique values 0 and 1), or multi-class (if a factor with more than two levels). The outcome may also be omitted from the matching data, but it must be included in the holdout data as it is required to learn covariate importance. Both FLAME and DAME focus on binary treatments and the treatment column must accordingly be coded either as logical or as binary numeric.
Given data of this general form, we can run FLAME and DAME with their
default parameters. This will match units on the covariates – here,
X1
, X2
, X3
, X4
,
X5
– and output information about the matches that were
made. Here, we call FLAME
, though the call to
DAME
is analogous.
FLAME_out <- FLAME(data = data, holdout = holdout)
#> FLAME stopping: predictive error would have risen 25% above the baseline.
The resulting stopping message informs us that FLAME stopped dropping covariates once they became too important for predicting the outcome; see the Early Stopping Arguments subsection below for more details.
FLAME
and DAME
return a list of class
ame
, which by default contains 4 entries:
The first, FLAME_out$data
contains the original data
frame with several modifications:
FLAME_out$data$matched
, that indicates whether or not a
unit was matched. This can be useful if, for example, you’d like to use
only the units that were matched for subsequent analysis:There is an extra numeric column,
FLAME_out$data$weight
that denotes on how many different
sets of covariates a unit was matched. By default, this will be 1 if a
unit is matched and 0 otherwise. With the replace = TRUE
argument, however, units are allowed to match several times on multiple
sets of covariates and their values for weight
can
therefore be greater than 1. These weights can be used when doing
regression adjustment after matching to estimate treatment
effects.
The columns denoting treatment and outcome in the data will be moved after all covariate columns (this is likely to change in a future release).
If replace = FALSE
, there will be an extra column
containing a matched group identifier for each unit. All units with the
same identifier share the same main matched group.
If estimate_CATEs = TRUE
, there will be an extra
column containing a CATE estimate for each unit.
The second, MGs
is a list, each entry of which contains
the units in a single matched group.
That is, the first unit in the dataset was matched to unit 84.
The third, cov_sets
is a list whose i’th entry contains
the covariate set not matched on in the i’th iteration.
Thus, we can see that FLAME first matched exactly on all covariates,
then dropped covariate X5
, then covariates X4
and X5
, and then terminated.
And the fourth, info
, is a list containing miscellaneous
information about the matching specifications. This is primarily used
internally by *.ame
methods.
After either FLAME
or DAME
has been run,
the resulting ame
object and the associated matched data
can be analyzed in a variety of ways. The FLAME
package
provides functionality for a few quick, post-matching analyses, via
print
, plot
, and summary
methods
and the functions MG
and CATE
.
Printing an ame
object will give some information about
the matching procedure and estimate an average treatment effect, if
possible:
print(FLAME_out)
#> An object of class `ame`:
#> FLAME ran for 3 iterations, matching 434 out of 500 units without
#> replacement.
#> The average treatment effect of treatment `treated` on outcome `outcome` is
#> estimated to be 4.8666474.
Summarizing an ame
object gives additional information
on 1. the number of matches formed, 2. average treatment effect
estimates, and 3. the matched groups formed:
(FLAME_summ <- summary(FLAME_out))
#> Number of Units:
#> Control Treated
#> All 244 256
#> Matched 214 220
#> Unmatched 30 36
#>
#> Average Treatment Effects:
#> Mean Variance
#> All 4.87 0.00941
#> Treated 4.86 0.0105
#> Control 4.87 0.0105
#>
#> Matched Groups:
#> Number 128
#> Median size 3
#> Highest quality: 228 and 27
In the above, the ‘highest quality’ matched groups – 228 and 27 – are the two largest made on the highest quality covariate set. Manual inspection of these groups and their CATE estimates can be useful.
Plotting an ame
object by default outputs 4 plots, any
subset of which can be selected via the which_plots
argument. They illustrate:
return_PE = TRUE
to the original FLAME
or
DAME
call. If, instead, there were many matches on large
numbers of covariates and also on small numbers, a user may wish to
rerun the algorithm with an additional early stopping constraint to
ensure that a greater fraction of the matches are high quality.FLAME_out$cov_sets
, below, we see that on Iteration 1
matches were made on all variables, on Iteration 2 they were made on all
variables but X5
, and on Iteration 3 they were made on all
variables but X5
and X4
.Lastly, the functions MG
and CATE
take in
vectors of units and return their matched groups and CATE estimates.
Based off the summary
above, we can run these on some units
belonging to high quality matched groups:
high_quality <- FLAME_summ$MG$highest_quality
MG(high_quality, FLAME_out)
#> [[1]]
#> X1 X2 X3 X4 X5 treated outcome
#> 228 3 3 3 3 2 1 34.40411
#> 236 3 3 3 3 2 0 30.61004
#> 309 3 3 3 3 2 0 30.92045
#> 351 3 3 3 3 2 0 30.23918
#> 358 3 3 3 3 2 1 33.57794
#> 473 3 3 3 3 2 1 34.68919
#> 490 3 3 3 3 2 0 30.06894
#>
#> [[2]]
#> X1 X2 X3 X4 X5 treated outcome
#> 27 2 3 2 3 3 1 14.52208
#> 36 2 3 2 3 3 1 15.02916
#> 150 2 3 2 3 3 0 12.22630
#> 163 2 3 2 3 3 0 10.02043
#> 360 2 3 2 3 3 0 11.62386
#> 411 2 3 2 3 3 0 10.40921
CATE(high_quality, FLAME_out)
#> Mean Variance
#> 228 3.764094 0.1473901
#> 27 3.705668 0.3294528
Each element in the output of MG
corresponds to a
different input unit. In this case, we see that both units matched on
all covariates, which is what makes them high quality.
FLAME and DAME are implemented using efficient bit-vectors
computations, which makes them scalable to datasets large in both n and p. Nevertheless, runtime may be
appreciable (on the order of hours) for datasets with hundreds of
thousands of rows and dozens of covariates. The most expensive tasks
that contribute to this runtime are 1. matching units given a set of
covariates and 2. finding the next set of covariates to match on. While
there is not much that can be done about 1. there is a good deal of
flexibility the user has in controlling the time taken by 2. In this
step, every covariate set eligible to be matched on is used to predict
the outcome on the holdout data; assuming matching is not terminated
early, this entails running O(p2) many
regressions for FLAME and O(2p) for DAME.
Furthermore, both of the regression methods implemented in this package
for computing PE, PE_method = 'ridge'
and
PE_method = 'xgb'
perform cross-validation in order to
ensure that the regressions are of high quality, imposing additional
computational cost on the procedure. However, if a user believes that
more efficient methods (e.g. vanilla linear regression) or different
specifications of the implemented methods (e.g. fewer folds of cross
validation or fewer parameters to cross validate over) are sufficient to
predict the outcome well, then a function can be supplied to
PE_method
giving the user complete control over the
prediction procedure and allowing for considerable speedups. Examples
are given in the section Computing Predictive Error below.
If a user wishes to match using DAME on a dataset with many
covariates, a reasonable option may be to start by using FLAME for a few
iterations to rapidly eliminate irrelevant covariates and then switch to
DAME to generate higher quality matches on higher quality sets of
covariates. This can be done by supplying the n_flame_iters
argument to DAME
. Setting this argument to 5, for example,
rapidly eliminates 5 covariates via FLAME’s greedy backwards selection
procedure and then switches to DAME. In the presence of irrelevant
covariates (which are likely to exist in high-dimensional settings),
such an approach should yield nearly identical treatment effect
estimates much more quickly.
Our implementation of FLAME and DAME runs a different regression for
each covariate set under consideration, to ensure that it is capable of
predicting the outcome well. However, if a user already has measures of
covariate importance that they would like to use within the matching
procedure, they can pass these to the algorithm via the
weights
parameter. In this case, 0
regressions have to be run because a sequence of covariate sets is
already implicitly defined by the weights, dramatically accelerating the
matching procedure. One way to produce such weights would be through an
initial regression of the outcome on all covariates, from which some
measure of covariate importance is computed and supplied to the matching
procedure. The underlying assumption would be that the pre-computed
covariate importances are good proxies for the accuracy of the
regressions computed on covariate subsets. This may not hold, for
example, in cases where variables are highly correlated.
Lastly, by default the estimate_CATEs
argument of
FLAME
and DAME
is set to FALSE
and CATEs are not estimated during the matching procedure. Afterwards,
when applying print
or summary
methods to the
ame
object returned by FLAME
or
DAME
, CATEs are estimated and used to estimate average
treatment effects. Typically, this takes negligible time; however, for
very large datasets, estimating CATEs while matching by setting
estimate_CATEs = TRUE
will greatly accelerate printing and
summarization of the resulting ame
object.
The current implementation of FLAME and DAME is not capable of handling datasets with millions of rows and dozens of covariates due to memory constraints. A future release of the package will include a database implementation of the algorithms, allowing for matching in datasets too large to fit in memory and for faster matching on very large datasets in general. More details, see (Wang et al. 2020, sec. 4.1).
Here we discuss additional considerations that may be taken into account when matching and highlight related package functionality along the way.
In the code we displayed above, matching was by default performed
without replacement, meaning that once a MMG is constructed for a unit,
it cannot be matched again. We can toggle this behavior using the
replace
argument to FLAME
and
DAME
:
FLAME_wo_replace <- FLAME_out # From previous run with replace = FALSE, as per the default
FLAME_w_replace <- FLAME(data, holdout, replace = TRUE)
#> FLAME stopping: predictive error would have risen 25% above the baseline.
Let’s start by comparing the main matched groups of an example unit, unit 32:
In this case, they are the same. Using the MG
function,
however, allows us to look at all the matched groups a unit is
part of by specifying multiple = TRUE
:
print(MG(32, FLAME_wo_replace)) # run with multiple = FALSE bc no replacement
#> [[1]]
#> X1 X2 X3 X4 X5 treated outcome
#> 32 3 3 1 1 3 1 25.22405
#> 242 3 3 1 1 3 0 19.88697
print(MG(32, FLAME_w_replace, multiple = TRUE))
#> [[1]]
#> [[1]][[1]]
#> X1 X2 X3 treated outcome
#> 5 3 3 1 0 19.49019
#> 12 3 3 1 0 20.91604
#> 32 3 3 1 1 25.22405
#> 90 3 3 1 0 21.28322
#> 94 3 3 1 0 21.45683
#> 111 3 3 1 1 24.36747
#> 118 3 3 1 0 20.92010
#> 131 3 3 1 0 21.15464
#> 242 3 3 1 0 19.88697
#> 271 3 3 1 1 24.81733
#> 357 3 3 1 0 20.75999
#> 366 3 3 1 0 20.60477
#> 370 3 3 1 0 18.50099
#> 388 3 3 1 0 21.40572
#> 403 3 3 1 0 20.59634
#>
#> [[1]][[2]]
#> X1 X2 X3 X5 treated outcome
#> 12 3 3 1 3 0 20.91604
#> 32 3 3 1 3 1 25.22405
#> 90 3 3 1 3 0 21.28322
#> 94 3 3 1 3 0 21.45683
#> 242 3 3 1 3 0 19.88697
#> 271 3 3 1 3 1 24.81733
#> 366 3 3 1 3 0 20.60477
#> 370 3 3 1 3 0 18.50099
#>
#> [[1]][[3]]
#> X1 X2 X3 X4 X5 treated outcome
#> 32 3 3 1 1 3 1 25.22405
#> 242 3 3 1 1 3 0 19.88697
Again, we see that the main matched groups (the entries corresponding
to matches on the greatest number of covariates) of each unit are the
same. However, running on the data matched with replacement yields two
more matched groups. We can see that whereas the units in the main
matched group agree on X1
through X5
, the
secondary matched groups of unit 32 are of lower quality in that units
only match on 3 or 4 covariates; they will thus be vulnerable by any
confounding bias due to differing values of the remaining
covariates.
Why then might we want to match with replacement? To see, let’s start
by getting the additional units (and not their associated data) in one
of the secondary matched groups of unit 32 from the matching with
replacement. We can do this by specifying id_only = TRUE
in
the call to MG
:
(unit32_MGs <- MG(32, FLAME_w_replace, multiple = TRUE, id_only = TRUE)[[1]])
#> [[1]]
#> [1] "5" "12" "32" "90" "94" "111" "118" "131" "242" "271" "357" "366"
#> [13] "370" "388" "403"
#>
#> [[2]]
#> [1] "12" "32" "90" "94" "242" "271" "366" "370"
#>
#> [[3]]
#> [1] "32" "242"
(only_in_secondary <- setdiff(unit32_MGs[[2]], unit32_MGs[[3]]))
#> [1] "12" "90" "94" "271" "366" "370"
Now let’s look at the matched groups formed for these units when we matched without replacement:
MG(only_in_secondary, FLAME_wo_replace)
#> [[1]]
#> NULL
#>
#> [[2]]
#> NULL
#>
#> [[3]]
#> X1 X2 X3 X4 X5 treated outcome
#> 94 3 3 1 2 3 0 21.45683
#> 271 3 3 1 2 3 1 24.81733
#> 370 3 3 1 2 3 0 18.50099
#>
#> [[4]]
#> X1 X2 X3 X4 X5 treated outcome
#> 94 3 3 1 2 3 0 21.45683
#> 271 3 3 1 2 3 1 24.81733
#> 370 3 3 1 2 3 0 18.50099
#>
#> [[5]]
#> NULL
#>
#> [[6]]
#> X1 X2 X3 X4 X5 treated outcome
#> 94 3 3 1 2 3 0 21.45683
#> 271 3 3 1 2 3 1 24.81733
#> 370 3 3 1 2 3 0 18.50099
We can see that when matching without replacement, only half of these units were matched! In this way they contribute nothing to our understanding of treatment effects in the data. Thus, while matching without replacement will only create matched groups of the highest possible quality for each unit, matching with replacement has the advantage of matching more units. There are also some differences as to the dependence induced between units that can affect how variance for treatment effect estimates is computed.
Note also that the MG
column in the data
entry of the resulting ame
object does not exist:
This is because when matching with replacement such a column is
insufficient to describe the matched groups, unlike in the
replace = FALSE
case.
As mentioned in the Introduction, FLAME drops covariate that lead to
the greatest match quality, defined: M
Q
:= C ⋅ B
F
− P
E
.
The hyperparameter C controls
the weight given to making many matches and making high quality matches
in the FLAME objective. In an extreme case, a value of C = 0 only makes matches that are as
high quality as possible; however, doing so may result in only very few
units being matched. On the other extreme, C tending to positive infinity will
push FLAME to make as many matches as possible. To illustrate:
# early_stop arguments for illustration only
(FLAME_C_small <-
FLAME(data, holdout, C = 0.01,
early_stop_iterations = 4, early_stop_epsilon = Inf, verbose = 0))
#> An object of class `ame`:
#> FLAME ran for 4 iterations, matching 469 out of 500 units without
#> replacement.
#> The average treatment effect of treatment `treated` on outcome `outcome` is
#> estimated to be 4.6762678.
(FLAME_C_big <-
FLAME(data, holdout, C = 10000,
early_stop_iterations = 4, early_stop_epsilon = Inf, verbose = 0))
#> An object of class `ame`:
#> FLAME ran for 4 iterations, matching 489 out of 500 units without
#> replacement.
#> The average treatment effect of treatment `treated` on outcome `outcome` is
#> estimated to be 5.9484195.
We see that higher C leads to more matches being made as that is the priority in the objective. However, the estimated ATE is around 5.9, compared to 4.7 from the small C setting. For this simulated data, the true ATE is 5; more matches came at the expense of lower quality and therefore greater bias in estimating treatment effects.
We have found C = 0.1 to be
a good default. However, because the matching procedure in and of itself
does not involve the outcomes of the matching data data
,
the output of FLAME
can be compared across multiple choices
of C
to get a better sense of the data’s dependence on
it.
By default, FLAME
and DAME
use ridge
regression with 5-fold cross validation to compute PE from the holdout
data. The other implemented option, selected by specifying
PE_method = "xgb"
uses gradient boosting, cross-validated
over a wide range of parameter values to do so. However, a user may want
to supply other functions, either for efficiency reasons or because they
believe they are better suited to outcome regression for their data.
This can be done by passing a function to the PE_method
argument. This function must accept two arguments – a data frame of
covariates and a vector of outcome values, in that order – and return a
vector of in-sample predictions which, if the outcome is binary or
multi-class, must be maximum probability class labels.
For example, we can compute PE using linear models as follows:
my_PE_linear <- function(X, Y) {
fit <- lm(Y ~ ., data = as.data.frame(cbind(X, Y = Y)))
return(fit$fitted.values)
}
FLAME_out <- FLAME(data = data, holdout = holdout, PE_method = my_PE_linear)
#> FLAME stopping: predictive error would have risen 25% above the baseline.
We emphasize that these functions must take in a data frame
of covariates as their first argument. Many R implementations of machine
learning algorithms require that covariates be passed in matrix form,
with factor variables binarized. In such cases, the function passed to
PE_method
should convert the data frame appropriately, as
with model.matrix
. Such an example, in which LASSO is used
to compute PE, is shown below:
my_PE_lasso <- function(X, Y) {
df <- as.data.frame(cbind(X, Y = Y))
X <- model.matrix(Y ~ ., data = df)
fit <- glmnet::cv.glmnet(X, Y, alpha = 1, nfolds = 5)
return(predict(fit, X))
}
FLAME_out <- FLAME(data = data, holdout = holdout, PE_method = my_PE_lasso)
#> FLAME stopping: predictive error would have risen 25% above the baseline.
If the outcome is binary or multiclass, the supplied function must
return the maximum probability class instead of, for example, class
probabilities. As an example, we use LASSO-penalized logistic regression
with the outcome coded as binary 0-1 and ensure the predict
method is called with argument type = 'class'
.
# Recode the outcome to numeric 0 1
data$outcome <- ifelse(data$outcome > median(data$outcome), 1, 0)
holdout$outcome <- ifelse(holdout$outcome > median(holdout$outcome), 1, 0)
my_PE_lr <- function(X, Y) {
df <- as.data.frame(cbind(X, Y = Y))
X <- model.matrix(Y ~ ., data = df)
fit <- glmnet::cv.glmnet(X, Y, alpha = 1, nfolds = 5, family = 'binomial')
return(predict(fit, X, type = 'class'))
}
FLAME_out <- FLAME(data = data, holdout = holdout, PE_method = my_PE_lr)
#> FLAME stopping: predictive error would have risen 25% above the baseline.
The same should be done if the outcome is coded as a factor or if the outcome has more than 2 classes.
By default, FLAME and DAME terminate when all covariates have been dropped or all control / treatment units have been matched. But there are various early stopping arguments that can be supplied to alter this behavior. In all cases, however, the algorithms still terminate if all covariates have been dropped or all control / treatment units have been matched, even if the user-specified stopping condition has not yet been met.
We can set a maximum number of iterations to run for via the
early_stop_iterations
argument. A value of 1, for example,
corresponds to a single round of exact matching on all covariates:
(FLAME_exact <- FLAME(data, holdout, early_stop_iterations = 1))
#> FLAME stopping: completed 1 iterations
#> An object of class `ame`:
#> FLAME ran for 1 iterations, matching 192 out of 500 units without
#> replacement.
FLAME_exact$cov_sets # No covariate sets are dropped
#> [[1]]
#> NULL
The early_stop_epsilon
argument is important for
ensuring that matches are not made on sets of covariates too poor to
predict the outcome well. If the matching algorithm attempts to drop a
covariate that would raise the PE above (1 + early_stop_epsilon) times
the baseline PE (the PE before any covariates have been dropped), it
will terminate beforehand. It defaults to 0.25, which we have found to
be a reasonable default, though we recommend performing the matching
with multiple values of this argument. The predictive error at each
iteration can be returned by specifying return_pe = TRUE
as
well. If a large spike is seen, relative to the scale of the outcome, a
lower value of early_stop_epsilon
can be set and those
matches evaluated.
Missingness is common in large, modern datasets; FLAME
and DAME
offer several options for how to handle missing
values. Let’s start by adding some missingness to our data:
covs <- as.matrix(data[, 1:p])
inds <- sample(1:(n * p), size = round(0.05 * n * p), replace = FALSE)
covs[inds] <- NA
data[, 1:p] <- covs
The missing_data
argument governs which of 3 methods
should be used to handle missingness in the data to be matched. If
missing_data = 'drop'
, units with any missing data
are effectively dropped from the data and not matched:
FLAME_drop <- FLAME(data, holdout, missing_data = 'drop')
#> FLAME stopping: predictive error would have risen 25% above the baseline.
units_with_missingness <- which(apply(covs, 1, function(x) any(is.na(x))))
all(!FLAME_drop$data$matched[units_with_missingness])
#> [1] TRUE
We can see that any units with missingness are unmatched.
If missing_data = 'keep'
, the missing values are
kept in the data; in this case, a unit can only match on sets
containing covariates it is not missing. In a previous run that
generated FLAME_out
, before we introduced missingness, unit
57 matched on all covariates:
(57 %in% units_with_missingness) # an example unit
#> [1] FALSE
MG(57, FLAME_out)
#> [[1]]
#> X1 X2 X3 X4 X5 treated outcome
#> 20 1 2 2 2 3 0 0
#> 57 1 2 2 2 3 1 0
Running with missing_data = 'keep'
, we are still able to
make matches on this unit, however only on a smaller subset of
covariates which necessarily excludes any that unit 57 was missing:
DAME_keep <- DAME(data, holdout, missing_data = 'keep', verbose = 0)
print(data[DAME_keep$MGs[[57]], ])
#> X1 X2 X3 X4 X5 outcome treated
#> 20 1 2 2 2 3 0 0
#> 57 1 2 2 2 3 0 1
print(MG(57, DAME_keep))
#> [[1]]
#> X1 X2 X3 X4 X5 treated outcome
#> 20 1 2 2 2 3 0 0
#> 57 1 2 2 2 3 1 0
Lastly, running with missing_data = 'impute'
, missing
values are imputed via MICE (mice::mice
).
DAME_impute <- DAME(data, holdout, missing_data = 'impute', verbose = 0)
#> Starting imputation of `data`Finished imputation of `data`
print(DAME_impute$data[57, ])
#> X1 X2 X3 X4 X5 outcome treated matched weight MG
#> 57 1 2 2 2 3 0 1 TRUE 1 7
print(MG(57, DAME_impute))
#> [[1]]
#> X1 X2 X3 X4 X5 treated outcome
#> 20 1 2 2 2 3 0 0
#> 57 1 2 2 2 3 1 0
The unit has had a value imputed for X5
and now can make
matches on that covariate. In practice, it might be good to impute
missing data multiple times and look at the range of matches made across
imputations.
For missingness in the holdout data, ‘drop’ and ‘impute’ flags can
also be supplied to missing_holdout
, where they ignore rows
with missingness when computing PE and use imputed values to do so,
respectively. The missing_holdout_imputations
argument also
controls how many imputations should be performed if
missing_holdout = 'impute'
; the predictive error at an
iteration will be computed as the average across all iterations.
In this package, we present implementations of the DAME and FLAME algorithms for causal inference on discrete observational data. The algorithms are efficient and create high-quality, interpretable matches that can be used to estimate treatment effects, with many options for controlling how matches are to be formed, and tools for post-matching analysis. Future work will consist of a database implementation of these algorithms, allowing for analysis of datasets too large to fit in memory, treatment effect estimation for binary outcomes, the ability to parallelize PE computation, and more. Additional information on Almost Matching Exactly, including algorithms for mixed data, academic papers, and introductory videos, can be found on the Almost Matching Exactly website.