Title: | Panel Treatment Effects Tools |
---|---|
Description: | Generic code for estimating treatment effects with panel data. The idea is to break into separate steps organizing the data, looping over groups and time periods, computing group-time average treatment effects, and aggregating group-time average treatment effects. Often, one is able to implement a new identification/estimation procedure by simply replacing the step on estimating group-time average treatment effects. See several different examples of this approach in the package documentation. |
Authors: | Brantly Callaway [aut, cre] |
Maintainer: | Brantly Callaway <[email protected]> |
License: | GPL-3 |
Version: | 1.0.0 |
Built: | 2025-02-14 05:28:46 UTC |
Source: | https://github.com/bcallaway11/ptetools |
Objects of this class hold results on aggregated group-time average treatment effects. This is derived from the AGGTEobj class in the did package.
An object for holding aggregated treatment effect parameters.
aggte_obj( overall.att = NULL, overall.se = NULL, type = "simple", egt = NULL, att.egt = NULL, se.egt = NULL, crit.val.egt = NULL, inf.function = NULL, min_e = NULL, max_e = NULL, balance_e = NULL, DIDparams = NULL )
overall.att |
The estimated overall ATT |
overall.se |
Standard error for overall ATT |
type |
The type of aggregation to be done. Default is "simple". |
egt |
Holds the length of exposure (for dynamic effects), the group (for selective treatment timing), or the time period (for calendar time effects) |
att.egt |
The ATT specific to egt |
se.egt |
The standard error specific to egt |
crit.val.egt |
A critical value for computing uniform confidence bands for dynamic effects, selective treatment timing, or time period effects. |
inf.function |
The influence function of the chosen aggregated parameters |
min_e |
The minimum event time computed in the event study results. This is useful when there are a huge number of pre-treatment periods. |
max_e |
The maximum event time computed in the event study results. This is useful when there are a huge number of post-treatment periods. |
balance_e |
Drops groups that do not have at least |
DIDparams |
A DIDparams object |
an aggte_obj
Class for holding group-time average treatment effects along with their influence function
attgt_if(attgt, inf_func, extra_gt_returns = NULL)
attgt |
group-time average treatment effect |
inf_func |
influence function |
extra_gt_returns |
A place to return anything extra from particular group-time average treatment effect calculations. For DID, this might be something like propensity score estimates, regressions of untreated potential outcomes on covariates. For ife, this could be something like the first step regression 2sls estimates. This argument is also potentially useful for debugging. |
attgt_if
object
Class for holding returns from group-time specific estimates in settings when an influence function is not returned
attgt_noif(attgt, extra_gt_returns = NULL)
attgt |
group-time average treatment effect |
extra_gt_returns |
A place to return anything extra from particular group-time average treatment effect calculations. For DID, this might be something like propensity score estimates, regressions of untreated potential outcomes on covariates. For ife, this could be something like the first step regression 2sls estimates. This argument is also potentially useful for debugging. |
an attgt_noif
object
Aggregate group-time average treatment effects into overall, group, and dynamic effects. This function is only used for (i) computing standard errors using the empirical bootstrap, and (ii) combining distributions at the (g,t) level
attgt_pte_aggregations(attgt.list, ptep)
attgt.list |
list of attgt results from |
ptep |
|
pte_emp_boot
object
Function that actually computes panel treatment effects.
The difference relative to compute.pte is that this function loops over time periods first (instead of groups) and tries to estimate a model for untreated potential outcomes jointly for all groups.
compute.pte(ptep, subset_fun, attgt_fun, ...)
ptep |
|
subset_fun |
This is a function that should take in |
attgt_fun |
This is a function that should work in the case where there is a single group and the "right" number of time periods to recover an estimate of the ATT. For example, in the context of difference in differences, it would need to work for a single group, find the appropriate comparison group (untreated units), find the right time periods (pre- and post-treatment), and then recover an estimate of ATT for that group. It will be called over and over separately by groups and by time periods to compute ATT(g,t)'s. The function needs to work in a very specific way. It should take in the
arguments: If |
... |
extra arguments that can be passed to create the correct subsets
of the data (depending on |
a list containing the following elements:
attgt.list
: list of ATT(g,t) estimates
inffunc
: influence function matrix
extra_gt_returns
: list of extra returns from gt-specific calculations
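The group-by-period loop described above can be sketched in base R. This is only an illustration under simplifying assumptions: `toy_subset` and `toy_attgt` are hypothetical stand-ins for `subset_fun` and `attgt_fun`, the data are simulated, only post-treatment periods are covered, and no influence function is returned. The actual compute.pte does considerably more.

```r
# Simulated balanced panel: 3 periods; G is the first treated period (0 = never treated)
set.seed(1)
n <- 300
panel <- expand.grid(id = 1:n, period = 1:3)
panel$G <- rep(sample(c(0, 2, 3), n, replace = TRUE), times = 3)
panel$Y <- rnorm(nrow(panel)) + 1 * (panel$G > 0 & panel$period >= panel$G)

# Hypothetical subset function: group g plus never-treated units,
# restricted to the base period (g - 1) and the current period tp
toy_subset <- function(data, g, tp) {
  keep <- data$G %in% c(0, g) & data$period %in% c(g - 1, tp)
  data[keep, ]
}

# Hypothetical ATT(g,t) function: a plain 2x2 difference in differences
toy_attgt <- function(gt_data, g, tp) {
  treated <- gt_data$G == g
  post <- gt_data$period == tp
  (mean(gt_data$Y[treated & post]) - mean(gt_data$Y[treated & !post])) -
    (mean(gt_data$Y[!treated & post]) - mean(gt_data$Y[!treated & !post]))
}

# Loop over groups and time periods, calling the two functions each time
attgt.list <- list()
for (g in c(2, 3)) {
  for (tp in 2:3) {
    if (tp < g) next # this sketch skips pre-treatment periods
    gt_data <- toy_subset(panel, g, tp)
    attgt.list[[length(attgt.list) + 1]] <- list(
      group = g, time.period = tp, att = toy_attgt(gt_data, g, tp)
    )
  }
}
res <- do.call(rbind, lapply(attgt.list, as.data.frame))
print(res)
```

Swapping in a different `toy_attgt` while leaving the loop untouched is the sense in which a new estimation procedure only needs to replace the group-time step.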
A function to perform sanity checks and possibly adjust a critical value to form a uniform confidence band
crit_val_checks(crit_val, alp = 0.05)
crit_val |
the critical value |
alp |
the significance level |
a (possibly adjusted) critical value
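The kind of check involved might look like the following sketch. It is a hypothetical re-implementation, not the package's code: the function name `check_crit_val` and the specific rules (fall back to the pointwise critical value when the supplied one is missing, infinite, or smaller than the pointwise one) are assumptions.

```r
# Hypothetical sketch; the checks performed by crit_val_checks may differ
check_crit_val <- function(crit_val, alp = 0.05) {
  pointwise <- qnorm(1 - alp / 2)
  if (is.na(crit_val) || is.infinite(crit_val)) {
    warning("Critical value is NA or infinite; using the pointwise critical value.")
    return(pointwise)
  }
  if (crit_val < pointwise) {
    # A uniform band should never be narrower than a pointwise interval
    warning("Critical value is below the pointwise one; using the pointwise critical value.")
    return(pointwise)
  }
  crit_val
}

ok <- check_crit_val(2.8)
fallback <- suppressWarnings(check_crit_val(NA))
```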
Takes a data.frame and, for a particular group g and time period t, computes an estimate of a group-time average treatment effect and a corresponding influence function using a difference-in-differences approach.
The code relies on gt_data having certain variables defined. In particular, there should be an id column (individual identifier), D (treated group identifier), period (time period), name (equal to "pre" for pre-treatment periods and equal to "post" for post-treatment periods), and Y (outcome). In our case, we call two_by_two_subset, which sets up the data to have this format before the call to did_attgt.
did_attgt(gt_data, xformula = ~1, ...)
gt_data |
data that is "local" to a particular group-time average treatment effect |
xformula |
one-sided formula for covariates used in the propensity score and outcome regression models |
... |
extra function arguments; not used here |
attgt_if
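As a rough illustration of what a group-time function has to produce, the sketch below computes an unconditional two-period difference in differences and its influence function from data in the column layout described above (id, D, period, name, Y). It is a simplification: the data are simulated, no covariates are used, and it does not reproduce the propensity score and outcome regression steps that xformula controls.

```r
# Simulated "local" data for one group and one pair of periods
set.seed(2)
n <- 200
D <- rbinom(n, 1, 0.4)
gt_data <- data.frame(
  id = rep(1:n, times = 2),
  D = rep(D, times = 2),
  period = rep(c(1, 2), each = n),
  name = rep(c("pre", "post"), each = n)
)
gt_data$Y <- rnorm(2 * n) + 0.5 * gt_data$D +
  1 * (gt_data$D == 1 & gt_data$name == "post") # true ATT is 1

# One row per unit: change in outcomes between the pre and post periods
pre <- gt_data[gt_data$name == "pre", ]
post <- gt_data[gt_data$name == "post", ]
post <- post[match(pre$id, post$id), ]
dY <- post$Y - pre$Y
D1 <- pre$D

# Unconditional DID estimate of the ATT
p <- mean(D1)
att <- mean(dY[D1 == 1]) - mean(dY[D1 == 0])

# Influence function of the difference in means (one value per unit)
inf_func <- D1 / p * (dY - mean(dY[D1 == 1])) -
  (1 - D1) / (1 - p) * (dY - mean(dY[D1 == 0]))
se <- sd(inf_func) / sqrt(n)
```

The pair (`att`, `inf_func`) is the kind of content an attgt_if object is meant to hold.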
Holds results from computing dose-specific treatment effects with a continuous treatment
dose_obj( dose, overall_att = NULL, overall_att_se = NULL, overall_att_inffunc = NULL, overall_acrt = NULL, overall_acrt_se = NULL, overall_acrt_inffunc = NULL, att.d = NULL, att.d_se = NULL, att.d_crit.val = NULL, att.d_inffunc = NULL, acrt.d = NULL, acrt.d_se = NULL, acrt.d_crit.val = NULL, acrt.d_inffunc = NULL, pte_params = NULL )
dose |
vector containing the values of the dose used in estimation |
overall_att |
estimate of the overall ATT, the mean of ATT(D) given D > 0 |
overall_att_se |
the standard error of the estimate of overall_att |
overall_att_inffunc |
the influence function for estimating overall_att |
overall_acrt |
estimate of the overall ACRT, the mean of ACRT(D|D) given D > 0 |
overall_acrt_se |
the standard error for the estimate of overall_acrt |
overall_acrt_inffunc |
the influence function for estimating overall_acrt |
att.d |
estimates of ATT(d) for each value of |
att.d_se |
standard error of ATT(d) for each value of |
att.d_crit.val |
critical value to produce pointwise or uniform confidence interval for ATT(d) |
att.d_inffunc |
matrix containing the influence function from estimating ATT(d) |
acrt.d |
estimates of ACRT(d) for each value of |
acrt.d_se |
standard error of ACRT(d) for each value of |
acrt.d_crit.val |
critical value to produce pointwise or uniform confidence interval for ACRT(d) |
acrt.d_inffunc |
matrix containing the influence function from estimating ACRT(d) |
pte_params |
a pte_params object containing other parameters passed to the function |
a dose_obj
object
The main plotting function in the ptetools package. It plots event studies. This function is generic enough that most packages that otherwise use the ptetools package can call it directly to plot an event study.
ggpte(pte_results)
pte_results |
A |
A ggplot object
Plots dose-specific results in applications with a continuous treatment
ggpte_cont(dose_obj, type = "att")
dose_obj |
a |
type |
whether to plot ATT(d) or ACRT(d), defaults to |
A ggplot object
Class that holds causal effect parameter estimates across timing groups and time periods
group_time_att( group, time.period, att, V_analytical, se, crit_val, inf_func, n, W, Wpval, cband, alp, ptep, extra_gt_returns )
group |
numeric vector of groups for ATT(g,t) |
time.period |
numeric vector of time periods for ATT(g,t) |
att |
numeric vector containing the value of ATT(g,t) for corresponding group and time period |
V_analytical |
analytical asymptotic variance matrix for ATT(g,t)'s |
se |
numeric vector of standard errors |
crit_val |
critical value (usually a critical value for conducting uniform inference) |
inf_func |
matrix of influence function |
n |
number of unique individuals |
W |
Wald statistic for ATT(g,t) version of pre-test of parallel trends assumption |
Wpval |
p-value for Wald pre-test of ATT(g,t) version of parallel trends assumption |
cband |
logical indicating whether or not to report a confidence band |
alp |
significance level |
ptep |
|
extra_gt_returns |
list containing extra returns at the group-time level |
object of class group_time_att
Checks and converts data to satisfy criteria to be used in internal
ptetools
functions. In particular,
the function takes in a data.frame, checks if it has the right
columns to be used to calculate a group-time average treatment effect,
and sets the class of the data.frame to include gt_data_frame
gt_data_frame(data)
data |
data that will be checked to see if has right format for computing group-time average treatment effects |
gt_data_frame
object
A function that takes an original data set and keeps all data for all groups that are not yet treated by period tp, as well as for group g. In particular, this keeps more data than functions like two_by_two_subset that use a fixed base period.
A main use case for this function is the interactive fixed effects approach proposed in Callaway and Tsyawo (2023).
keep_all_pretreatment_subset(data, g, tp, ...)
data |
the full dataset |
g |
the current group |
tp |
the current time period |
... |
additional arguments |
list that contains the following elements:
gt_data
: a gt_data_frame
object that contains the
correct subset of data
n1
: the number of observations in this subset
disidx
: a vector of the correct ids for this subset
A function that takes an original data set and keeps all pre-treatment data for all groups. For group g, it also includes data for the current period.
Also, note that if tp is still a pre-treatment period for group g, then periods after tp will also be dropped for group g. This is a design choice and is especially useful for estimating placebo group-time average treatment effects in pre-treatment periods.
A main use case for this function is to compute ATT(g,t)'s using a global estimation strategy such as imputation in Gardner (2022).
keep_all_untreated_subset(data, g, tp, ...)
data |
the full dataset |
g |
the current group |
tp |
the current time period |
... |
extra arguments to get the subset correct |
list that contains the following elements:
gt_data
: a gt_data_frame
object that contains the
correct subset of data
n1
: the number of observations in this subset
disidx
: a vector of the correct ids for this subset
Function for using multiplier bootstrap to conduct inference
mboot2(inffunc, biters = 1000, alp = 0.05)
inffunc |
influence function matrix |
biters |
number of bootstrap iterations; default is 1000 |
alp |
significance level; default is 0.05 |
list with the following elements:
boot_se
: bootstrap standard errors
crit_val
: critical value for uniform confidence bands
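A minimal sketch of how a multiplier bootstrap can turn an influence function matrix into standard errors and a uniform critical value is given below. The use of Rademacher weights and an interquartile-range scale estimate are assumptions chosen for illustration, and the influence functions are simulated; mboot2 itself may differ in these details.

```r
set.seed(3)
n <- 500 # number of units
k <- 4   # number of parameters, e.g. ATT(g,t)'s
inffunc <- matrix(rnorm(n * k), nrow = n, ncol = k) # simulated influence functions

biters <- 1000
alp <- 0.05

# Each draw perturbs the rows of the influence function with random signs
boot_draws <- t(replicate(biters, {
  v <- sample(c(-1, 1), n, replace = TRUE)
  sqrt(n) * colMeans(v * inffunc)
}))

# Robust scale estimate from the interquartile range of the draws
boot_sd <- apply(boot_draws, 2, function(b) {
  (quantile(b, 0.75) - quantile(b, 0.25)) / (qnorm(0.75) - qnorm(0.25))
})
boot_se <- as.numeric(boot_sd) / sqrt(n)

# Uniform critical value: quantile of the maximum absolute t-statistic
max_t <- apply(boot_draws, 1, function(b) max(abs(b / boot_sd)))
crit_val <- as.numeric(quantile(max_t, 1 - alp))
```

Because the critical value is taken from the distribution of the maximum over all k parameters, it is typically larger than the pointwise value `qnorm(1 - alp / 2)`, which is what makes the resulting band uniform.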
A function that returns weights on (g,t)'s to deliver overall (averaged across groups and time periods) treatment effect parameters
overall_weights(attgt, balance_e = NULL, min_e = -Inf, max_e = Inf, ...)
attgt |
A group_time_att object to be aggregated |
balance_e |
Drops groups that do not have at least |
min_e |
The minimum event time computed in the event study results. This is useful when there are a huge number of pre-treatment periods. |
max_e |
The maximum event time computed in the event study results. This is useful when there are a huge number of post-treatment periods. |
... |
extra arguments |
a data.frame containing columns:
group: the group
time.period: the time period
overall_weight: the weight
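One plausible weighting scheme is sketched below: pre-treatment cells get zero weight, each group's post-treatment periods are weighted equally, and groups are weighted by their share of units. Both the scheme and the `group_size` values are assumptions for illustration; they are not taken from the package.

```r
# Toy ATT(g,t) estimates for two groups over three periods
attgt <- data.frame(
  group = c(2, 2, 2, 3, 3, 3),
  time.period = c(1, 2, 3, 1, 2, 3),
  att = c(0.0, 1.0, 1.5, 0.1, -0.1, 2.0)
)
group_size <- c("2" = 60, "3" = 40) # hypothetical number of units per group

post <- attgt$time.period >= attgt$group
n_post <- ave(as.numeric(post), attgt$group, FUN = sum) # post periods per group
pg <- unname(group_size[as.character(attgt$group)]) / sum(group_size)

# Group share split evenly across that group's post-treatment periods
attgt$overall_weight <- ifelse(post, pg / n_post, 0)
overall_att <- sum(attgt$overall_weight * attgt$att)
```

With these numbers the weights are 0.3, 0.3, and 0.4 on the three post-treatment cells, so the overall effect is 0.3 * 1.0 + 0.3 * 1.5 + 0.4 * 2.0 = 1.55.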
Computes empirical bootstrap pointwise standard errors
panel_empirical_bootstrap( attgt.list, ptep, setup_pte_fun, subset_fun, attgt_fun, extra_gt_returns, ... )
attgt.list |
list of attgt results from |
ptep |
|
setup_pte_fun |
This is a function that should take in This function also provides a good place for error handling related to the types of data that can be handled. The |
subset_fun |
This is a function that should take in |
attgt_fun |
This is a function that should work in the case where there is a single group and the "right" number of time periods to recover an estimate of the ATT. For example, in the context of difference in differences, it would need to work for a single group, find the appropriate comparison group (untreated units), find the right time periods (pre- and post-treatment), and then recover an estimate of ATT for that group. It will be called over and over separately by groups and by time periods to compute ATT(g,t)'s. The function needs to work in a very specific way. It should take in the
arguments: If |
extra_gt_returns |
A place to return anything extra from particular group-time average treatment effect calculations. For DID, this might be something like propensity score estimates, regressions of untreated potential outcomes on covariates. For ife, this could be something like the first step regression 2sls estimates. This argument is also potentially useful for debugging. |
... |
extra arguments that can be passed to create the correct subsets
of the data (depending on |
pte_emp_boot
object
Process ATT(g,t) results when influence function is available
process_att_gt(att_gt_results, ptep)
att_gt_results |
ATT(g,t)'s |
ptep |
|
group_time_att
object
After computing results for each group and time period, process_dose_gt combines/averages them into overall effects and/or dose-specific effects. This is generic code that can be used with different approaches to estimating causal effects across timing groups and periods in a previous step.
process_dose_gt(gt_results, ptep, ...)
gt_results |
list of group-time specific results |
ptep |
|
... |
extra arguments |
a dose_obj
object
Tools for estimating treatment effects with panel data.
Main function for computing panel treatment effects
pte( yname, gname, tname, idname, data, setup_pte_fun, subset_fun, attgt_fun, cband = TRUE, alp = 0.05, boot_type = "multiplier", weightsname = NULL, gt_type = "att", ret_quantile = NULL, global_fun = FALSE, time_period_fun = FALSE, group_fun = FALSE, process_dtt_gt_fun = process_dtt_gt, process_dose_gt_fun = process_dose_gt, biters = 100, cl = 1, call = NULL, ... )
yname |
Name of outcome in |
gname |
Name of group in |
tname |
Name of time period in |
idname |
Name of id in |
data |
balanced panel data |
setup_pte_fun |
This is a function that should take in This function also provides a good place for error handling related to the types of data that can be handled. The |
subset_fun |
This is a function that should take in |
attgt_fun |
This is a function that should work in the case where there is a single group and the "right" number of time periods to recover an estimate of the ATT. For example, in the context of difference in differences, it would need to work for a single group, find the appropriate comparison group (untreated units), find the right time periods (pre- and post-treatment), and then recover an estimate of ATT for that group. It will be called over and over separately by groups and by time periods to compute ATT(g,t)'s. The function needs to work in a very specific way. It should take in the
arguments: If |
cband |
whether or not to report a uniform (instead of pointwise) confidence band (default is TRUE) |
alp |
significance level; default is 0.05 |
boot_type |
should be one of "multiplier" (the default) or "empirical".
The multiplier bootstrap is generally much faster, but |
weightsname |
The name of the column that contains sampling weights. The default is NULL, in which case no sampling weights are used. |
gt_type |
which type of group-time effects are computed.
The default is "att". Different estimation strategies can implement
their own choices for |
ret_quantile |
For functions that compute quantile treatment effects,
this is a specific quantile at which to report results, e.g.,
|
global_fun |
Logical indicating whether or not untreated potential outcomes can be estimated in one shot, i.e., for all groups and time periods. Main use case would be for one-shot imputation estimators. Not supported yet. |
time_period_fun |
Logical indicating whether or not untreated potential outcomes can be estimated for all groups in the same time period. Not supported yet. |
group_fun |
Logical indicating whether or not untreated potential outcomes can be estimated for all time periods for a single group. Not supported yet. These functions aim at reducing or eliminating running the same code multiple times. |
process_dtt_gt_fun |
An optional function to customize results when
the gt-specific function returns the distribution of treated and untreated
potential outcomes. The default is |
process_dose_gt_fun |
An optional function to customize results when the gt-specific
function returns treatment effects that depend on dose (i.e., amount of the
treatment). The default is |
biters |
number of bootstrap iterations; default is 100 |
cl |
number of clusters to be used when bootstrapping; default is 1 |
call |
keeps track of through the |
... |
extra arguments that can be passed to create the correct subsets
of the data (depending on |
pte_results
object
Maintainer: Brantly Callaway [email protected]
Useful links:
Report bugs at https://github.com/bcallaway11/ptetools/issues
# example using minimum wage data
# and difference-in-differences identification strategy
library(ptetools)
library(did) # provides the mpdta data
data(mpdta)
did_res <- pte(
  yname = "lemp",
  gname = "first.treat",
  tname = "year",
  idname = "countyreal",
  data = mpdta,
  setup_pte_fun = setup_pte,
  subset_fun = two_by_two_subset,
  attgt_fun = did_attgt,
  xformula = ~lpop # passed through ... to did_attgt
)
summary(did_res)
ggpte(did_res)
This is a slight edit of the aggte function from the did package. Currently, it only provides aggregations for "overall" treatment effects and event studies. It also provides the weights directly, which are currently used for constructing aggregations based on distributions. The other difference is that pte_aggte provides inference results where the only randomness comes from the outcomes (not from the group assignment or from the covariates).
pte_aggte( attgt, type = "overall", balance_e = NULL, min_e = -Inf, max_e = Inf, ... )
attgt |
A group_time_att object to be aggregated |
type |
The type of aggregation to be done. Default is "overall". |
balance_e |
Drops groups that do not have at least |
min_e |
The minimum event time computed in the event study results. This is useful when there are a huge number of pre-treatment periods. |
max_e |
The maximum event time computed in the event study results. This is useful when there are a huge number of post-treatment periods. |
... |
extra arguments |
an aggte_obj
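The event-study aggregation can be illustrated with a small base R sketch: ATT(g,t) estimates are re-indexed by event time e = t - g, restricted to the window set by min_e and max_e, and averaged within each event time using group sizes as weights. The toy estimates and `group_size` values are made up, and the weighting is an assumption; standard errors and balance_e are not handled here.

```r
# Toy ATT(g,t) estimates for two groups over three periods
attgt <- data.frame(
  group = c(2, 2, 2, 3, 3, 3),
  time.period = c(1, 2, 3, 1, 2, 3),
  att = c(0.0, 1.0, 1.5, 0.1, -0.1, 2.0)
)
group_size <- c("2" = 60, "3" = 40) # hypothetical number of units per group

# Event time: periods since treatment started for each group
attgt$e <- attgt$time.period - attgt$group
attgt$w <- unname(group_size[as.character(attgt$group)])

# Keep only event times inside the requested window
min_e <- -1
max_e <- 1
keep <- attgt$e >= min_e & attgt$e <= max_e

# Weighted average of ATT(g,t) within each event time
es <- sapply(
  split(attgt[keep, ], attgt$e[keep]),
  function(d) weighted.mean(d$att, d$w)
)
print(es)
```

Here event time 0 averages the on-impact effects of both groups, (60 * 1.0 + 40 * 2.0) / 100 = 1.4, while event time 1 is only observed for group 2.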
pte_attgt takes a "local" data.frame and computes an estimate of a group-time average treatment effect and a corresponding influence function. This function generalizes a number of existing methods and underlies the pte_default function.
The code relies on gt_data having certain variables defined. In particular, there should be an id column (individual identifier), G (group identifier), period (time period), name (equal to "pre" for pre-treatment periods and equal to "post" for post-treatment periods), and Y (outcome). In our case, we call two_by_two_subset, which sets up the data to have this format before the call to pte_attgt.
pte_attgt( gt_data, xformula, d_outcome = FALSE, d_covs_formula = ~-1, lagged_outcome_cov = FALSE, est_method = "dr", ... )
gt_data |
data that is "local" to a particular group-time average treatment effect |
xformula |
one-sided formula for covariates used in the propensity score and outcome regression models |
d_outcome |
Whether or not to take the first difference of the outcome. The default is FALSE. To use difference-in-differences, set this to be TRUE. |
d_covs_formula |
A formula for time varying covariates to enter the first estimation step models. The default is not to include any, and, hence, to only include pre-treatment covariates. |
lagged_outcome_cov |
Whether to include the lagged outcome as a covariate. Default is FALSE. |
est_method |
Which type of estimation method to use. Default is "dr" for doubly robust. The other option is "reg" for regression adjustment. |
... |
extra function arguments; not used here |
attgt_if
This is a generic/example wrapper for a call to the pte
function.
This function provides access to difference-in-differences and unconfoundedness based identification/estimation strategies given (i) panel data and (ii) staggered treatment adoption
pte_default( yname, gname, tname, idname, data, xformula = ~1, d_outcome = FALSE, d_covs_formula = ~-1, lagged_outcome_cov = FALSE, est_method = "dr", anticipation = 0, base_period = "varying", control_group = "notyettreated", weightsname = NULL, cband = TRUE, alp = 0.05, boot_type = "multiplier", biters = 100, cl = 1 )
yname |
Name of outcome in |
gname |
Name of group in |
tname |
Name of time period in |
idname |
Name of id in |
data |
balanced panel data |
xformula |
one-sided formula for covariates used in the propensity score and outcome regression models |
d_outcome |
Whether or not to take the first difference of the outcome. The default is FALSE. To use difference-in-differences, set this to be TRUE. |
d_covs_formula |
A formula for time varying covariates to enter the first estimation step models. The default is not to include any, and, hence, to only include pre-treatment covariates. |
lagged_outcome_cov |
Whether to include the lagged outcome as a covariate. Default is FALSE. |
est_method |
Which type of estimation method to use. Default is "dr" for doubly robust. The other option is "reg" for regression adjustment. |
anticipation |
how many periods before the treatment actually takes place that it can have an effect on outcomes |
base_period |
The type of base period to use. This only affects the numeric value of results in pre-treatment periods. Results in post-treatment periods are not affected by this choice. The default is "varying", where the base period will "back up" to the immediately preceding period in pre-treatment periods. The other option is "universal" where the base period is fixed in pre-treatment periods to be the period right before the treatment starts. "Universal" is commonly used in difference-in-differences applications, but can be unnatural for other identification strategies. |
control_group |
Which group is used as the comparison group. The default choice is "notyettreated", but different estimation strategies can implement their own choices for the control group |
weightsname |
The name of the column that contains sampling weights. The default is NULL, in which case no sampling weights are used. |
cband |
whether or not to report a uniform (instead of pointwise) confidence band (default is TRUE) |
alp |
significance level; default is 0.05 |
boot_type |
should be one of "multiplier" (the default) or "empirical".
The multiplier bootstrap is generally much faster, but |
biters |
number of bootstrap iterations; default is 100 |
cl |
number of clusters to be used when bootstrapping; default is 1 |
pte_results
object
# example using minimum wage data
# and a lagged outcome unconfoundedness strategy
library(ptetools)
library(did) # provides the mpdta data
data(mpdta)
lou_res <- pte_default(
  yname = "lemp",
  gname = "first.treat",
  tname = "year",
  idname = "countyreal",
  data = mpdta,
  xformula = ~lpop,
  d_outcome = FALSE,
  d_covs_formula = ~lpop,
  lagged_outcome_cov = TRUE
)
summary(lou_res)
ggpte(lou_res)
Class for holding results with a continuous treatment
pte_dose_results(att_gt, dose, att_d = NULL, acrt_d = NULL, ptep)
att_gt |
attgt results |
dose |
vector of doses |
att_d |
ATT(d) for each value of |
acrt_d |
ACRT(d) for each value of |
ptep |
a |
a pte_dose_results
object
Class for holding ptetools
empirical bootstrap results
pte_emp_boot( attgt_results, overall_results, group_results, dyn_results, overall_weights = NULL, dyn_weights = NULL, group_weights = NULL, extra_gt_returns = NULL )
attgt_results |
|
overall_results |
|
group_results |
|
dyn_results |
|
overall_weights |
vector containing weights on underlying ATT(g,t) for overall treatment effect parameter |
dyn_weights |
list containing weights on underlying ATT(g,t)
for each value of |
group_weights |
list containing weights on underlying ATT(g,t) that deliver averaged group-specific treatment effects |
extra_gt_returns |
A place to return anything extra from particular group-time average treatment effect calculations. For DID, this might be something like propensity score estimates, regressions of untreated potential outcomes on covariates. For ife, this could be something like the first step regression 2sls estimates. This argument is also potentially useful for debugging. |
a pte_emp_boot
object
Class that contains pte parameters
pte_params( yname, gname, tname, idname, data, glist, tlist, cband, alp, boot_type, anticipation = NULL, base_period = NULL, weightsname = NULL, control_group = "notyettreated", gt_type = "att", ret_quantile = 0.5, global_fun = FALSE, time_period_fun = FALSE, group_fun = FALSE, biters, cl, call = NULL )
yname |
Name of outcome in |
gname |
Name of group in |
tname |
Name of time period in |
idname |
Name of id in |
data |
balanced panel data |
glist |
list of groups to create group-time average treatment effects for |
tlist |
list of time periods to create group-time average treatment effects for |
cband |
whether or not to report a uniform (instead of pointwise) confidence band (default is TRUE) |
alp |
significance level; default is 0.05 |
boot_type |
which type of bootstrap to use |
anticipation |
how many periods before the treatment actually takes place that it can have an effect on outcomes |
base_period |
The type of base period to use. This only affects the numeric value of results in pre-treatment periods. Results in post-treatment periods are not affected by this choice. The default is "varying", where the base period will "back up" to the immediately preceding period in pre-treatment periods. The other option is "universal" where the base period is fixed in pre-treatment periods to be the period right before the treatment starts. "Universal" is commonly used in difference-in-differences applications, but can be unnatural for other identification strategies. |
weightsname |
The name of the column that contains sampling weights. The default is NULL, in which case no sampling weights are used. |
control_group |
Which group is used as the comparison group. The default choice is "notyettreated", but different estimation strategies can implement their own choices for the control group |
gt_type |
which type of group-time effects are computed.
The default is "att". Different estimation strategies can implement
their own choices for |
ret_quantile |
For functions that compute quantile treatment effects,
this is a specific quantile at which to report results, e.g.,
|
global_fun |
Logical indicating whether or not untreated potential outcomes can be estimated in one shot, i.e., for all groups and time periods. Main use case would be for one-shot imputation estimators. Not supported yet. |
time_period_fun |
Logical indicating whether or not untreated potential outcomes can be estimated for all groups in the same time period. Not supported yet. |
group_fun |
Logical indicating whether or not untreated potential outcomes can be estimated for all time periods for a single group. Not supported yet. These functions aim at reducing or eliminating running the same code multiple times. |
biters |
number of bootstrap iterations; default is 100 |
cl |
number of clusters to be used when bootstrapping; default is 1 |
call |
keeps track of through the |
pte_params object
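The difference between the "varying" and "universal" base periods described above can be illustrated with a small sketch. The helper `pick_base_period` is hypothetical (it is not a ptetools function) and assumes that time periods are consecutive integers and that `g` is the first treated period.

```r
# Hypothetical helper (not part of ptetools): choose the base period for
# group g in time period tp, assuming consecutive integer time periods.
pick_base_period <- function(g, tp, anticipation = 0, base_period = "varying") {
  base_period <- match.arg(base_period, c("varying", "universal"))
  # the period right before treatment starts (allowing for anticipation)
  main_base <- g - anticipation - 1
  if (base_period == "varying" && tp < g - anticipation) {
    # pre-treatment period: "back up" to the immediately preceding period
    return(tp - 1)
  }
  # post-treatment periods, and all periods under "universal"
  main_base
}

# A group first treated in period 5, evaluated in periods 2 through 7
varying_base   <- sapply(2:7, pick_base_period, g = 5, base_period = "varying")
universal_base <- sapply(2:7, pick_base_period, g = 5, base_period = "universal")
print(rbind(tp = 2:7, varying = varying_base, universal = universal_base))
```

In this sketch the two choices agree in post-treatment periods (5 through 7) and differ only in pre-treatment periods, which matches the statement that post-treatment results are unaffected by `base_period`.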
Class for holding overall results with a staggered treatment, including an overall ATT and an event study
pte_results(att_gt, overall_att, event_study, ptep)
att_gt |
attgt results |
overall_att |
overall_att results |
event_study |
event_study results |
ptep |
|
a pte_results object
Aggregate group-time distribution of the treatment effect into overall, group, and dynamic effects.
qott_pte_aggregations(attgt.list, ptep, extra_gt_returns)
attgt.list |
list of attgt results from |
ptep |
|
extra_gt_returns |
A place to return anything extra from particular group-time average treatment effect calculations. For DID, this might be propensity score estimates or regressions of untreated potential outcomes on covariates. For ife, this could be the first-step 2SLS regression estimates. This argument is also potentially useful for debugging. |
pte_emp_boot object
Aggregate group-time distributions into qtt versions of overall, group, and dynamic effects.
qtt_pte_aggregations(attgt.list, ptep, extra_gt_returns)
attgt.list |
list of attgt results from |
ptep |
|
extra_gt_returns |
A place to return anything extra from particular group-time average treatment effect calculations. For DID, this might be propensity score estimates or regressions of untreated potential outcomes on covariates. For ife, this could be the first-step 2SLS regression estimates. This argument is also potentially useful for debugging. |
pte_emp_boot object
This function sets up the data to be used in the ptetools package.
The setup_pte function builds on setup_pte_basic and attempts to provide a general-purpose function (with error handling) that arranges the data so that it can be processed by subset_fun and attgt_fun in the next steps.
setup_pte( yname, gname, tname, idname, data, required_pre_periods = 1, anticipation = 0, base_period = "varying", cband = TRUE, alp = 0.05, boot_type = "multiplier", weightsname = NULL, gt_type = "att", ret_quantile = 0.5, biters = 100, cl = 1, call = NULL, ... )
yname |
Name of outcome in |
gname |
Name of group in |
tname |
Name of time period in |
idname |
Name of id in |
data |
balanced panel data |
required_pre_periods |
The number of required pre-treatment periods to implement the estimation strategy. Default is 1. |
anticipation |
The number of periods before the treatment actually takes place in which it can already affect outcomes |
base_period |
The type of base period to use. This only affects the numeric value of results in pre-treatment periods. Results in post-treatment periods are not affected by this choice. The default is "varying", where the base period will "back up" to the immediately preceding period in pre-treatment periods. The other option is "universal" where the base period is fixed in pre-treatment periods to be the period right before the treatment starts. "Universal" is commonly used in difference-in-differences applications, but can be unnatural for other identification strategies. |
cband |
whether or not to report a uniform (instead of pointwise) confidence band (default is TRUE) |
alp |
significance level; default is 0.05 |
boot_type |
which type of bootstrap to use |
weightsname |
The name of the column that contains sampling weights. The default is NULL, in which case no sampling weights are used. |
gt_type |
which type of group-time effects are computed.
The default is "att". Different estimation strategies can implement
their own choices for |
ret_quantile |
For functions that compute quantile treatment effects,
this is a specific quantile at which to report results, e.g.,
|
biters |
number of bootstrap iterations; default is 100 |
cl |
number of clusters to be used when bootstrapping; default is 1 |
call |
keeps track of through the |
... |
additional arguments |
pte_params object
This is a lightweight example function showing how to set up the data to be used in the ptetools package.
setup_pte_basic takes in information about the structure of the data and returns a pte_params object. The key piece of information computed by this function is the list of groups and the list of time periods for which ATT(g,t) should be computed. In particular, this function omits the never-treated group but includes all other groups, and it drops the first time period. This setup is geared towards the 2x2 case, i.e., where ATT could be identified with two periods, a treated and an untreated group, and the first period being pre-treatment for both groups. This is the relevant case for DID, but it is relevant for other cases as well. However, if, for example, more pre-treatment periods were needed, then this function should be replaced by something else.
For code that is written to be easy for other researchers to use, this is a good place to do some error handling and to check that the data is in the correct format.
setup_pte_basic( yname, gname, tname, idname, data, cband = TRUE, alp = 0.05, boot_type = "multiplier", gt_type = "att", ret_quantile = 0.5, biters = 100, cl = 1, call = NULL, ... )
yname |
Name of outcome in |
gname |
Name of group in |
tname |
Name of time period in |
idname |
Name of id in |
data |
balanced panel data |
cband |
whether or not to report a uniform (instead of pointwise) confidence band (default is TRUE) |
alp |
significance level; default is 0.05 |
boot_type |
which type of bootstrap to use |
gt_type |
which type of group-time effects are computed.
The default is "att". Different estimation strategies can implement
their own choices for |
ret_quantile |
For functions that compute quantile treatment effects,
this is a specific quantile at which to report results, e.g.,
|
biters |
number of bootstrap iterations; default is 100 |
cl |
number of clusters to be used when bootstrapping; default is 1 |
call |
keeps track of through the |
... |
additional arguments |
pte_params object
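The rule described for setup_pte_basic (omit the never-treated group, keep all other groups, drop the first time period) can be sketched in base R. This is only an illustration of the description, not the package source: the helper `sketch_glist_tlist`, the column names `G` and `period`, and the coding of never-treated units as `G == 0` are all assumptions.

```r
# Toy balanced panel: 4 units observed in periods 1 to 4.
# Assumption: G holds the first treated period and G == 0 marks
# never-treated units.
toy <- data.frame(
  id     = rep(1:4, each = 4),
  period = rep(1:4, times = 4),
  G      = rep(c(0, 2, 3, 3), each = 4)
)

# Hypothetical helper mimicking the description above (not ptetools code).
sketch_glist_tlist <- function(data, gname = "G", tname = "period") {
  groups  <- sort(unique(data[[gname]]))
  periods <- sort(unique(data[[tname]]))
  list(
    glist = groups[groups != 0],  # omit the never-treated group
    tlist = periods[-1]           # drop the first time period
  )
}

gt_lists <- sketch_glist_tlist(toy)
print(gt_lists)
```

Dropping the first period reflects the 2x2 logic: every ATT(g,t) needs at least one earlier period to serve as the pre-treatment comparison.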
A function for computing a 2x2 subset of the original data. The subset contains the treated group and the comparison group, each observed in the current (post-treatment) period and in the pre-treatment period immediately before the treated group became treated.
two_by_two_subset( data, g, tp, control_group = "notyettreated", anticipation = 0, base_period = "varying", ... )
data |
the full dataset |
g |
the current group |
tp |
the current time period |
control_group |
whether to use "notyettreated" (default) or "nevertreated" |
anticipation |
the number of periods of anticipation (i.e., number of periods before the treatment happens where the treatment can "already" affect the outcome) |
base_period |
The type of base period to use. This only affects the numeric value of results in pre-treatment periods. Results in post-treatment periods are not affected by this choice. The default is "varying", where the base period will "back up" to the immediately preceding period in pre-treatment periods. The other option is "universal" where the base period is fixed in pre-treatment periods to be the period right before the treatment starts. "Universal" is commonly used in difference-in-differences applications, but can be unnatural for other identification strategies. |
... |
extra arguments to get the subset correct |
A list that contains the following elements:
gt_data: a gt_data_frame object that contains the correct subset of data
n1: the number of observations in this subset
disidx: a vector of the correct ids for this subset
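To make the 2x2 subsetting concrete, here is a minimal base R sketch under stated assumptions. The function `sketch_two_by_two` is hypothetical and is not the package implementation: it assumes columns named `id`, `period`, and `G` (with `G == 0` for never-treated units), returns a plain data frame rather than a gt_data_frame, counts `n1` as the number of distinct units, and returns `disidx` as a logical flag per unit, which is only one possible reading of the description above.

```r
# Toy balanced panel: 5 units, periods 1 to 4 (column names are assumptions).
toy <- data.frame(
  id     = rep(1:5, each = 4),
  period = rep(1:4, times = 5),
  G      = rep(c(0, 2, 3, 3, 4), each = 4)
)

# Hypothetical sketch of a 2x2 subset for group g and time period tp.
sketch_two_by_two <- function(data, g, tp, control_group = "notyettreated",
                              anticipation = 0, base_period = "varying") {
  # base period: right before treatment, or tp - 1 in pre-treatment
  # periods when the base period is "varying"
  main_base <- g - anticipation - 1
  base <- if (base_period == "varying" && tp < g - anticipation) tp - 1 else main_base

  # units: the treated group plus the chosen comparison group
  keep_unit <- if (control_group == "notyettreated") {
    data$G == g | data$G > tp | data$G == 0
  } else {
    data$G == g | data$G == 0  # "nevertreated"
  }
  # periods: the current period and the base period only
  keep_time <- data$period %in% c(tp, base)

  sub <- data[keep_unit & keep_time, ]
  sub$name <- ifelse(sub$period == tp, "post", "pre")  # pre/post label
  sub$D <- as.numeric(sub$G == g)                      # local treatment status

  list(
    gt_data = sub,
    n1 = length(unique(sub$id)),
    disidx = unique(data$id) %in% unique(sub$id)
  )
}

res <- sketch_two_by_two(toy, g = 3, tp = 3)
print(res$gt_data)
```

With `control_group = "notyettreated"`, unit 2 (already treated in period 2) is excluded from the comparison group for `g = 3`, `tp = 3`, while unit 5 (treated in period 4) is still included.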