example_data_functions.example_data

example_data_functions.example_data(obs_y_d_x_iate=1000, obs_x_iate=1000, no_features=20, no_treatments=3, type_of_heterogeneity='WagerAthey', seed=12345, descr_stats=True)

Create example data to be used with mcf estimation and optimal policy.

Parameters

obs_y_d_x_iate (Integer, optional) – Number of observations for training data. The default is 1000.
obs_x_iate (Integer, optional) – Number of observations for prediction data. The default is 1000.
no_features (Integer, optional) – Number of features of different types. The default is 20.
no_treatments (Integer, optional) – Number of treatments (all non-zero treatments have the same IATEs). The default is 3.
type_of_heterogeneity (String, optional) – Type of effect heterogeneity, broadly (but not exactly) following the specifications used in the simulations of Lechner and Mareckova (Comprehensive Causal Machine Learning, arXiv, 2024). Possible types are 'linear', 'nonlinear', 'quadratic', 'WagerAthey'. The default is 'WagerAthey'.
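seed (Integer, optional) – Seed of numpy random number generator object. The default is 12345.
descr_stats (Boolean, optional) – Whether to show descriptive statistics of the generated data. The default is True.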
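
The note on no_treatments (all non-zero treatments share the same IATEs) can be illustrated with a toy generator. The sketch below is purely hypothetical: the function toy_potential_outcomes, its single feature x, and the effect function 1 + x are assumptions made for illustration and are not the data-generating process used by example_data. It only shows how potential outcomes, ITEs, and IATEs relate to each other.

```python
import random


def toy_potential_outcomes(n_obs=5, no_treatments=3, seed=12345):
    """Hypothetical toy generator; NOT the mcf data-generating process."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n_obs):
        x = rng.uniform(-1.0, 1.0)       # a single feature
        iate = 1.0 + x                   # assumed linear heterogeneity in x
        y0 = x + rng.gauss(0.0, 1.0)     # potential outcome under treatment 0
        # Every non-zero treatment has the same IATE but its own noise draw.
        y_pot = [y0] + [x + iate + rng.gauss(0.0, 1.0)
                        for _ in range(no_treatments - 1)]
        # Individual effects relative to treatment 0 (IATE plus noise).
        ite = [y - y0 for y in y_pot[1:]]
        d = rng.randrange(no_treatments)  # randomly assigned treatment
        rows.append({"x": x, "d": d, "y": y_pot[d],
                     "y_pot": y_pot, "iate": iate, "ite": ite})
    return rows


rows = toy_potential_outcomes()
for row in rows:
    print(row["d"], round(row["y"], 3), round(row["iate"], 3))
```

In this toy setup the observed outcome is the potential outcome of the assigned treatment, the ITEs differ across treatments only through noise, and the IATE is the same for every non-zero treatment.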

Returns

train_df (DataFrame) – Contains outcome, treatment, features, potential outcomes, IATEs, ITEs, and a zero column (included for convenience when the data are used with OptimalPolicy).
pred_df (DataFrame) – Contains features, potential outcomes, IATEs, ITEs.
name_dict (Dictionary) – Contains the names of the variable groups.
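
A minimal usage sketch, assuming the module can be imported from the mcf package as mcf.example_data_functions (the import path is an assumption; only the keyword arguments and the three return values come from the signature and the Returns list above). The sample sizes are reduced from the defaults to keep the run short, and the example does nothing beyond printing a hint if mcf is not installed.

```python
# Keyword arguments taken from the documented signature; the specific
# values (smaller samples, 'linear' heterogeneity) are illustrative choices.
settings = {
    "obs_y_d_x_iate": 500,
    "obs_x_iate": 200,
    "no_features": 20,
    "no_treatments": 3,
    "type_of_heterogeneity": "linear",
    "seed": 12345,
    "descr_stats": False,
}

try:
    # Assumption: the module lives inside the mcf package.
    from mcf.example_data_functions import example_data
except ImportError:
    example_data = None
    print("mcf is not installed; install it to run this example.")

if example_data is not None:
    train_df, pred_df, name_dict = example_data(**settings)
    print(train_df.shape, pred_df.shape)  # training and prediction data
    print(list(name_dict))                # names of the variable groups
```

Keeping the seed fixed should make repeated calls return the same data, which is convenient when comparing estimation settings.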