When considering model selection criteria for nested statistical models, AIC and BIC usually comes to our mind. These two metrics are derived base on the maximized log-likelihood value with a tradeoff of number of predictors.
1a. Final Goal: Goal to create a counterfactual dataset
Ideally, we want a counterfactual data for each observation:
for i = 1, …, n:
- Outcome if treated: Yi(A=1)
- Outcome if control: Yi(A=0)
However, in reality, we only observe 1 outcome for each observation
1b. What does TMLE comes from?
P(Y, A, W) = P(Y|A, W) * P(A|W) * P(W)
If we just do regular regression Y ~ A + W, then in order to have…
A Newbie Data Scientist in NYC