Simple Difference-in-Differences
The difference-in-differences (DID) estimator is a popular method in econometrics to estimate causal effects.
The simplest regression model compares a single group before and after the intervention:
$$y_{it} = \alpha + \beta D_t + \epsilon_{it}, \quad i = 1, \dots, N, \quad t = 0, 1$$
where:
- $y_{it}$ is the outcome variable for individual $i$ at time $t$,
- $D_t$ is the treatment variable, which equals 1 (postintervention) if $t = 1$ and 0 (preintervention) if $t = 0$.
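$\beta$ can be estimated by the average of the individual before-after differences:
$$\hat{\beta} = \frac{1}{N} \sum_{i=1}^{N} (y_{i1} - y_{i0})$$
This follows from differencing the model across the two periods:
$$\begin{aligned}
y_{i1} &= \alpha + \beta + \epsilon_{i1} \\
y_{i0} &= \alpha + \epsilon_{i0} \\
y_{i1} - y_{i0} &= \beta + \epsilon_{i1} - \epsilon_{i0}
\end{aligned}$$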
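The before-after estimator can be checked on simulated data. The following is a minimal sketch, not a definitive implementation: the parameter values, the sample size, and the helper names `before_after_estimate` and `ols_slope` are assumptions made for illustration. It computes the mean of the individual differences and compares it with the slope from a simple regression of the stacked outcomes on the period dummy.

```python
import random

def before_after_estimate(y0, y1):
    """Mean of the individual differences y_i1 - y_i0."""
    return sum(b - a for a, b in zip(y0, y1)) / len(y0)

def ols_slope(x, y):
    """Slope from a simple OLS regression of y on x with an intercept."""
    x_bar = sum(x) / len(x)
    y_bar = sum(y) / len(y)
    num = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    den = sum((xi - x_bar) ** 2 for xi in x)
    return num / den

# Hypothetical parameters, chosen only for illustration.
rng = random.Random(0)
alpha, beta, n = 2.0, 1.5, 5000

y0 = [alpha + rng.gauss(0, 1) for _ in range(n)]         # t = 0, D_t = 0
y1 = [alpha + beta + rng.gauss(0, 1) for _ in range(n)]  # t = 1, D_t = 1

beta_hat = before_after_estimate(y0, y1)

# Stack both periods and regress y on the period dummy D_t.
slope = ols_slope([0] * n + [1] * n, y0 + y1)

print(beta_hat, slope)
```

With a binary regressor and the same number of observations in each period, the regression slope is the difference of the two period means, so the two numbers should agree up to floating-point error and lie close to the assumed `beta`.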
The above regression can be modified to include an untreated control group:
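$$y_{itj} = \alpha + \alpha_1 D_t + \alpha_2 D_j + \beta D_{tj} + \epsilon_{itj}, \quad i = 1, \dots, N, \quad t = 0, 1, \quad j = 0, 1$$
where:
- $y_{itj}$ is the outcome variable for individual $i$ in group $j$ at time $t$,
- $D_j = 1$ if individual $i$ is in the treatment group ($j = 1$) and 0 if individual $i$ is in the control group ($j = 0$),
- $D_{tj} = 1$ if $t = 1$ and $j = 1$, and 0 otherwise.
This regression is known as the difference-in-differences estimator because $\beta$ measures the difference between the before-after change in the treatment group and the before-after change in the control group.
For the treatment group:
The postintervention period is $t = 1$ and $j = 1$, so $D_{tj} = 1$ and $D_t = 1$.
$$y_{i11} = \alpha + \alpha_1 + \alpha_2 + \beta + \epsilon_{i11}$$
The preintervention period is $t = 0$ and $j = 1$, so $D_{tj} = 0$ and $D_t = 0$.
$$y_{i01} = \alpha + \alpha_2 + \epsilon_{i01}$$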
The difference between the postintervention period and the preintervention period is:
$$y_{i11} - y_{i01} = \alpha_1 + \beta + \epsilon_{i11} - \epsilon_{i01}$$
For the control group:
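The postintervention period is $t = 1$ and $j = 0$, so $D_{tj} = 0$ and $D_t = 1$.
$$y_{i10} = \alpha + \alpha_1 + \epsilon_{i10}$$
The preintervention period is $t = 0$ and $j = 0$, so $D_{tj} = 0$ and $D_t = 0$.
$$y_{i00} = \alpha + \epsilon_{i00}$$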
The difference between the postintervention period and the preintervention period is:
$$y_{i10} - y_{i00} = \alpha_1 + \epsilon_{i10} - \epsilon_{i00}$$
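Finally, we take the difference between the treatment-group difference and the control-group difference:
$$\begin{aligned}
(y_{i11} - y_{i01}) - (y_{i10} - y_{i00}) &= (\alpha_1 + \beta + \epsilon_{i11} - \epsilon_{i01}) - (\alpha_1 + \epsilon_{i10} - \epsilon_{i00}) \\
&= \beta + \epsilon_{i11} - \epsilon_{i01} - \epsilon_{i10} + \epsilon_{i00}
\end{aligned}$$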
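The cancellation in this double difference can be verified with plain arithmetic. The sketch below sets every error term to zero and uses hypothetical coefficient values. The group coefficient is written `alpha2` here to keep it distinct from the time coefficient `alpha1`, and the function name `outcome` is an assumption for illustration.

```python
# Hypothetical coefficient values, chosen only for illustration.
alpha, alpha1, alpha2, beta = 10.0, 2.0, 3.0, 4.0

def outcome(t, j):
    """Noiseless outcome for period t and group j (error term set to 0)."""
    return alpha + alpha1 * t + alpha2 * j + beta * t * j

treat_diff = outcome(1, 1) - outcome(0, 1)    # alpha1 + beta
control_diff = outcome(1, 0) - outcome(0, 0)  # alpha1
did = treat_diff - control_diff               # beta

print(treat_diff, control_diff, did)  # prints: 6.0 2.0 4.0
```

The before-after change in the treatment group alone (6.0) mixes the common time effect with the treatment effect. Subtracting the control-group change (2.0) removes the time effect and leaves `beta` (4.0).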
Then, assume that the error terms are independent and identically distributed (i.i.d.) with mean 0 and variance $\sigma^2$:
$$E[\epsilon_{i11} - \epsilon_{i01} - \epsilon_{i10} + \epsilon_{i00}] = 0$$
Therefore, the difference-in-differences estimator is:
$$\hat{\beta} = \frac{1}{N} \sum_{i=1}^{N} \left[ (y_{i11} - y_{i01}) - (y_{i10} - y_{i00}) \right]$$
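In applied work the treatment and control groups contain different individuals, so the estimator is usually computed from the four group-by-period averages. The following sketch simulates such data under assumed parameter values and computes the double difference of cell means. The names `cell_mean` and `did_estimate`, the coefficient values, and the group size are all hypothetical. In a regression on the period dummy, the group dummy, and their interaction, the OLS coefficient on the interaction equals this double difference of cell means.

```python
import random

def cell_mean(rows, t, j):
    """Average outcome among observations in period t and group j."""
    ys = [y for (y, tt, jj) in rows if tt == t and jj == j]
    return sum(ys) / len(ys)

def did_estimate(rows):
    """Double difference of the four group-by-period means."""
    treat_change = cell_mean(rows, 1, 1) - cell_mean(rows, 0, 1)
    control_change = cell_mean(rows, 1, 0) - cell_mean(rows, 0, 0)
    return treat_change - control_change

# Hypothetical parameters, chosen only for illustration.
rng = random.Random(1)
alpha, alpha1, alpha2, beta = 10.0, 2.0, 3.0, 4.0
n_per_group = 2000

# Each row is (outcome, period t, group j); every individual is seen twice.
rows = []
for j in (0, 1):
    for _ in range(n_per_group):
        for t in (0, 1):
            y = alpha + alpha1 * t + alpha2 * j + beta * t * j + rng.gauss(0, 1)
            rows.append((y, t, j))

beta_hat = did_estimate(rows)

# Before-after change in the treated group only, for comparison.
naive = cell_mean(rows, 1, 1) - cell_mean(rows, 0, 1)

print(beta_hat, naive)
```

Under these assumptions `beta_hat` should land near the assumed `beta`, while the treated-group before-after change `naive` should land near `alpha1 + beta`, because it does not remove the common time effect.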