9 Fixed Effects

library(plm) # estimating panel models
library(lmtest) # regression inference
library(stargazer) # regression outputs

9.1 Time-constant Variables

Panel data allows us to control for variables that are constant over time, even if these variables are not directly observable.

Consider a basic panel regression model: Y_{it} = \beta_1 + \beta_2 X_{it} + \beta_3 Z_i + u_{it}. \tag{9.1} Here, Z_i represents a variable that does not change over time and is specific to an individual (e.g., gender, ethnicity, parental education).

For simplicity, assume here that observations are only available for two time periods (t=1 and t=2). We can focus on the changes between these periods.

Subtracting the right-hand side of Equation 9.1 at t=1 from t=2 gives \begin{align*} &\beta_1 + \beta_2 X_{i2} + \beta_3 Z_i + u_{i2} - (\beta_1 + \beta_2 X_{i1} + \beta_3 Z_i + u_{i1}) \\ &= \beta_2 \Delta X_{i2} + \Delta u_{i2}. \end{align*} The symbol \Delta represents first-differencing, i.e. \Delta X_{i2} = X_{i2} - X_{i1} and \Delta u_{i2} = u_{i2} - u_{i1}.

By first-differencing both sides of Equation 9.1, our model becomes \Delta Y_{i2} = \beta_2 \Delta X_{i2} + \Delta u_{i2}. \tag{9.2} \beta_1 and \beta_3 Z_i do not appear in the transformed model Equation 9.2 because they are time-constant and cancel out.

In this differenced model, \beta_2 can be estimated by regressing \Delta Y_{i2} on \Delta X_{i2} without an intercept. Because Z_i has been differenced out, \beta_2 is the marginal effect of X_{it} on Y_{it} holding fixed all individual-specific time-constant characteristics such as Z_i, whether they are observed or not.

We can control for any time-constant variable without actually observing it. This is a remarkable advantage over conventional cross-sectional regression or pooled panel regression.
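The two-period argument can be illustrated with a small simulation. This is only a sketch on simulated data: the sample size, the coefficient values, and all object names (Z, X1, X2, b.pool, b.fd) are assumptions made for illustration and do not come from the text. The unobserved Z_i is correlated with the regressor, so pooled OLS that omits Z_i is biased, while the first-difference regression without intercept should recover \beta_2.

```r
# Simulated two-period panel (hypothetical data, not from the text)
set.seed(1)
n <- 500
Z  <- rnorm(n)                 # unobserved time-constant variable Z_i
X1 <- 0.8 * Z + rnorm(n)       # regressor at t = 1, correlated with Z_i
X2 <- 0.8 * Z + rnorm(n)       # regressor at t = 2, correlated with Z_i
beta2 <- 2                     # assumed true coefficient on X
beta3 <- 3                     # assumed true coefficient on Z
Y1 <- 1 + beta2 * X1 + beta3 * Z + rnorm(n)
Y2 <- 1 + beta2 * X2 + beta3 * Z + rnorm(n)

# Pooled OLS that ignores Z_i: suffers from omitted variable bias
b.pool <- coef(lm(c(Y1, Y2) ~ c(X1, X2)))[2]

# First-difference regression without intercept: Z_i cancels out
dY <- Y2 - Y1
dX <- X2 - X1
b.fd <- coef(lm(dY ~ dX - 1))[1]

c(pooled = unname(b.pool), first.diff = unname(b.fd))
```

In this design the pooled estimate should lie well above the assumed value 2, whereas the differenced estimate should be close to it, although Z is never used in either regression.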

We may combine the terms \beta_1 and \beta_3 Z_i and define the individual-specific effect \alpha_i = \beta_1 + \beta_3 Z_i. The term \alpha_i is also called the individual fixed effect. The fixed effect cancels out after taking first differences.

9.2 Fixed Effects Regression

Consider a panel dataset with dependent variable Y_{it}, a vector of k independent variables \boldsymbol X_{it}, and an individual fixed effect \alpha_i for i=1, \ldots, n and t=1, \ldots, T.

Because \alpha_i already represents any time-constant variable of individual i, we assume that all variables in \boldsymbol X_{it} are time-varying. That is, \boldsymbol X_{it} neither contains an intercept nor any time-constant variables like gender, birthplace, etc.

Fixed-effects Regression

The fixed-effects regression model equation for individual i=1, \ldots, n and time t=1, \ldots, T is Y_{it} = \alpha_i + \boldsymbol X_{it}'\boldsymbol \beta + u_{it}, \tag{9.3} where \boldsymbol \beta = (\beta_1, \ldots, \beta_k)' is the k \times 1 vector of regression coefficients and u_{it} is the error term for individual i at time t.

The fixed effects regression assumptions are:

(A1-fe) conditional mean independence: E[u_{it} | \boldsymbol X_{i1}, \ldots, \boldsymbol X_{iT}, \alpha_i] = 0.
(A2-fe) random sampling: (\alpha_i, Y_{i1}, \ldots, Y_{iT}, \boldsymbol X_{i1}', \ldots, \boldsymbol X_{iT}') are i.i.d. draws from their joint population distribution for i=1, \ldots, n.
(A3-fe) large outliers unlikely: 0 < E[Y_{it}^4] < \infty, 0 < E[u_{it}^4] < \infty.
(A4-fe) no perfect multicollinearity: \boldsymbol X has full column rank.

9.3 Differenced Estimator

The first-differencing transformation can be used to estimate Equation 9.3: \Delta Y_{it} = Y_{i,t} - Y_{i,t-1}, \quad \Delta \boldsymbol X_{it} = \boldsymbol X_{i,t} - \boldsymbol X_{i,t-1}. Taking first differences on both sides of Equation 9.3 implies \Delta Y_{it} = (\Delta \boldsymbol X_{it})' \boldsymbol \beta + \Delta u_{it}, \tag{9.4} where \Delta u_{it} = u_{i,t} - u_{i,t-1}. Notice that the fixed effect \alpha_i cancels out.

Hence, we can apply the OLS principle to Equation 9.4 to estimate \boldsymbol \beta. We regress the differenced dependent variable \Delta Y_{it} on the differenced regressors \Delta \boldsymbol X_{it} for i=1, \ldots, n and t=2, \ldots, T.

A drawback of the differenced estimator is that the transformation induces serial correlation in the error term \Delta u_{it}, even when the original errors u_{it} are serially uncorrelated, which makes the estimator inefficient in that case. Consecutive differences share a common term: \Delta u_{i,t+1} = u_{i,t+1} - u_{i,t} is correlated with \Delta u_{i,t} = u_{i,t} - u_{i,t-1} through u_{i,t}.
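The size of this correlation can be worked out in a simple special case. As an assumption that is not made in the text, suppose the errors u_{it} are uncorrelated over time with constant variance \sigma^2. Then:

```latex
\begin{align*}
Var(\Delta u_{i,t}) &= Var(u_{i,t}) + Var(u_{i,t-1}) = 2\sigma^2, \\
Cov(\Delta u_{i,t+1}, \Delta u_{i,t})
  &= Cov(u_{i,t+1} - u_{i,t}, \; u_{i,t} - u_{i,t-1})
   = -Var(u_{i,t}) = -\sigma^2, \\
Corr(\Delta u_{i,t+1}, \Delta u_{i,t})
  &= \frac{-\sigma^2}{2\sigma^2} = -\frac{1}{2}.
\end{align*}
```

Under this assumption, adjacent differenced errors have correlation -1/2, while differences that are two or more periods apart share no common term and are uncorrelated.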

data(Grunfeld, package="plm")
fit.diff = plm(inv ~ capital-1,
               index = c("firm", "year"),
               effect = "individual",
               model = "fd",
               data=Grunfeld)
fit.diff


Model Formula: inv ~ capital - 1

Coefficients:
capital 
0.23078

9.4 Within Estimator

An efficient estimator can be obtained by a different transformation. The idea is to consider the individual specific means \overline Y_{i\cdot} = \frac{1}{T} \sum_{t=1}^T Y_{it}, \quad \overline{\boldsymbol X}_{i\cdot} = \frac{1}{T} \sum_{t=1}^T \boldsymbol X_{it}, \quad \overline{u}_{i\cdot} = \frac{1}{T} \sum_{t=1}^T u_{it}. Taking the means of both sides of Equation 9.3 implies \overline{Y}_{i\cdot} = \alpha_i + \overline{\boldsymbol X}_{i\cdot}'\boldsymbol \beta + \overline{u}_{i\cdot}. \tag{9.5}

Then, subtracting Equation 9.5 from Equation 9.3 removes the fixed effect \alpha_i from the equation: Y_{it} - \overline Y_{i\cdot} = (\boldsymbol X_{it} - \overline{\boldsymbol X}_{i\cdot})'\boldsymbol \beta + (u_{it} - \overline{u}_{i\cdot}).

The deviations from the individual specific means are called within transformations: \dot Y_{it} = Y_{it} - \overline Y_{i\cdot}, \quad \dot{\boldsymbol X}_{it} = \boldsymbol X_{it} - \overline{\boldsymbol X}_{i\cdot}, \quad \dot u_{it} = u_{it} - \overline{u}_{i\cdot}. The within-transformed model equation is \dot Y_{it} = \dot{\boldsymbol X}_{it}'\boldsymbol \beta + \dot u_{it}. \tag{9.6}

Hence, to estimate \boldsymbol \beta, we regress the within-transformed dependent variable \dot Y_{it} on the within-transformed regressors \dot{\boldsymbol X}_{it} for i=1, \ldots, n and t=1, \ldots, T.

The within estimator is also called the fixed effects estimator: \widehat{\boldsymbol \beta}_{\text{fe}} = \bigg( \sum_{i=1}^n \sum_{t=1}^T \dot{\boldsymbol X}_{it} \dot{\boldsymbol X}_{it}' \bigg)^{-1} \bigg( \sum_{i=1}^n \sum_{t=1}^T \dot{\boldsymbol X}_{it} \dot Y_{it} \bigg).
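The estimator formula can be reproduced by hand in base R. The following is a minimal sketch on a simulated balanced panel; the data, the coefficient values, and the helper name demean are assumptions made for illustration. It applies the within transformation with ave() and evaluates the matrix formula directly, then compares the result with OLS on the transformed data.

```r
# Simulated balanced panel (hypothetical data, not from the text)
set.seed(2)
n  <- 50                                   # number of individuals
Tp <- 6                                    # number of periods (T in the text)
id    <- rep(1:n, each = Tp)
alpha <- rep(rnorm(n, sd = 3), each = Tp)  # individual fixed effects
X1 <- 0.5 * alpha + rnorm(n * Tp)          # regressor correlated with alpha_i
X2 <- rnorm(n * Tp)
Y  <- alpha + 1.5 * X1 - 0.5 * X2 + rnorm(n * Tp)

# Within transformation: subtract the individual-specific mean
demean <- function(v, g) v - ave(v, g)
Yd <- demean(Y, id)
Xd <- cbind(X1 = demean(X1, id), X2 = demean(X2, id))

# Fixed effects estimator: (sum Xd Xd')^{-1} (sum Xd Yd)
beta.fe <- solve(crossprod(Xd), crossprod(Xd, Yd))

# Same coefficients from OLS on the transformed data, without intercept
beta.lm <- coef(lm(Yd ~ Xd - 1))
cbind(formula = drop(beta.fe), lm = beta.lm)
```

Both columns should agree up to numerical precision, since the formula is just the OLS estimator applied to the within-transformed variables.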

fit.fe = plm(inv ~ capital,
             index = c("firm", "year"),
             effect = "individual",
             model = "within",
             data=Grunfeld)
fit.fe


Model Formula: inv ~ capital

Coefficients:
capital 
0.37075

Under (A2-fe), the collection of the within-transformed variables of individual i, (\dot Y_{i1}, \ldots, \dot Y_{iT}, \dot{\boldsymbol X}_{i1}, \ldots, \dot{\boldsymbol X}_{iT}, \dot u_{i1}, \ldots, \dot u_{iT}), forms an i.i.d. sequence for i=1, \ldots, n. The within-transformed variables satisfy (A1-pool)–(A4-pool).

Hence, we can apply the cluster-robust covariance matrix estimator of the pooled regression to the within-transformed variables: \widehat{\boldsymbol V}_{\text{fe}} = (\dot{\boldsymbol X}' \dot{\boldsymbol X})^{-1} \sum_{i=1}^n \bigg( \sum_{t=1}^T \dot{\boldsymbol X}_{it} \widehat{u}_{it} \bigg) \bigg( \sum_{t=1}^T \dot{\boldsymbol X}_{it} \widehat{u}_{it} \bigg)' (\dot{\boldsymbol X}' \dot{\boldsymbol X})^{-1}, where \widehat{u}_{it} = \dot Y_{it} - \dot{\boldsymbol X}_{it}'\widehat{\boldsymbol \beta}_{\text{fe}} now represents the residuals of the fixed effects regression, and \dot{\boldsymbol X}' \dot{\boldsymbol X} = \sum_{i=1}^n \sum_{t=1}^T \dot{\boldsymbol X}_{it} \dot{\boldsymbol X}_{it}'.
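The sandwich formula can be evaluated step by step in base R. This is a sketch on simulated data, where the data-generating process and all object names are assumptions made for illustration. It implements the formula exactly as written above, with clusters given by individuals and without any finite-sample correction, so it is not guaranteed to match every option of a packaged implementation.

```r
# Simulated balanced panel with one regressor (hypothetical data)
set.seed(3)
n  <- 40                                   # number of individuals
Tp <- 5                                    # number of periods (T in the text)
id    <- rep(1:n, each = Tp)
alpha <- rep(rnorm(n), each = Tp)          # individual fixed effects
X <- alpha + rnorm(n * Tp)
Y <- alpha + 0.5 * X + rnorm(n * Tp)

# Within transformation and fixed effects estimate
Xd <- cbind(X = X - ave(X, id))
Yd <- Y - ave(Y, id)
XtX.inv <- solve(crossprod(Xd))
beta.fe <- XtX.inv %*% crossprod(Xd, Yd)
uhat <- drop(Yd - Xd %*% beta.fe)          # fixed effects residuals

# Inner sums: one row per individual containing sum_t Xd_it * uhat_it
S <- rowsum(Xd * uhat, group = id)

# Sandwich: (Xd'Xd)^{-1} [ sum_i S_i S_i' ] (Xd'Xd)^{-1}
V.fe  <- XtX.inv %*% crossprod(S) %*% XtX.inv
se.fe <- sqrt(diag(V.fe))                  # cluster-robust standard errors
se.fe
```

The call crossprod(S) computes the sum over individuals of the outer products S_i S_i', which is the middle term of the formula.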

## cluster-robust covariance matrix
Vfe = vcovHC(fit.fe)
Vfe

            capital
capital 0.003796144
attr(,"cluster")
[1] "group"

## cluster-robust standard error
sqrt(Vfe)

           capital
capital 0.06161285
attr(,"cluster")
[1] "group"

## t-test
coeftest(fit.fe, vcov. = Vfe)


t test of coefficients:

        Estimate Std. Error t value  Pr(>|t|)    
capital 0.370750   0.061613  6.0174 9.018e-09 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

9.5 Time Fixed Effects

While individual-specific fixed effects allow us to control for variables that are constant over time but vary across individuals, we can also control for variables that are constant across individuals but vary over time. An example is a new government regulation that is introduced at a certain point in time and affects all individuals.

We denote time fixed effects by \lambda_t. The time effects only regression equation is Y_{it} = \lambda_t + \boldsymbol X_{it}' \boldsymbol \beta + u_{it}. \tag{9.7}

Here, \boldsymbol X_{it} does not contain any variable that is the same for all individuals, because these variables are captured by the time fixed effect.

To remove \lambda_t from the equation, we can subtract time specific means on both sides: Y_{it} - \overline Y_{\cdot t} = (\boldsymbol X_{it} - \overline{\boldsymbol X}_{\cdot t})' \boldsymbol \beta + (u_{it} - \overline{u}_{\cdot t}). The time specific means are \overline Y_{\cdot t} = \frac{1}{n} \sum_{i=1}^n Y_{it}, \quad \overline{\boldsymbol X}_{\cdot t} = \frac{1}{n} \sum_{i=1}^n \boldsymbol X_{it}, \quad \overline{u}_{\cdot t} = \frac{1}{n} \sum_{i=1}^n u_{it}.

Hence, we regress Y_{it} - \overline Y_{\cdot t} on \boldsymbol X_{it} - \overline{\boldsymbol X}_{\cdot t} to estimate \boldsymbol \beta in Equation 9.7.
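As a sketch of this step, the following base R code subtracts time-specific means with ave() on a simulated balanced panel and compares the result with a regression that includes a dummy for every period. The data and all names (lambda, year, b.demeaned, b.dummies) are assumptions made for illustration.

```r
# Simulated balanced panel with time effects (hypothetical data)
set.seed(4)
n  <- 30                                   # number of individuals
Tp <- 8                                    # number of periods (T in the text)
id   <- rep(1:n, each = Tp)
year <- rep(1:Tp, times = n)
lambda <- rnorm(Tp, sd = 2)[year]          # time fixed effects lambda_t
X <- 0.7 * lambda + rnorm(n * Tp)          # regressor correlated with lambda_t
Y <- lambda + 1.2 * X + rnorm(n * Tp)

# Subtract time-specific means (averages over individuals in each period)
Yt <- Y - ave(Y, year)
Xt <- X - ave(X, year)
b.demeaned <- coef(lm(Yt ~ Xt - 1))[["Xt"]]

# Equivalent regression with a full set of period dummies
b.dummies <- coef(lm(Y ~ X + factor(year)))[["X"]]

c(demeaned = b.demeaned, dummies = b.dummies)
```

The two slope estimates should coincide up to numerical precision.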

fit.timefe = plm(inv ~ capital,
             index = c("firm", "year"),
             effect = "time",
             model = "within",
             data=Grunfeld)
fit.timefe


Model Formula: inv ~ capital

Coefficients:
capital 
0.53826

9.6 Two-way Fixed Effects

We may include both individual fixed effects and time fixed effects. The two-way fixed effects regression equation is Y_{it} = \alpha_i + \lambda_t + \boldsymbol X_{it}' \boldsymbol \beta + u_{it}. \tag{9.8}

Note that \lambda_t and \alpha_i capture any variable that is the same for all individuals or is time constant. Therefore, the variables in \boldsymbol X_{it} must vary both across individuals and over time.

We can use a combination of the different transformations to remove the fixed effects.

Individual specific mean: \overline Y_{i \cdot} = \alpha_i + \overline \lambda + \overline{\boldsymbol X}_{i\cdot}'\boldsymbol \beta + \overline u_{i\cdot}, where \overline \lambda = \frac{1}{T} \sum_{t=1}^T \lambda_t.
Time specific mean: \overline Y_{\cdot t} = \overline \alpha + \lambda_t + \overline{\boldsymbol X}_{\cdot t}'\boldsymbol \beta + \overline u_{\cdot t}, where \overline \alpha = \frac{1}{n} \sum_{i=1}^n \alpha_i.
Total mean: \overline Y = \frac{1}{nT} \sum_{i=1}^n \sum_{t=1}^T Y_{it} = \overline \alpha + \overline \lambda + \overline{\boldsymbol X}'\boldsymbol \beta + \overline u, where \overline{\boldsymbol X} = \frac{1}{nT} \sum_{i=1}^n \sum_{t=1}^T \boldsymbol X_{it} and \overline u = \frac{1}{nT} \sum_{i=1}^n \sum_{t=1}^T u_{it}.

To eliminate the individual and time fixed effects in Equation 9.8, we use the two-way transformation: \begin{align*} \ddot Y_{it} &= Y_{it} - \overline Y_{i \cdot} - \overline Y_{\cdot t} + \overline Y \\ \ddot{\boldsymbol X}_{it} &= {\boldsymbol X}_{it} - \overline{\boldsymbol X}_{i \cdot} - \overline{\boldsymbol X}_{\cdot t} + \overline{\boldsymbol X} \\ \ddot u_{it} &= u_{it} - \overline u_{i \cdot} - \overline u_{\cdot t} + \overline u. \end{align*} Applying the two-way transformation on both sides of Equation 9.8 gives \ddot Y_{it} = \ddot{\boldsymbol X}_{it}'\boldsymbol \beta + \ddot u_{it}. \tag{9.9}

Hence, we estimate \boldsymbol \beta by regressing \ddot Y_{it} on \ddot{\boldsymbol X}_{it}.
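The two-way transformation can be sketched in base R as follows. The example uses a simulated balanced panel, and the data as well as the helper name twoway are assumptions made for illustration. The simple formula with individual means, time means, and the total mean relies on the panel being balanced.

```r
# Simulated balanced panel with individual and time effects (hypothetical data)
set.seed(5)
n  <- 25                                   # number of individuals
Tp <- 6                                    # number of periods (T in the text)
id   <- rep(1:n, each = Tp)
year <- rep(1:Tp, times = n)
alpha  <- rnorm(n, sd = 2)[id]             # individual fixed effects alpha_i
lambda <- rnorm(Tp, sd = 2)[year]          # time fixed effects lambda_t
X <- 0.5 * alpha + 0.5 * lambda + rnorm(n * Tp)
Y <- alpha + lambda + 0.8 * X + rnorm(n * Tp)

# Two-way transformation: v_it - mean_i - mean_t + total mean
twoway <- function(v) v - ave(v, id) - ave(v, year) + mean(v)
Ydd <- twoway(Y)
Xdd <- twoway(X)
b.twoway <- coef(lm(Ydd ~ Xdd - 1))[["Xdd"]]

# Same slope from a regression with individual and period dummies
b.lsdv <- coef(lm(Y ~ X + factor(id) + factor(year)))[["X"]]

c(twoway = b.twoway, lsdv = b.lsdv)
```

For a balanced panel the two estimates should agree up to numerical precision.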

fit.2wayfe = plm(inv ~ capital,
             index = c("firm", "year"),
             effect = "twoways",
             model = "within",
             data=Grunfeld)
fit.2wayfe


Model Formula: inv ~ capital

Coefficients:
capital 
 0.4138

As with the pooled and the individual fixed effects estimators, we can use the cluster-robust covariance matrix estimator and cluster-robust standard errors.

## cluster-robust covariance matrix
V2way = vcovHC(fit.2wayfe)
V2way

            capital
capital 0.003241852
attr(,"cluster")
[1] "group"

## cluster-robust standard error
sqrt(V2way)

           capital
capital 0.05693726
attr(,"cluster")
[1] "group"

## t-test
coeftest(fit.2wayfe, vcov. = V2way)


t test of coefficients:

        Estimate Std. Error t value  Pr(>|t|)    
capital 0.413802   0.056937  7.2677 1.268e-11 ***
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

9.7 Comparison of panel models

The fixed effects estimators are asymptotically normal under assumptions (A1-fe)–(A4-fe), and the clustered standard errors are consistent.

fit.pool1 = lm(inv~capital, data=Grunfeld)
fit.pool2 = plm(inv~capital, 
            index = c("firm", "year"),
            model = "pooling",
            data=Grunfeld)
cluster_se = list(
  sqrt(diag(vcovHC(fit.pool1))),
  sqrt(diag(vcovHC(fit.pool2))),
  sqrt(diag(vcovHC(fit.fe))),
  sqrt(diag(vcovHC(fit.timefe))),
  sqrt(diag(vcovHC(fit.2wayfe)))
)

stargazer(fit.pool1, fit.pool2, fit.fe, fit.timefe, fit.2wayfe,
  se = cluster_se,
  add.lines=list(
    c("Firm FE", "No", "No","Yes","No","Yes"),
    c("Year FE", "No", "No","No","Yes","Yes"),
    c("Clustered SE", "No", "Yes", "Yes", "Yes", "Yes")
  ),
  type="html",
  omit.stat = "f", df=FALSE,
  dep.var.labels="Gross Investment",
  covariate.labels = "Capital Stock")


                                Dependent variable: Gross Investment
                      OLS                        panel linear
                      (1)        (2)        (3)        (4)        (5)

Capital Stock         0.477***   0.477***   0.371***   0.538***   0.414***
                      (0.078)    (0.126)    (0.062)    (0.153)    (0.057)

Constant              14.236     14.236
                      (19.393)   (28.046)

Firm FE               No         No         Yes        No         Yes
Year FE               No         No         No         Yes        Yes
Clustered SE          No         Yes        Yes        Yes        Yes
Observations          200        200        200        200        200
R²                    0.439      0.439      0.660      0.429      0.599
Adjusted R²           0.436      0.436      0.642      0.365      0.530
Residual Std. Error   162.850

Note: *p<0.1; **p<0.05; ***p<0.01

9.8 Dummy variable regression

An alternative way to estimate the fixed effects model is by an OLS regression of Y_{it} on \boldsymbol X_{it} and a full set of dummy variables, one for each individual in the sample.

For the time fixed effects model, we include a full set of dummy variables for each time point in the sample, and for the two-way fixed effects model, we include individual and time dummies.

This approach is algebraically equivalent to the within and two-way transformations. The coefficients for the auxiliary dummy variables are usually not reported. The coefficients for capital are the same as in the table above:

lm(inv ~ capital + factor(firm), data=Grunfeld)


Call:
lm(formula = inv ~ capital + factor(firm), data = Grunfeld)

Coefficients:
   (Intercept)         capital   factor(firm)2   factor(firm)3   factor(firm)4  
      367.6130          0.3707        -66.4553       -413.6821       -326.4410  
 factor(firm)5   factor(firm)6   factor(firm)7   factor(firm)8   factor(firm)9  
     -486.2784       -350.8656       -436.7832       -356.4725       -436.1703  
factor(firm)10  
     -366.7313

lm(inv ~ capital + factor(year), data=Grunfeld)


Call:
lm(formula = inv ~ capital + factor(year), data = Grunfeld)

Coefficients:
     (Intercept)           capital  factor(year)1936  factor(year)1937  
         39.2068            0.5383           22.4605           27.8993  
factor(year)1938  factor(year)1939  factor(year)1940  factor(year)1941  
        -36.6889          -42.4012          -11.4293            5.3301  
factor(year)1942  factor(year)1943  factor(year)1944  factor(year)1945  
        -26.2522          -36.3995          -32.3887          -33.0571  
factor(year)1946  factor(year)1947  factor(year)1948  factor(year)1949  
         -3.6307          -57.8083          -73.1115         -106.8436  
factor(year)1950  factor(year)1951  factor(year)1952  factor(year)1953  
       -105.8753          -69.2505          -76.6097          -67.6766  
factor(year)1954  
       -112.6339

lm(inv ~ capital + factor(firm) + factor(year), data=Grunfeld)


Call:
lm(formula = inv ~ capital + factor(firm) + factor(year), data = Grunfeld)

Coefficients:
     (Intercept)           capital     factor(firm)2     factor(firm)3  
        354.9166            0.4138          -51.2329         -402.9933  
   factor(firm)4     factor(firm)5     factor(firm)6     factor(firm)7  
       -303.7443         -479.3182         -327.4387         -422.4257  
   factor(firm)8     factor(firm)9    factor(firm)10  factor(year)1936  
       -332.2429         -421.0790         -339.0705           23.9405  
factor(year)1937  factor(year)1938  factor(year)1939  factor(year)1940  
         32.9483          -27.0935          -30.7979            0.5826  
factor(year)1941  factor(year)1942  factor(year)1943  factor(year)1944  
         19.5836           -8.6393          -17.5675          -13.7593  
factor(year)1945  factor(year)1946  factor(year)1947  factor(year)1948  
        -13.5253           17.6985          -27.2407          -37.4300  
factor(year)1949  factor(year)1950  factor(year)1951  factor(year)1952  
        -66.7623          -63.2855          -23.9098          -23.9138  
factor(year)1953  factor(year)1954  
         -5.1266          -40.1051

9.9 Panel R-squared

We can decompose the total variation into within group variation and between group variation: Y_{it}- \overline Y = \underbrace{Y_{it} - \overline{Y}_{i \cdot}}_{\text{within group}} + \underbrace{\overline{Y}_{i \cdot} - \overline Y}_{\text{between group}}
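Squaring and summing this identity shows why the total sum of squares splits cleanly into a within part and a between part. The following short derivation assumes a balanced panel with T observations per individual:

```latex
\begin{align*}
\sum_{i=1}^n \sum_{t=1}^T (Y_{it} - \overline Y)^2
&= \sum_{i=1}^n \sum_{t=1}^T (Y_{it} - \overline Y_{i\cdot})^2
 + T \sum_{i=1}^n (\overline Y_{i\cdot} - \overline Y)^2
 + 2 \sum_{i=1}^n (\overline Y_{i\cdot} - \overline Y)
     \sum_{t=1}^T (Y_{it} - \overline Y_{i\cdot}) \\
&= \underbrace{\sum_{i=1}^n \sum_{t=1}^T (Y_{it} - \overline Y_{i\cdot})^2}_{\text{within group}}
 + \underbrace{T \sum_{i=1}^n (\overline Y_{i\cdot} - \overline Y)^2}_{\text{between group}}.
\end{align*}
```

The cross term vanishes because the deviations from the individual mean sum to zero for every individual, \sum_{t=1}^T (Y_{it} - \overline Y_{i\cdot}) = 0.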

Two different R squared versions:

Overall R-squared: R^2_{ov} = 1 - \frac{\sum_{i=1}^n \sum_{t=1}^T \widehat u_{it}^2}{\sum_{i=1}^n \sum_{t=1}^T (Y_{it} - \overline Y)^2} Interpretation: Proportion of total sample variation in Y_{it} explained by the model (the usual R-squared).
Within R-squared: R^2_{wit} = 1 - \frac{\sum_{i=1}^n \sum_{t=1}^T \widehat u_{it}^2}{\sum_{i=1}^n \sum_{t=1}^T (Y_{it} - \overline{Y}_{i \cdot})^2} Interpretation: Proportion of the sample variation in Y_{it} within the individual units that is explained by the model.
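Both versions can be computed directly from the residuals of a dummy variable regression. The following base R sketch uses simulated data, and the data-generating process and all object names are assumptions made for illustration. The numerator is the same residual sum of squares in both cases; only the denominator differs.

```r
# Simulated balanced panel (hypothetical data)
set.seed(6)
n  <- 30                                   # number of individuals
Tp <- 7                                    # number of periods (T in the text)
id    <- rep(1:n, each = Tp)
alpha <- rnorm(n, sd = 3)[id]              # individual fixed effects
X <- 0.5 * alpha + rnorm(n * Tp)
Y <- alpha + X + rnorm(n * Tp)

# Dummy variable regression and its residuals
fit.lsdv <- lm(Y ~ X + factor(id))
uhat <- resid(fit.lsdv)

# Overall R-squared: total variation around the grand mean
r2.overall <- 1 - sum(uhat^2) / sum((Y - mean(Y))^2)

# Within R-squared: variation around the individual-specific means
r2.within <- 1 - sum(uhat^2) / sum((Y - ave(Y, id))^2)

c(overall = r2.overall, within = r2.within)
```

Because the within variation is at most as large as the total variation, the within R-squared cannot exceed the overall R-squared of the same fit.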

For an individual-specific fixed effects regression, consider the two equivalent fixed effects estimators from above:

## plm object
fit.fe = plm(inv ~ capital,
             index = c("firm", "year"),
             effect = "individual",
             model = "within",
             data=Grunfeld)
## lm object
fit.fe.lsdv = lm(inv ~ capital + factor(firm), data=Grunfeld)

For the plm object, summary(object)$r.squared returns the within R-squared together with its adjusted version; for the lm object, it returns the overall R-squared:

## within R-squared
summary(fit.fe)$r.squared

      rsq    adjrsq 
0.6597327 0.6417291

## overall R-squared
summary(fit.fe.lsdv)$r.squared

[1] 0.9184098

It is not a big surprise that the fixed effects model explains a lot of the total variation in Y_{it}. The equivalent LSDV model assigns each individual its own dummy variable and therefore, by construction, explains a lot of variation between individuals.

The within R-squared is often more insightful because it reflects the model's ability to explain the variation within entities over time.

9.10 R-codes

methods-sec09.R