5) Could the Federal Reserve Prevent the Great Depression?¶
E-mail: econometrics.methods@gmail.com
Last updated: 10-12-2020
Concerning the Great Depression, neoclassical economists believed that the decline in economic activity “caused” the bank failures. Keynes (1936) believed the opposite: the bank insolvencies led to the business failures.
Richardson & Troost (2009) noticed that during the Great Depression, the state of Mississippi was divided into two districts controlled by different branches of the Federal Reserve (Fed): St. Louis and Atlanta.
Unlike the St. Louis Fed, which made it more onerous to borrow money, the Atlanta Fed adopted a Keynesian policy of discount lending and emergency liquidity for illiquid banks.
Let’s open the data from Ziebarth (2013). Each row is a firm from the Census of Manufactures (CoM) for 1929, 1931, 1933, and 1935.
# Load data from Ziebarth (2013)
import numpy as np
import pandas as pd
path = "https://github.com/causal-methods/Data/raw/master/"
data = pd.read_stata(path + "MS_data_all_years_regs.dta")
data.head()
[Output: the first 5 rows of the data set (5 rows × 538 columns). Columns include county, average_num_wage_earners, average wages, censusyear, num_products, rrtsap, and delta_indic, among others.]
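Before restricting the sample, a quick look at how the firm-year observations are distributed across census years can be useful. This is a minimal sketch, not part of the original analysis; it only uses the `censusyear` column loaded above.

```python
# Number of firm-year observations in each census year
data['censusyear'].value_counts().sort_index()
```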
First, we must check how similar the two districts, St. Louis and Atlanta, were in 1929. All variables are reported in logarithms. The mean revenue of the firms in the St. Louis district was 10.88, whereas the mean revenue of the firms in the Atlanta district was 10.78. St. Louis and Atlanta also had similar means of wage earners (4.54 vs. 4.69) and hours per worker (4.07 vs. 4.00).
# Round to 2 decimal places
pd.set_option('precision', 2)
# Restrict the sample to the year 1929
df1929 = data[data.censusyear.isin([1929])]
vars = ['log_total_output_value', 'log_wage_earners_total',
        'log_hours_per_wage_earner']
df1929.loc[:, vars].groupby(df1929["st_louis_fed"]).agg([np.size, np.mean])
| st_louis_fed | log_total_output_value (size) | log_total_output_value (mean) | log_wage_earners_total (size) | log_wage_earners_total (mean) | log_hours_per_wage_earner (size) | log_hours_per_wage_earner (mean) |
|---|---|---|---|---|---|---|
| 0.0 | 424.0 | 10.78 | 424.0 | 4.69 | 424.0 | 4.00 |
| 1.0 | 367.0 | 10.88 | 367.0 | 4.54 | 367.0 | 4.07 |
Additionally, St. Louis and Atlanta had a similar mean price (1.72 vs. 1.55) and mean quantity (8.63 vs. 8.83) when the sample is restricted to firms with one product. Therefore, the Atlanta district is a reasonable control group for the St. Louis district.
# Restrict sample to firms with 1 product
df1929_1 = df1929[df1929.num_products.isin([1])]
per_unit = ['log_output_price_1', 'log_output_quantity_1']
df1929_1.loc[:, per_unit].groupby(df1929_1["st_louis_fed"]).agg([np.size, np.mean])
| st_louis_fed | log_output_price_1 (size) | log_output_price_1 (mean) | log_output_quantity_1 (size) | log_output_quantity_1 (mean) |
|---|---|---|---|---|
| 0.0 | 221.0 | 1.55 | 221.0 | 8.83 |
| 1.0 | 225.0 | 1.72 | 225.0 | 8.63 |
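For a more formal balance check, the 1929 means can be compared with two-sample t-tests. The following is a sketch, not part of the original analysis; it assumes `scipy` is installed and reuses the 1929 sample and variable names defined above.

```python
# Welch two-sample t-tests: St. Louis (treatment) vs. Atlanta (control) in 1929
from scipy import stats

balance_vars = ['log_total_output_value', 'log_wage_earners_total',
                'log_hours_per_wage_earner']
for var in balance_vars:
    st_louis = df1929.loc[df1929['st_louis_fed'] == 1, var].dropna()
    atlanta = df1929.loc[df1929['st_louis_fed'] == 0, var].dropna()
    t, p = stats.ttest_ind(st_louis, atlanta, equal_var=False)
    print(f"{var}: t = {t:.2f}, p-value = {p:.2f}")
```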
We want to see whether the restrictive credit policy of the St. Louis Fed decreased the revenue of the firms, or, in other words, whether the Atlanta Fed saved firms from bankruptcy.
For this purpose, we have to explore the time dimension: the comparison of firm revenue before the bank failures (1929) and after (1931).
Let’s restrict the sample to the years 1929 and 1931. Then, let’s drop the missing values.
# Restrict the sample to the years: 1929 and 1931
df = data[data.censusyear.isin([1929, 1931])]
vars = ['firmid', 'censusyear', 'log_total_output_value',
'st_louis_fed', 'industrycode', 'year_1931']
# Drop missing values
df = df.dropna(subset=vars)
Now we can declare a panel data structure, that is, set the unit of analysis and the time dimension. Note that the variables “firmid” and “censusyear” became indices in the table. The order matters: the first variable must be the unit of analysis and the second must be the time unit. See in the table that the firm with id = 12 is observed in two years, 1929 and 1931.
Note that the panel data structure is declared after cleaning the data set. If, for example, the missing values were dropped after the panel declaration, the regression commands would probably return errors.
df = df.set_index(['firmid', 'censusyear'])
df.head()
[Output: the first 5 rows of the panel (5 rows × 536 columns), now indexed by (firmid, censusyear); firm 12 is observed in both 1929 and 1931.]
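As a quick check of the panel structure (a minimal sketch using only the index just declared), we can count how many years each firm is observed. Given the 938 entities and 1,227 observations reported in the regressions below, roughly 289 firms should appear in both years.

```python
# Distribution of observations per firm: one year only vs. both 1929 and 1931
obs_per_firm = df.groupby(level='firmid').size()
obs_per_firm.value_counts()
```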
Let’s explain the advantages of panel data over cross-sectional data. The latter is a snapshot of a single point or period in time, whereas in panel data the same unit of analysis is observed over time.
Let \(Y_{it}\) be the outcome variable of unit \(i\) at time \(t\). The dummy variable \(d2_t\) is 1 in the second period and 0 in the first period. Note that the explanatory variable \(X_{it}\) varies over units \(i\) and time \(t\), but the unobserved factor \(\alpha_i\) does not vary over time. The unobserved factor is an unavailable variable that might be correlated with the variable of interest, generating bias in the results:

\[Y_{it} = \beta_0 + \delta_0 d2_t + \beta_1 X_{it} + \alpha_i + u_{it}\]

The advantage of exploring the time variation is that the unobserved factor \(\alpha_i\) can be eliminated by the First-Difference (FD) method.
In the second period (\(t=2\)), the time dummy \(d2=1\):

\[Y_{i2} = \beta_0 + \delta_0 + \beta_1 X_{i2} + \alpha_i + u_{i2}\]

In the first period (\(t=1\)), the time dummy \(d2=0\):

\[Y_{i1} = \beta_0 + \beta_1 X_{i1} + \alpha_i + u_{i1}\]

Then:

\[Y_{i2} - Y_{i1} = \delta_0 + \beta_1 (X_{i2} - X_{i1}) + (u_{i2} - u_{i1})\]
Therefore, if the same units are observed over time (panel data), there is no need to worry about any factor that can be considered constant over the period analyzed. We can assume that company culture and institutional practices do not vary much over a short period of time. These factors are likely to explain differences in revenue among firms, but they will not bias the result if the assumption above is correct.
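To make the First-Difference idea concrete, here is a minimal sketch (assuming the 1929/1931 panel `df` defined above) that computes the within-firm change in log revenue; any time-invariant \(\alpha_i\) cancels out of this difference.

```python
# First-Difference (FD) sketch: keep firms observed in both years and
# compute the within-firm change in log revenue (alpha_i cancels out)
both_years = df.groupby(level='firmid').filter(lambda g: len(g) == 2)
delta_y = (both_years.sort_index()['log_total_output_value']
           .groupby(level='firmid')
           .diff()
           .dropna())
delta_y.describe()
```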
Let’s install the library that can run the panel data regressions.
!pip install linearmodels
Let’s use the Difference-in-Differences (DID) method to estimate the impact of the St. Louis Fed policy on firm revenue. In addition to exploring the time difference, the treatment-control difference must be used to estimate the causal impact of the policy.
Let \(Y\) be the outcome variable ‘log_total_output_value’, \(d2\) the time dummy variable ‘year_1931’, \(dT\) the treatment dummy variable ‘st_louis_fed’, and \(d2 \cdot dT\) the interaction term between the previous two dummies:

\[Y = \beta_0 + \delta_0 d2 + \beta_1 dT + \delta_1 (d2 \cdot dT) + u\]
The DID estimator is given by \(\delta_1\), not by \(\beta_1\) or \(\delta_0\). First, we take the difference between “Treatment (St. Louis)” and “Control (Atlanta)”, and then we take the difference between “After (1931)” and “Before (1929)”.
The order doesn’t matter. If we first take the difference between “After (1931)” and “Before (1929)”, and then the difference between “Treatment (St. Louis)” and “Control (Atlanta)”, the result will be the same \(\delta_1\).
Let’s show formally that we must take the difference twice to obtain the DID estimator \(\delta_1\):
If \(d2=0\) and \(dT=0\), then \(Y_{0,0}=\beta_0\).
If \(d2=1\) and \(dT=0\), then \(Y_{1,0}=\beta_0+\delta_0\).
For the control group, the difference “After - Before” is:

\[Y_{1,0} - Y_{0,0} = \delta_0\]
Let’s apply the same reasoning to the treatment group:
If \(d2=0\) and \(dT=1\), then \(Y_{0,1}=\beta_0 + \beta_1\).
If \(d2=1\) and \(dT=1\), then \(Y_{1,1}=\beta_0+\delta_0+ \beta_1+\delta_1\).
For the treatment group, the difference “After - Before” is:

\[Y_{1,1} - Y_{0,1} = \delta_0 + \delta_1\]

Then, if we take the difference “Treatment - Control”, we get:

\[(Y_{1,1} - Y_{0,1}) - (Y_{1,0} - Y_{0,0}) = \delta_1\]
Let’s manually calculate \(\hat{\delta}_1\) from the numbers in the graphic “Firm’s Revenue during the Great Depression”. Note that in the Difference-in-Differences (DID) method, a counterfactual is constructed based on the control group (Atlanta): it is just a parallel shift of the Atlanta line. The counterfactual is the hypothetical outcome of the treatment group (St. Louis) if the St. Louis Fed had followed the same policy as the Atlanta Fed.
The restrictive credit policy of the St. Louis Fed decreased the revenue of the firms by about 20%. A simple comparison of means at the end of 1931 gives only about -10%. Therefore, without the counterfactual reasoning, the negative impact of the St. Louis Fed policy would be largely underestimated.
# Mean Revenue for the Graphic
table = pd.crosstab(df['year_1931'], df['st_louis_fed'],
values=df['log_total_output_value'], aggfunc='mean')
# Build Graphic
import plotly.graph_objects as go
fig = go.Figure()
# x axis
year = [1929, 1931]
# Atlanta Line
fig.add_trace(go.Scatter(x=year, y=table[0],
name='Atlanta (Control)'))
# St. Louis Line
fig.add_trace(go.Scatter(x=year, y=table[1],
name='St. Louis (Treatment)'))
# Counterfactual
end_point = (table[1][0] - table[0][0]) + table[0][1]
counter = [table[1][0], end_point]
fig.add_trace(go.Scatter(x=year, y= counter,
name='Counterfactual',
line=dict(dash='dot') ))
# Difference-in-Differences (DID) estimation
fig.add_trace(go.Scatter(x=[1931, 1931],
                         y=[table[1][1], end_point],
                         name=r'$\delta_1=0.2$',
                         line=dict(dash='dashdot') ))
# Labels
fig.update_layout(title="Firm's Revenue during the Great Depression",
xaxis_type='category',
xaxis_title='Year',
yaxis_title='Log(Revenue)')
fig.show()
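As a check, \(\hat{\delta}_1\) can also be computed directly from the `table` of group means built above; this simple difference of differences should match the regression coefficient reported below (about -0.20).

```python
# DID by hand: (St. Louis after - before) - (Atlanta after - before)
did_by_hand = (table[1][1] - table[1][0]) - (table[0][1] - table[0][0])
did_by_hand
```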
The result of Difference-in-Differences (DID) implemented via regression is:
from linearmodels import PanelOLS
Y = df['log_total_output_value']
df['const'] = 1
df['louis_1931'] = df['st_louis_fed']*df['year_1931']
## Difference-in-Differences (DID) specification
dd = ['const', 'st_louis_fed', 'year_1931', 'louis_1931']
dif_in_dif = PanelOLS(Y, df[dd]).fit(cov_type='clustered',
cluster_entity=True)
print(dif_in_dif)
PanelOLS Estimation Summary
====================================================================================
Dep. Variable: log_total_output_value R-squared: 0.0257
Estimator: PanelOLS R-squared (Between): 0.0145
No. Observations: 1227 R-squared (Within): 0.2381
Date: Fri, Nov 06 2020 R-squared (Overall): 0.0257
Time: 21:12:12 Log-likelihood -2135.5
Cov. Estimator: Clustered
F-statistic: 10.761
Entities: 938 P-value 0.0000
Avg Obs: 1.3081 Distribution: F(3,1223)
Min Obs: 1.0000
Max Obs: 2.0000 F-statistic (robust): 18.780
P-value 0.0000
Time periods: 2 Distribution: F(3,1223)
Avg Obs: 613.50
Min Obs: 575.00
Max Obs: 652.00
Parameter Estimates
================================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
--------------------------------------------------------------------------------
const 10.781 0.0723 149.13 0.0000 10.639 10.923
st_louis_fed 0.0945 0.1043 0.9057 0.3653 -0.1102 0.2991
year_1931 -0.3521 0.0853 -4.1285 0.0000 -0.5194 -0.1848
louis_1931 -0.1994 0.1237 -1.6112 0.1074 -0.4422 0.0434
================================================================================
The St. Louis Fed policy decreased firm revenue by about 18% (\(1-e^{-0.1994}\)). However, the p-value is 0.1074, so the result is not statistically significant at the 10% level.
from math import exp
1 - exp(dif_in_dif.params.louis_1931 )
0.18076309464004925
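The same transformation applies to the bounds of the 95% confidence interval printed above (a quick arithmetic sketch using those printed values): the effect ranges from roughly a 36% decrease to a 4% increase in revenue.

```python
# Percentage-effect interpretation of the 95% CI bounds of louis_1931
lower, upper = -0.4422, 0.0434   # bounds taken from the regression table above
print(1 - exp(lower), 1 - exp(upper))
```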
Somebody might argue that heterogeneity across firms is a confounding factor: one or two big firms might bias the results.
This issue can be addressed by using the Fixed Effects (FE), or Within, estimator. The technique is similar to First-Difference (FD), but uses a different data transformation: a time-demeaning process eliminates the unobserved factor \(\alpha_i\).
Let’s average the variables for each \(i\) over time \(t\):

\[\bar{Y}_i = \beta_0 + \delta_0 \bar{d2} + \beta_1 \bar{X}_i + \alpha_i + \bar{u}_i\]

Then, we take the difference and the unobserved factor \(\alpha_i\) vanishes:

\[Y_{it} - \bar{Y}_i = \delta_0 (d2_t - \bar{d2}) + \beta_1 (X_{it} - \bar{X}_i) + (u_{it} - \bar{u}_i)\]

We can write the equation above in a more compact way:

\[\ddot{Y}_{it} = \delta_0 \ddot{d2}_t + \beta_1 \ddot{X}_{it} + \ddot{u}_{it}\]
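As a minimal sketch (not part of the original analysis), the time-demeaned outcome can be computed by hand in pandas, assuming the panel `df` defined above:

```python
# Within (time-demeaning) transformation by hand:
# subtract each firm's time average from every observation of that firm
y_mean = df.groupby(level='firmid')['log_total_output_value'].transform('mean')
y_demeaned = df['log_total_output_value'] - y_mean
y_demeaned.head()
```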
Since we declared the firm as the unit of analysis in this panel data set, PanelOLS implements the Firm Fixed Effects (FE) automatically with the option “entity_effects=True”.
We added Firm Fixed Effects (FE) to the Difference-in-Differences (DID) specification and the result did not change much. The intuition is that the Difference-in-Differences (DID) technique had already mitigated the endogeneity problems.
The St. Louis Fed policy decreased firm revenue by about 17% (\(1-e^{-0.1862}\)). The result is statistically significant at the 10% level.
firmFE = PanelOLS(Y, df[dd], entity_effects=True)
print(firmFE.fit(cov_type='clustered', cluster_entity=True))
PanelOLS Estimation Summary
====================================================================================
Dep. Variable: log_total_output_value R-squared: 0.2649
Estimator: PanelOLS R-squared (Between): -0.4554
No. Observations: 1227 R-squared (Within): 0.2649
Date: Fri, Nov 06 2020 R-squared (Overall): -0.4266
Time: 21:12:12 Log-likelihood -202.61
Cov. Estimator: Clustered
F-statistic: 34.361
Entities: 938 P-value 0.0000
Avg Obs: 1.3081 Distribution: F(3,286)
Min Obs: 1.0000
Max Obs: 2.0000 F-statistic (robust): 31.245
P-value 0.0000
Time periods: 2 Distribution: F(3,286)
Avg Obs: 613.50
Min Obs: 575.00
Max Obs: 652.00
Parameter Estimates
================================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
--------------------------------------------------------------------------------
const 9.9656 0.4511 22.091 0.0000 9.0777 10.854
st_louis_fed 1.9842 1.0365 1.9144 0.0566 -0.0559 4.0243
year_1931 -0.3666 0.0657 -5.5843 0.0000 -0.4959 -0.2374
louis_1931 -0.1862 0.0982 -1.8965 0.0589 -0.3794 0.0071
================================================================================
F-test for Poolability: 6.8220
P-value: 0.0000
Distribution: F(937,286)
Included effects: Entity
Fixed Effects (FE) can also be implemented manually by adding dummy variables, and there are different types of Fixed Effects. Let’s add Industry Fixed Effects to the Difference-in-Differences (DID) specification to rule out the possibility that the results are driven by industry-specific shocks.
The St. Louis Fed policy decreased firm revenue by about 14.2% (\(1-e^{-0.1533}\)). The result is statistically significant at the 10% level.
Why not add Firm and Industry Fixed Effects at the same time? In principle it would be desirable, but the computer will not return a result because of the multicollinearity problem. We have only two observations (2 years) per firm; if we add one dummy variable for each firm, it is almost like running a regression with more variables than observations.
In his paper, Ziebarth (2013) presents results with both Firm and Industry Fixed Effects. How is this possible? Ziebarth (2013) used the Stata software, which automatically drops some variables in the presence of multicollinearity and still outputs a result. Although this practice is widespread in top economics journals, it is not the “true” Fixed Effects estimator.
industryFE = PanelOLS(Y, df[dd + ['industrycode']])
print(industryFE.fit(cov_type='clustered', cluster_entity=True))
PanelOLS Estimation Summary
====================================================================================
Dep. Variable: log_total_output_value R-squared: 0.5498
Estimator: PanelOLS R-squared (Between): 0.5462
No. Observations: 1227 R-squared (Within): 0.3913
Date: Fri, Nov 06 2020 R-squared (Overall): 0.5498
Time: 21:12:12 Log-likelihood -1661.9
Cov. Estimator: Clustered
F-statistic: 29.971
Entities: 938 P-value 0.0000
Avg Obs: 1.3081 Distribution: F(48,1178)
Min Obs: 1.0000
Max Obs: 2.0000 F-statistic (robust): 4.791e+15
P-value 0.0000
Time periods: 2 Distribution: F(48,1178)
Avg Obs: 613.50
Min Obs: 575.00
Max Obs: 652.00
Parameter Estimates
=====================================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
-------------------------------------------------------------------------------------
const 10.391 0.2249 46.201 0.0000 9.9497 10.832
st_louis_fed -0.1740 0.0805 -2.1610 0.0309 -0.3319 -0.0160
year_1931 -0.3163 0.0660 -4.7922 0.0000 -0.4458 -0.1868
louis_1931 -0.1533 0.0916 -1.6741 0.0944 -0.3331 0.0264
industrycode.1005 -0.1966 0.3637 -0.5407 0.5888 -0.9101 0.5169
industrycode.101 0.2349 0.2403 0.9775 0.3285 -0.2366 0.7064
industrycode.1014 -0.1667 1.0812 -0.1542 0.8775 -2.2880 1.9546
industrycode.103 1.5898 0.2938 5.4104 0.0000 1.0133 2.1663
industrycode.104 1.1349 0.2818 4.0280 0.0001 0.5821 1.6877
industrycode.105 0.9660 0.4191 2.3049 0.0213 0.1437 1.7882
industrycode.107 0.9374 0.3073 3.0502 0.0023 0.3344 1.5404
industrycode.110 0.2299 0.3447 0.6671 0.5049 -0.4463 0.9062
industrycode.111 2.8271 0.4097 6.9010 0.0000 2.0233 3.6308
industrycode.112 0.2676 0.4185 0.6394 0.5227 -0.5535 1.0888
industrycode.114 2.7559 0.4623 5.9612 0.0000 1.8489 3.6630
industrycode.116 0.4498 0.4351 1.0337 0.3015 -0.4039 1.3035
industrycode.117 -0.1337 0.5436 -0.2461 0.8057 -1.2002 0.9327
industrycode.118 0.1025 0.2483 0.4127 0.6799 -0.3847 0.5896
industrycode.119 -0.3121 0.2336 -1.3360 0.1818 -0.7705 0.1462
industrycode.1204 -0.0102 0.3031 -0.0338 0.9731 -0.6048 0.5844
industrycode.123 0.2379 0.7093 0.3354 0.7374 -1.1537 1.6295
industrycode.126 1.8847 0.4828 3.9039 0.0001 0.9375 2.8319
industrycode.128 -0.0025 0.5277 -0.0047 0.9963 -1.0377 1.0328
industrycode.1303 2.6410 0.2293 11.519 0.0000 2.1912 3.0908
industrycode.136 0.2537 0.2493 1.0178 0.3090 -0.2354 0.7428
industrycode.1410 -1.4091 0.2967 -4.7496 0.0000 -1.9912 -0.8270
industrycode.1502 1.4523 0.3858 3.7644 0.0002 0.6954 2.2093
industrycode.1604 -0.5652 0.4195 -1.3473 0.1781 -1.3882 0.2579
industrycode.1624 -1.0476 1.0108 -1.0365 0.3002 -3.0307 0.9355
industrycode.1640 -0.8320 0.3613 -2.3030 0.0215 -1.5409 -0.1232
industrycode.1676 0.5830 0.2512 2.3205 0.0205 0.0901 1.0759
industrycode.1677 -0.7025 0.2346 -2.9947 0.0028 -1.1627 -0.2422
industrycode.203 -0.6435 0.2241 -2.8712 0.0042 -1.0832 -0.2038
industrycode.210B 1.7535 0.2293 7.6482 0.0000 1.3037 2.2033
industrycode.216 2.7641 0.3549 7.7888 0.0000 2.0679 3.4604
industrycode.234a 1.6871 0.5410 3.1187 0.0019 0.6258 2.7485
industrycode.265 1.2065 0.4193 2.8774 0.0041 0.3838 2.0292
industrycode.266 -0.3244 0.2736 -1.1857 0.2360 -0.8612 0.2124
industrycode.304 1.2156 0.4982 2.4400 0.0148 0.2382 2.1930
industrycode.314 0.8008 0.2411 3.3211 0.0009 0.3277 1.2740
industrycode.317 -0.1470 0.2750 -0.5347 0.5930 -0.6866 0.3925
industrycode.515 -0.4623 0.2862 -1.6153 0.1065 -1.0238 0.0992
industrycode.518 -0.5243 0.2887 -1.8157 0.0697 -1.0908 0.0422
industrycode.520 -0.2234 0.2731 -0.8179 0.4136 -0.7592 0.3124
industrycode.614 1.8029 0.3945 4.5707 0.0000 1.0290 2.5768
industrycode.622 2.9239 0.2430 12.035 0.0000 2.4473 3.4006
industrycode.633 0.0122 1.5690 0.0078 0.9938 -3.0661 3.0906
industrycode.651 1.9170 0.8690 2.2060 0.0276 0.2121 3.6220
industrycode.652 2.5495 0.2211 11.531 0.0000 2.1158 2.9833
=====================================================================================
Just as an exercise, suppose that the unobserved factor \(\alpha_i\) is ignored. This assumption is called Random Effects (RE). In this case, \(\alpha_i\) stays inside the error term \(v_{it}\) and can potentially bias the results.
In an experiment, the treatment variable is uncorrelated with the unobserved factor \(\alpha_i\). In that case, the Random Effects (RE) model has the advantage of producing lower standard errors than the Fixed Effects models.
Note that if we run a simple Random Effects (RE) regression here, we might wrongly conclude that the St. Louis Fed policy increased firm revenue by about 7%.
from linearmodels import RandomEffects
re = RandomEffects(Y, df[['const', 'st_louis_fed']])
print(re.fit(cov_type='clustered', cluster_entity=True))
RandomEffects Estimation Summary
====================================================================================
Dep. Variable: log_total_output_value R-squared: 0.3689
Estimator: RandomEffects R-squared (Between): -0.0001
No. Observations: 1227 R-squared (Within): 0.0025
Date: Fri, Nov 06 2020 R-squared (Overall): -0.0027
Time: 21:12:13 Log-likelihood -1260.9
Cov. Estimator: Clustered
F-statistic: 716.04
Entities: 938 P-value 0.0000
Avg Obs: 1.3081 Distribution: F(1,1225)
Min Obs: 1.0000
Max Obs: 2.0000 F-statistic (robust): 0.5811
P-value 0.4460
Time periods: 2 Distribution: F(1,1225)
Avg Obs: 613.50
Min Obs: 575.00
Max Obs: 652.00
Parameter Estimates
================================================================================
Parameter Std. Err. T-stat P-value Lower CI Upper CI
--------------------------------------------------------------------------------
const 10.518 0.0603 174.36 0.0000 10.399 10.636
st_louis_fed 0.0721 0.0946 0.7623 0.4460 -0.1135 0.2577
================================================================================
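As before, the coefficient can be read as an approximate percentage effect (a quick arithmetic check using the coefficient printed above, and the `exp` already imported):

```python
# Percentage interpretation of the (misleading) RE coefficient on st_louis_fed
exp(0.0721) - 1
```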
Exercises¶
1| Consider a non-experimental setting, where the control group differs from the treatment group. Justify whether or not it is reasonable to use Difference-in-Differences (DID) to estimate a causal effect. Should you modify or add something to the DID framework?
2| Suppose a study claims, based on the Difference-in-Differences (DID) method, that the Fed avoided massive business failures via the bank bailout of 2008. Suppose another study, based on Regression Discontinuity (RD), claims the opposite or denies any impact of the Fed on business failures. Which empirical strategy do you find more credible for estimating the causal impact of the Fed policy, DID or RD? Justify your answer.
3| In a panel data set where the unit of analysis can be the firm or the county, which result is more credible: the one at the firm level or the one at the county level? Justify.
4| Use the data from Ziebarth (2013) to estimate the impact of the St. Louis Fed policy on firms’ revenue. Specifically, run Difference-in-Differences (DID) with Random Effects (RE). Interpret the result. What can be inferred about the unobserved factor \(\alpha_i\)?
5| Use the data from Ziebarth (2013) to estimate the impact of the St. Louis Fed policy on firms’ revenue. Specifically, run Difference-in-Differences (DID) with Firm Fixed Effects (FE) without using the option “entity_effects=True”. Hint: you must use dummy variables for each firm.
Reference¶
Keynes, John Maynard. (1936). The General Theory of Employment, Interest and Money. New York: Harcourt, Brace and Company.
Richardson, Gary, and William Troost. (2009). Monetary Intervention Mitigated Banking Panics during the Great Depression: Quasi-Experimental Evidence from a Federal Reserve District Border, 1929-1933. Journal of Political Economy 117 (6): 1031-73.
Ziebarth, Nicolas L. (2013). Identifying the Effects of Bank Failures from a Natural Experiment in Mississippi during the Great Depression. American Economic Journal: Macroeconomics, 5 (1): 81-101.