{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "5)_Could_the_Federal_Reserve_Prevent_the_Great_Depression.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.6" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "mfxvKsxwpZjN" }, "source": [ "# 5) Could the Federal Reserve Prevent the Great Depression?" ] }, { "cell_type": "markdown", "metadata": { "id": "MqhZX8XQOg0L" }, "source": [ "[Vitor Kamada](https://www.linkedin.com/in/vitor-kamada-1b73a078)\n", "\n", "E-mail: econometrics.methods@gmail.com\n", "\n", "Last updated: 10-12-2020" ] }, { "cell_type": "markdown", "metadata": { "id": "Gl65KnJ3ccug" }, "source": [ "Concerning the Great Depression, neoclassical economists believe that the decline in economic activity \"caused\" the bank failures. Keynes (1936) believed the opposite, that is, that bank insolvencies led to business failures.\n", "\n", "Richardson & Troost (2009) noticed that during the Great Depression, the state of Mississippi was divided into two districts controlled by different branches of the Federal Reserve (Fed): St. Louis and Atlanta.\n", "\n", "Unlike the St. Louis Fed, which made it more onerous to borrow money, the Atlanta Fed adopted a Keynesian policy of discount lending and emergency liquidity for illiquid banks.\n", "\n", "Let's open the data from Ziebarth (2013). Each row is a firm from the Census of Manufactures (CoM) for 1929, 1931, 1933, and 1935." 
] }, { "cell_type": "code", "metadata": { "id": "3wG1YQm6B55d", "outputId": "3a2f12a9-30e1-4320-b62c-042c46df9731", "colab": { "base_uri": "https://localhost:8080/", "height": 321 } }, "source": [ "# Load data from Ziebarth (2013)\n", "import numpy as np\n", "import pandas as pd\n", "path = \"https://github.com/causal-methods/Data/raw/master/\" \n", "data = pd.read_stata(path + \"MS_data_all_years_regs.dta\")\n", "data.head()" ], "execution_count": 1, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th>county</th><th>average_num_wage_earners</th><th>...</th><th>rrtsap</th><th>delta_indic</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>0</th><td></td><td>NaN</td><td>...</td><td>NaN</td><td>0.0</td></tr>\n", "<tr><th>1</th><td>Greene</td><td>12.000000</td><td>...</td><td>100.797699</td><td>0.0</td></tr>\n", "<tr><th>2</th><td>Hinds</td><td>4.000000</td><td>...</td><td>404.526093</td><td>0.0</td></tr>\n", "<tr><th>3</th><td>Wayne</td><td>36.000000</td><td>...</td><td>104.497620</td><td>0.0</td></tr>\n", "<tr><th>4</th><td>Attala</td><td>6.333333</td><td>...</td><td>148.165237</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "<p>5 rows × 538 columns</p>
\n", "
" ], "text/plain": [ " county average_num_wage_earners ... rrtsap delta_indic\n", "0 NaN ... NaN 0.0\n", "1 Greene 12.000000 ... 100.797699 0.0\n", "2 Hinds 4.000000 ... 404.526093 0.0\n", "3 Wayne 36.000000 ... 104.497620 0.0\n", "4 Attala 6.333333 ... 148.165237 0.0\n", "\n", "[5 rows x 538 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 1 } ] }, { "cell_type": "markdown", "metadata": { "id": "M3DYDxZosZVT" }, "source": [ "First, we must check how similar the two districts, St. Louis and Atlanta, were in 1929. All variables are reported in logarithms. The mean revenue of firms in the St. Louis district was 10.88, whereas the mean revenue of firms in the Atlanta district was 10.78. The two districts also had a similar number of wage earners (4.54 vs 4.69) and similar hours per worker (4.07 vs 4.00)." ] }, { "cell_type": "code", "metadata": { "id": "BYjPuYNIvo_H", "outputId": "b2712890-c7b8-4828-a943-d1f4fbfc17a1", "colab": { "base_uri": "https://localhost:8080/", "height": 171 } }, "source": [ "# Display 2 decimal places\n", "pd.set_option('display.precision', 2)\n", "\n", "# Restrict the sample to the year: 1929\n", "df1929 = data[data.censusyear.isin([1929])]\n", "\n", "vars = ['log_total_output_value', 'log_wage_earners_total',\n", "        'log_hours_per_wage_earner']\n", "\n", "df1929.loc[:, vars].groupby(df1929[\"st_louis_fed\"]).agg([np.size, np.mean])" ], "execution_count": 14, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th colspan=2>log_total_output_value</th><th colspan=2>log_wage_earners_total</th><th colspan=2>log_hours_per_wage_earner</th></tr>\n", "<tr><th>st_louis_fed</th><th>size</th><th>mean</th><th>size</th><th>mean</th><th>size</th><th>mean</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>0.0</th><td>424.0</td><td>10.78</td><td>424.0</td><td>4.69</td><td>424.0</td><td>4.00</td></tr>\n", "<tr><th>1.0</th><td>367.0</td><td>10.88</td><td>367.0</td><td>4.54</td><td>367.0</td><td>4.07</td></tr>\n", "</tbody>\n", "</table>
\n", "
" ], "text/plain": [ " log_total_output_value ... log_hours_per_wage_earner \n", " size mean ... size mean\n", "st_louis_fed ... \n", "0.0 424.0 10.78 ... 424.0 4.00\n", "1.0 367.0 10.88 ... 367.0 4.07\n", "\n", "[2 rows x 6 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 14 } ] }, { "cell_type": "markdown", "metadata": { "id": "2eatmJe5xGzR" }, "source": [ "Additionally, St. Louis and Atlanta have a similar mean price (1.72 vs 1.55) and mean quantity (8.63 vs 8.83) when the \n", "sample is restricted to firms with 1 product. Therefore, the Atlanta district is a reasonable control group for the St. Louis district." ] }, { "cell_type": "code", "metadata": { "id": "UbDhMPN01zJ4", "outputId": "0d63e0df-515e-4dfd-9300-6037554938b9", "colab": { "base_uri": "https://localhost:8080/", "height": 171 } }, "source": [ "# Restrict sample to firms with 1 product\n", "df1929_1 = df1929[df1929.num_products.isin([1])]\n", "\n", "per_unit = ['log_output_price_1', 'log_output_quantity_1']\n", "\n", "df1929_1.loc[:, per_unit].groupby(df1929_1[\"st_louis_fed\"]).agg([np.size, np.mean])" ], "execution_count": 3, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th colspan=2>log_output_price_1</th><th colspan=2>log_output_quantity_1</th></tr>\n", "<tr><th>st_louis_fed</th><th>size</th><th>mean</th><th>size</th><th>mean</th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>0.0</th><td>221.0</td><td>1.55</td><td>221.0</td><td>8.83</td></tr>\n", "<tr><th>1.0</th><td>225.0</td><td>1.72</td><td>225.0</td><td>8.63</td></tr>\n", "</tbody>\n", "</table>
\n", "
" ], "text/plain": [ " log_output_price_1 log_output_quantity_1 \n", " size mean size mean\n", "st_louis_fed \n", "0.0 221.0 1.55 221.0 8.83\n", "1.0 225.0 1.72 225.0 8.63" ] }, "metadata": { "tags": [] }, "execution_count": 3 } ] }, { "cell_type": "markdown", "metadata": { "id": "xJb2llljyamd" }, "source": [ "We want to see whether the credit-constraining policy of the St. Louis Fed decreased firm revenue or, in other words, whether the Atlanta Fed saved firms from bankruptcy.\n", "\n", "For this purpose, we have to explore the time dimension: the comparison of firm revenue before (1929) and after (1931).\n", "\n", "Let's restrict the sample to the years 1929 and 1931. Then, let's drop the missing values." ] }, { "cell_type": "code", "metadata": { "id": "6pqhZdShF5lD" }, "source": [ "# Restrict the sample to the years: 1929 and 1931\n", "df = data[data.censusyear.isin([1929, 1931])]\n", "\n", "vars = ['firmid', 'censusyear', 'log_total_output_value',\n", "        'st_louis_fed', 'industrycode', 'year_1931']\n", "\n", "# Drop missing values \n", "df = df.dropna(subset=vars)" ], "execution_count": 4, "outputs": [] }, { "cell_type": "markdown", "metadata": { "id": "tWZkNfqtFuKB" }, "source": [ "Now, we can declare a panel data structure, that is, set the unit of analysis and the time dimension. Note that the variables \"firmid\" and \"censusyear\" became indices in the table. The order matters: the first variable must be the unit of analysis and the second variable must be the time unit. The table shows that the firm with id = 12 is observed for two years: 1929 and 1931.\n", "\n", "Note that the panel data structure was declared after cleaning the data set. For example, if the missing values are dropped after the panel data declaration, the regression commands will probably return errors." 
] }, { "cell_type": "code", "metadata": { "id": "edqfu4h6GDC2", "outputId": "3005b86e-45bc-48db-8fc9-1ecedf208b0f", "colab": { "base_uri": "https://localhost:8080/", "height": 369 } }, "source": [ "df = df.set_index(['firmid', 'censusyear'])\n", "df.head()" ], "execution_count": 5, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
<table>\n", "<thead>\n", "<tr><th></th><th></th><th>county</th><th>average_num_wage_earners</th><th>...</th><th>rrtsap</th><th>delta_indic</th></tr>\n", "<tr><th>firmid</th><th>censusyear</th><th></th><th></th><th></th><th></th><th></th></tr>\n", "</thead>\n", "<tbody>\n", "<tr><th>12</th><th>1929</th><td>Oktibbeha</td><td>1.75</td><td>...</td><td>328.31</td><td>0.0</td></tr>\n", "<tr><th></th><th>1931</th><td>Oktibbeha</td><td>1.75</td><td>...</td><td>NaN</td><td>0.0</td></tr>\n", "<tr><th>13</th><th>1929</th><td>Warren</td><td>7.25</td><td>...</td><td>670.07</td><td>1.0</td></tr>\n", "<tr><th></th><th>1931</th><td>Warren</td><td>3.50</td><td>...</td><td>NaN</td><td>1.0</td></tr>\n", "<tr><th>14</th><th>1929</th><td>Monroe</td><td>12.92</td><td>...</td><td>314.90</td><td>0.0</td></tr>\n", "</tbody>\n", "</table>\n", "<p>5 rows × 536 columns</p>
\n", "
" ], "text/plain": [ " county average_num_wage_earners ... rrtsap delta_indic\n", "firmid censusyear ... \n", "12 1929 Oktibbeha 1.75 ... 328.31 0.0\n", " 1931 Oktibbeha 1.75 ... NaN 0.0\n", "13 1929 Warren 7.25 ... 670.07 1.0\n", " 1931 Warren 3.50 ... NaN 1.0\n", "14 1929 Monroe 12.92 ... 314.90 0.0\n", "\n", "[5 rows x 536 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 5 } ] }, { "cell_type": "markdown", "metadata": { "id": "WLPk_wA7FKH6" }, "source": [ "Let's explain the advantages of panel data over cross-sectional data. The latter is a snapshot of one point or period in time, whereas in panel data the same unit of analysis is observed over time." ] }, { "cell_type": "markdown", "metadata": { "id": "shjhuMed0O7b" }, "source": [ "\n", "\n", "Let $Y_{it}$ be the outcome variable of unit $i$ at time $t$. The dummy variable $d2_{it}$ is 1 for the second period and 0 for the first period. Note that the explanatory variable $X_{it}$ varies over unit $i$ and time $t$, but the unobserved factor $\\alpha_i$ doesn't vary over time. An unobserved factor is a variable for which no data are available and that might be correlated with the variable of interest, generating bias in the results. 
\n", "\n", "$$Y_{it}=\\beta_0+\\delta_0d2_{it}+\\beta_1X_{it}+\\alpha_i+\\epsilon_{it}$$\n", "\n", "The advantage of exploring the time variation is that the unobserved factor $\\alpha_i$ can be eliminated by the First-Difference (FD) method.\n", "\n", "In the second period ($t=2$), the time dummy $d2=1$:\n", "\n", "$$Y_{i2}=\\beta_0+\\delta_0+\\beta_1X_{i2}+\\alpha_i+\\epsilon_{i2}$$\n", "\n", "In the first period ($t=1$), the time dummy $d2=0$:\n", "\n", "$$Y_{i1}=\\beta_0+\\beta_1X_{i1}+\\alpha_i+\\epsilon_{i1}$$\n", "\n", "Subtracting the first-period equation from the second, $\\beta_0$ and $\\alpha_i$ cancel:\n", "\n", "$$Y_{i2}-Y_{i1}$$\n", "\n", "$$=\\delta_0+\\beta_1(X_{i2}-X_{i1})+\\epsilon_{i2}-\\epsilon_{i1}$$\n", "\n", "$$\\Delta Y_i=\\delta_0+\\beta_1\\Delta X_i+\\Delta \\epsilon_i$$\n", "\n", "Therefore, if the same units are observed over time (panel data), there is no need to worry about any factor that can be considered constant over the period analyzed. We can assume that company culture and institutional practices don't vary much over a short period of time. These factors are likely to explain the difference in revenue among firms, but they will not bias the result if the assumption above is correct.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "FH5OhBVJKKH7" }, "source": [ "Let's install the library that can run the panel data regressions." 
] }, { "cell_type": "code", "metadata": { "id": "9KGobKkfKK66", "outputId": "1ebb65c0-a631-4564-ea37-1fa55e057710", "colab": { "base_uri": "https://localhost:8080/", "height": 349 } }, "source": [ "!pip install linearmodels" ], "execution_count": 6, "outputs": [ { "output_type": "stream", "text": [ "Collecting linearmodels\n", "\u001b[?25l Downloading https://files.pythonhosted.org/packages/33/98/9606898621df26cad70021f928ab977926a7ed6ad30a10572cc93f67a970/linearmodels-4.17-cp36-cp36m-manylinux1_x86_64.whl (1.5MB)\n", "\u001b[K |████████████████████████████████| 1.5MB 2.8MB/s \n", "\u001b[?25hCollecting mypy-extensions>=0.4\n", " Downloading https://files.pythonhosted.org/packages/5c/eb/975c7c080f3223a5cdaff09612f3a5221e4ba534f7039db34c35d95fa6a5/mypy_extensions-0.4.3-py2.py3-none-any.whl\n", "Requirement already satisfied: numpy>=1.15 in /usr/local/lib/python3.6/dist-packages (from linearmodels) (1.18.5)\n", "Requirement already satisfied: scipy>=1 in /usr/local/lib/python3.6/dist-packages (from linearmodels) (1.4.1)\n", "Collecting property-cached>=1.6.3\n", " Downloading https://files.pythonhosted.org/packages/5c/6c/94d8e520b20a2502e508e1c558f338061cf409cbee78fd6a3a5c6ae812bd/property_cached-1.6.4-py2.py3-none-any.whl\n", "Requirement already satisfied: Cython>=0.29.14 in /usr/local/lib/python3.6/dist-packages (from linearmodels) (0.29.21)\n", "Requirement already satisfied: patsy in /usr/local/lib/python3.6/dist-packages (from linearmodels) (0.5.1)\n", "Requirement already satisfied: pandas>=0.23 in /usr/local/lib/python3.6/dist-packages (from linearmodels) (1.1.2)\n", "Requirement already satisfied: statsmodels>=0.9 in /usr/local/lib/python3.6/dist-packages (from linearmodels) (0.10.2)\n", "Requirement already satisfied: six in /usr/local/lib/python3.6/dist-packages (from patsy->linearmodels) (1.15.0)\n", "Requirement already satisfied: pytz>=2017.2 in /usr/local/lib/python3.6/dist-packages (from pandas>=0.23->linearmodels) (2018.9)\n", "Requirement already 
satisfied: python-dateutil>=2.7.3 in /usr/local/lib/python3.6/dist-packages (from pandas>=0.23->linearmodels) (2.8.1)\n", "Installing collected packages: mypy-extensions, property-cached, linearmodels\n", "Successfully installed linearmodels-4.17 mypy-extensions-0.4.3 property-cached-1.6.4\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "xu1t_P8S_lI4" }, "source": [ "Let's use the Difference-in-Differences (DID) method to estimate the impact of the St. Louis Fed policy on firm revenue. In addition to exploring the time difference, the treatment-control difference must be used to estimate the causal impact of the policy.\n", "\n", "Let $Y$ be the outcome variable 'log_total_output_value', $d2$ the time dummy variable 'year_1931', $dT$ the treatment dummy variable 'st_louis_fed', and $d2 \\cdot dT$ the interaction term between the previous two dummies:\n", "\n", "$$Y = \\beta_0+\\delta_0d2+\\beta_1dT+\\delta_1 (d2\\cdot dT)+ \\epsilon$$\n", "\n", "The DID estimator is given by $\\delta_1$ and not by $\\beta_1$ or $\\delta_0$. First, we take the difference between \"Treatment (St. Louis)\" and \"Control (Atlanta)\", and then we take the difference between \"After (1931)\" and \"Before (1929)\". \n", "\n", "$$\\hat{\\delta}_1 = (\\bar{y}_{2,T}-\\bar{y}_{2,C})-(\\bar{y}_{1,T}-\\bar{y}_{1,C})$$\n", "\n", "The order doesn't matter. If we first take the difference between \"After (1931)\" and \"Before (1929)\", and then the difference between \"Treatment (St. Louis)\" and \"Control (Atlanta)\", the result will be the same $\\delta_1$. 
\n", "\n", "$$\\hat{\\delta}_1 = (\\bar{y}_{2,T}-\\bar{y}_{1,T})-(\\bar{y}_{2,C}-\\bar{y}_{1,C})$$\n", "\n", "Let's show formally that we must take the difference twice to obtain the DID estimator $\\delta_1$: \n", "\n", "If $d2=0$ and $dT=0$, then $Y_{0,0}=\\beta_0$.\n", "\n", "If $d2=1$ and $dT=0$, then $Y_{1,0}=\\beta_0+\\delta_0$.\n", "\n", "For the control group, the difference \"After - Before\" is:\n", "\n", "$$Y_{1,0}-Y_{0,0}=\\delta_0$$\n", "\n", "Let's apply the same reasoning to the treatment group:\n", "\n", "If $d2=0$ and $dT=1$, then $Y_{0,1}=\\beta_0 + \\beta_1$.\n", "\n", "If $d2=1$ and $dT=1$, then $Y_{1,1}=\\beta_0+\\delta_0+ \\beta_1+\\delta_1$.\n", "\n", "For the treatment group, the difference \"After - Before\" is:\n", "\n", "$$Y_{1,1}-Y_{0,1}=\\delta_0+\\delta_1$$\n", "\n", "Then, if we take the difference \"Treatment - Control\", we get:\n", "\n", "$$(Y_{1,1}-Y_{0,1})-(Y_{1,0}-Y_{0,0})=(\\delta_0+\\delta_1)-\\delta_0=\\delta_1$$\n", "\n", "![Difference-in-Differences diagram](https://github.com/causal-methods/Data/raw/master/figures/DiffDiff.PNG)" ] }, { "cell_type": "markdown", "metadata": { "id": "RHJwBqaMJzjU" }, "source": [ "Let's manually calculate $\\hat{\\delta}_1$ from the numbers in the graphic \"Firm's Revenue during the Great Depression\". Note that in the Difference-in-Differences (DID) method, a counterfactual is constructed based on the control group (Atlanta). It is just a parallel shift of the Atlanta line. The counterfactual is the hypothetical outcome for the treatment group (St. Louis) if the St. Louis Fed had followed the same policy as the Atlanta Fed.\n", "\n", "$$\\hat{\\delta}_1 = (\\bar{y}_{2,T}-\\bar{y}_{2,C})-(\\bar{y}_{1,T}-\\bar{y}_{1,C})$$\n", "\n", "$$=(10.32-10.42)-(10.87-10.78)$$\n", "\n", "$$=(-0.10)-(0.09)$$\n", "\n", "$$\\approx -0.2$$\n", "\n", "The restrictive credit policy of the St. Louis Fed decreased firm revenue by about 20%. A simple comparison of means at the end of 1931 gives only -10%. Therefore, without using the counterfactual reasoning, the negative impact of the St. 
Louis Fed policy would be largely underestimated." ] }, { "cell_type": "code", "metadata": { "id": "BiE32oI0Jzzc", "outputId": "b8018128-be89-4f96-c881-129eb34c0b26", "colab": { "base_uri": "https://localhost:8080/", "height": 542 } }, "source": [ "# Mean Revenue for the Graphic\n", "table = pd.crosstab(df['year_1931'], df['st_louis_fed'], \n", " values=df['log_total_output_value'], aggfunc='mean')\n", "\n", "# Build Graphic\n", "import plotly.graph_objects as go\n", "fig = go.Figure()\n", "\n", "# x axis\n", "year = [1929, 1931]\n", "\n", "# Atlanta Line\n", "fig.add_trace(go.Scatter(x=year, y=table[0],\n", " name='Atlanta (Control)'))\n", "# St. Louis Line\n", "fig.add_trace(go.Scatter(x=year, y=table[1],\n", " name='St. Louis (Treatment)'))\n", "# Counterfactual\n", "end_point = (table[1][0] - table[0][0]) + table[0][1]\n", "counter = [table[1][0], end_point]\n", "fig.add_trace(go.Scatter(x=year, y= counter,\n", " name='Counterfactual',\n", " line=dict(dash='dot') ))\n", "\n", "# Difference-in-Differences (DID) estimation\n", "fig.add_trace(go.Scatter(x=[1931, 1931],\n", " y=[table[1][1], end_point],\n", " name=r'$\\delta_1=0.2$',\n", " line=dict(dash='dashdot') ))\n", "\n", "# Labels\n", "fig.update_layout(title=\"Firm's Revenue during the Great Depression\",\n", " xaxis_type='category',\n", " xaxis_title='Year',\n", " yaxis_title='Log(Revenue)')\n", "\n", "fig.show()" ], "execution_count": 7, "outputs": [ { "output_type": "display_data", "data": { "text/html": [ "\n", "\n", "\n", "
<p>[Interactive Plotly figure: Firm's Revenue during the Great Depression (x-axis: Year, y-axis: Log(Revenue))]</p>
\n", "\n", "" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "tqMphhD2sKNp" }, "source": [ "The result of Difference-in-Differences (DID) implemented via regression is:\n", "\n", "$$\\hat{Y} = 10.8-0.35d2+0.095dT-0.20(d2\\cdot dT)$$" ] }, { "cell_type": "code", "metadata": { "id": "oCCHm_TR_QP7", "outputId": "309b9478-3e26-4a68-e365-55c57e76abf3", "colab": { "base_uri": "https://localhost:8080/", "height": 676 } }, "source": [ "from linearmodels import PanelOLS\n", "\n", "Y = df['log_total_output_value']\n", "df['const'] = 1\n", "df['louis_1931'] = df['st_louis_fed']*df['year_1931']\n", "\n", "## Difference-in-Differences (DID) specification\n", "dd = ['const', 'st_louis_fed', 'year_1931', 'louis_1931']\n", "\n", "dif_in_dif = PanelOLS(Y, df[dd]).fit(cov_type='clustered',\n", " cluster_entity=True)\n", "print(dif_in_dif)" ], "execution_count": 8, "outputs": [ { "output_type": "stream", "text": [ "/usr/local/lib/python3.6/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning:\n", "\n", "pandas.util.testing is deprecated. Use the functions in the public API at pandas.testing instead.\n", "\n", "/usr/local/lib/python3.6/dist-packages/linearmodels/panel/data.py:98: FutureWarning:\n", "\n", "is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead\n", "\n" ], "name": "stderr" }, { "output_type": "stream", "text": [ " PanelOLS Estimation Summary \n", "====================================================================================\n", "Dep. Variable: log_total_output_value R-squared: 0.0257\n", "Estimator: PanelOLS R-squared (Between): 0.0145\n", "No. Observations: 1227 R-squared (Within): 0.2381\n", "Date: Tue, Oct 13 2020 R-squared (Overall): 0.0257\n", "Time: 00:31:36 Log-likelihood -2135.5\n", "Cov. 
Estimator: Clustered \n", " F-statistic: 10.761\n", "Entities: 938 P-value 0.0000\n", "Avg Obs: 1.3081 Distribution: F(3,1223)\n", "Min Obs: 1.0000 \n", "Max Obs: 2.0000 F-statistic (robust): 18.780\n", " P-value 0.0000\n", "Time periods: 2 Distribution: F(3,1223)\n", "Avg Obs: 613.50 \n", "Min Obs: 575.00 \n", "Max Obs: 652.00 \n", " \n", " Parameter Estimates \n", "================================================================================\n", " Parameter Std. Err. T-stat P-value Lower CI Upper CI\n", "--------------------------------------------------------------------------------\n", "const 10.781 0.0723 149.13 0.0000 10.639 10.923\n", "st_louis_fed 0.0945 0.1043 0.9057 0.3653 -0.1102 0.2991\n", "year_1931 -0.3521 0.0853 -4.1285 0.0000 -0.5194 -0.1848\n", "louis_1931 -0.1994 0.1237 -1.6112 0.1074 -0.4422 0.0434\n", "================================================================================\n", "\n", "\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "tSkhdfsp1Ngw" }, "source": [ "The St. Louis Fed policy decreased firm revenue by 18% ($1-e^{-0.1994}$). However, the p-value is 0.1074, so the result is not statistically significant at the 10% level. " ] }, { "cell_type": "code", "metadata": { "id": "CznHf-EKtGKs", "outputId": "530f1aba-dbc6-4925-bcb3-1e00677cbcb4", "colab": { "base_uri": "https://localhost:8080/", "height": 35 } }, "source": [ "from math import exp\n", "# Proportional drop in revenue implied by the DID coefficient\n", "1 - exp(dif_in_dif.params.louis_1931)" ], "execution_count": 9, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "0.1807630946400488" ] }, "metadata": { "tags": [] }, "execution_count": 9 } ] }, { "cell_type": "markdown", "metadata": { "id": "SlqIf5ru3Hpd" }, "source": [ "Somebody might argue that differences among firms are a confounding factor. A few big firms might bias the results.\n", "\n", "This issue can be addressed by using the Fixed Effects (FE), or Within, estimator. 
The technique is similar to First-Difference (FD), but it uses a different data transformation. Time-demeaning is used to eliminate the unobserved factor $\\alpha_i$. \n", "\n", "$$Y_{it}=\\beta X_{it}+\\alpha_i+\\epsilon_{it}$$\n", "\n", "Let's average the variables for each $i$ over time $t$:\n", "\n", "$$\\bar{Y}_{i}=\\beta \\bar{X}_{i}+\\alpha_i+\\bar{\\epsilon}_{i}$$\n", "\n", "Then, we subtract the second equation from the first, and the unobserved factor $\\alpha_i$ vanishes:\n", "\n", "$$Y_{it}-\\bar{Y}_{i}=\\beta (X_{it}-\\bar{X}_{i})+\\epsilon_{it}-\\bar{\\epsilon}_{i}$$\n", "\n", "We can write the equation above in a more compact way:\n", "\n", "$$\\ddot{Y}_{it}=\\beta \\ddot{X}_{it}+\\ddot{\\epsilon}_{it}$$\n", "\n", "Because we previously declared the firm as the unit of analysis in this panel data set, the computer implements Firm Fixed Effects (FE) automatically with the command "entity_effects=True".\n", "\n", "We added Firm Fixed Effects (FE) to the Difference-in-Differences (DID) specification, and the result didn't change much. The intuition is that the Difference-in-Differences (DID) technique had already mitigated the endogeneity problems.\n", "\n", "The St. Louis Fed policy decreased firm revenue by 17% ($1-e^{-0.1862}$). The result is statistically significant at the 10% level (p-value of 0.0589).\n", "\n", "\n" ] }, { "cell_type": "code", "metadata": { "id": "0W86zaONAWZr", "outputId": "e9226e85-f8cb-4c98-e988-9ad6a2afcdee", "colab": { "base_uri": "https://localhost:8080/", "height": 676 } }, "source": [ "firmFE = PanelOLS(Y, df[dd], entity_effects=True)\n", "print(firmFE.fit(cov_type='clustered', cluster_entity=True))" ], "execution_count": 10, "outputs": [ { "output_type": "stream", "text": [ " PanelOLS Estimation Summary \n", "====================================================================================\n", "Dep. Variable: log_total_output_value R-squared: 0.2649\n", "Estimator: PanelOLS R-squared (Between): -0.4554\n", "No. 
Observations: 1227 R-squared (Within): 0.2649\n", "Date: Tue, Oct 13 2020 R-squared (Overall): -0.4266\n", "Time: 00:31:36 Log-likelihood -202.61\n", "Cov. Estimator: Clustered \n", " F-statistic: 34.361\n", "Entities: 938 P-value 0.0000\n", "Avg Obs: 1.3081 Distribution: F(3,286)\n", "Min Obs: 1.0000 \n", "Max Obs: 2.0000 F-statistic (robust): 31.245\n", " P-value 0.0000\n", "Time periods: 2 Distribution: F(3,286)\n", "Avg Obs: 613.50 \n", "Min Obs: 575.00 \n", "Max Obs: 652.00 \n", " \n", " Parameter Estimates \n", "================================================================================\n", " Parameter Std. Err. T-stat P-value Lower CI Upper CI\n", "--------------------------------------------------------------------------------\n", "const 9.9656 0.4511 22.091 0.0000 9.0777 10.854\n", "st_louis_fed 1.9842 1.0365 1.9144 0.0566 -0.0559 4.0243\n", "year_1931 -0.3666 0.0657 -5.5843 0.0000 -0.4959 -0.2374\n", "louis_1931 -0.1862 0.0982 -1.8965 0.0589 -0.3794 0.0071\n", "================================================================================\n", "\n", "F-test for Poolability: 6.8220\n", "P-value: 0.0000\n", "Distribution: F(937,286)\n", "\n", "Included effects: Entity\n" ], "name": "stdout" }, { "output_type": "stream", "text": [ "/usr/local/lib/python3.6/dist-packages/linearmodels/panel/data.py:98: FutureWarning:\n", "\n", "is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead\n", "\n" ], "name": "stderr" } ] }, { "cell_type": "markdown", "metadata": { "id": "dimpuQdxHSqb" }, "source": [ "Fixed Effects (FE) can be implemented manually by adding dummy variables. There are different kinds of Fixed Effects. Let's add Industry Fixed Effects to the Difference-in-Differences (DID) specification to rule out the possibility that the results are driven by industry-specific shocks.\n", "\n", "The St. Louis Fed policy decreased firm revenue by 14.2% ($1-e^{-0.1533}$). 
The result is statistically significant at the 10% level.\n", "\n", "Why not add Firm and Industry Fixed Effects at the same time? It is possible and advisable, but the computer will not return any result because of multicollinearity. We have at most two observations (2 years) per firm. If we add one dummy variable for each firm, it is like running a regression with more variables than observations.\n", "\n", "In his paper, Ziebarth (2013) presents results using Firm and Industry Fixed Effects. How is that possible? Ziebarth (2013) used Stata. Stata automatically drops some variables when there is multicollinearity and outputs a result. Although this practice is widespread in top economics journals, it is not the \"true\" Fixed Effects." ] }, { "cell_type": "code", "metadata": { "id": "fWCtMdyg2IYF", "outputId": "bb7d9f47-af34-497b-e8f5-924317c316ae", "colab": { "base_uri": "https://localhost:8080/", "height": 1000 } }, "source": [ "industryFE = PanelOLS(Y, df[dd + ['industrycode']])\n", "print(industryFE.fit(cov_type='clustered', cluster_entity=True))" ], "execution_count": 11, "outputs": [ { "output_type": "stream", "text": [ "/usr/local/lib/python3.6/dist-packages/linearmodels/panel/data.py:98: FutureWarning:\n", "\n", "is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead\n", "\n" ], "name": "stderr" }, { "output_type": "stream", "text": [ " PanelOLS Estimation Summary \n", "====================================================================================\n", "Dep. Variable: log_total_output_value R-squared: 0.5498\n", "Estimator: PanelOLS R-squared (Between): 0.5462\n", "No. Observations: 1227 R-squared (Within): 0.3913\n", "Date: Tue, Oct 13 2020 R-squared (Overall): 0.5498\n", "Time: 00:31:36 Log-likelihood -1661.9\n", "Cov. 
Estimator: Clustered \n", " F-statistic: 29.971\n", "Entities: 938 P-value 0.0000\n", "Avg Obs: 1.3081 Distribution: F(48,1178)\n", "Min Obs: 1.0000 \n", "Max Obs: 2.0000 F-statistic (robust): -2.154e+16\n", " P-value 1.0000\n", "Time periods: 2 Distribution: F(48,1178)\n", "Avg Obs: 613.50 \n", "Min Obs: 575.00 \n", "Max Obs: 652.00 \n", " \n", " Parameter Estimates \n", "=====================================================================================\n", " Parameter Std. Err. T-stat P-value Lower CI Upper CI\n", "-------------------------------------------------------------------------------------\n", "const 10.391 0.2249 46.201 0.0000 9.9497 10.832\n", "st_louis_fed -0.1740 0.0805 -2.1610 0.0309 -0.3319 -0.0160\n", "year_1931 -0.3163 0.0660 -4.7922 0.0000 -0.4458 -0.1868\n", "louis_1931 -0.1533 0.0916 -1.6741 0.0944 -0.3331 0.0264\n", "industrycode.1005 -0.1966 0.3637 -0.5407 0.5888 -0.9101 0.5169\n", "industrycode.101 0.2349 0.2403 0.9775 0.3285 -0.2366 0.7064\n", "industrycode.1014 -0.1667 1.0812 -0.1542 0.8775 -2.2880 1.9546\n", "industrycode.103 1.5898 0.2938 5.4104 0.0000 1.0133 2.1663\n", "industrycode.104 1.1349 0.2818 4.0280 0.0001 0.5821 1.6877\n", "industrycode.105 0.9660 0.4191 2.3049 0.0213 0.1437 1.7882\n", "industrycode.107 0.9374 0.3073 3.0502 0.0023 0.3344 1.5404\n", "industrycode.110 0.2299 0.3447 0.6671 0.5049 -0.4463 0.9062\n", "industrycode.111 2.8271 0.4097 6.9010 0.0000 2.0233 3.6308\n", "industrycode.112 0.2676 0.4185 0.6394 0.5227 -0.5535 1.0888\n", "industrycode.114 2.7559 0.4623 5.9612 0.0000 1.8489 3.6630\n", "industrycode.116 0.4498 0.4351 1.0337 0.3015 -0.4039 1.3035\n", "industrycode.117 -0.1337 0.5436 -0.2461 0.8057 -1.2002 0.9327\n", "industrycode.118 0.1025 0.2483 0.4127 0.6799 -0.3847 0.5896\n", "industrycode.119 -0.3121 0.2336 -1.3360 0.1818 -0.7705 0.1462\n", "industrycode.1204 -0.0102 0.3031 -0.0338 0.9731 -0.6048 0.5844\n", "industrycode.123 0.2379 0.7093 0.3354 0.7374 -1.1537 1.6295\n", "industrycode.126 1.8847 0.4828 
3.9039 0.0001 0.9375 2.8319\n", "industrycode.128 -0.0025 0.5277 -0.0047 0.9963 -1.0377 1.0328\n", "industrycode.1303 2.6410 0.2293 11.519 0.0000 2.1912 3.0908\n", "industrycode.136 0.2537 0.2493 1.0178 0.3090 -0.2354 0.7428\n", "industrycode.1410 -1.4091 0.2967 -4.7496 0.0000 -1.9912 -0.8270\n", "industrycode.1502 1.4523 0.3858 3.7644 0.0002 0.6954 2.2093\n", "industrycode.1604 -0.5652 0.4195 -1.3473 0.1781 -1.3882 0.2579\n", "industrycode.1624 -1.0476 1.0108 -1.0365 0.3002 -3.0307 0.9355\n", "industrycode.1640 -0.8320 0.3613 -2.3030 0.0215 -1.5409 -0.1232\n", "industrycode.1676 0.5830 0.2512 2.3205 0.0205 0.0901 1.0759\n", "industrycode.1677 -0.7025 0.2346 -2.9947 0.0028 -1.1627 -0.2422\n", "industrycode.203 -0.6435 0.2241 -2.8712 0.0042 -1.0832 -0.2038\n", "industrycode.210B 1.7535 0.2293 7.6482 0.0000 1.3037 2.2033\n", "industrycode.216 2.7641 0.3549 7.7888 0.0000 2.0679 3.4604\n", "industrycode.234a 1.6871 0.5410 3.1187 0.0019 0.6258 2.7485\n", "industrycode.265 1.2065 0.4193 2.8774 0.0041 0.3838 2.0292\n", "industrycode.266 -0.3244 0.2736 -1.1857 0.2360 -0.8612 0.2124\n", "industrycode.304 1.2156 0.4982 2.4400 0.0148 0.2382 2.1930\n", "industrycode.314 0.8008 0.2411 3.3211 0.0009 0.3277 1.2740\n", "industrycode.317 -0.1470 0.2750 -0.5347 0.5930 -0.6866 0.3925\n", "industrycode.515 -0.4623 0.2862 -1.6153 0.1065 -1.0238 0.0992\n", "industrycode.518 -0.5243 0.2887 -1.8157 0.0697 -1.0908 0.0422\n", "industrycode.520 -0.2234 0.2731 -0.8179 0.4136 -0.7592 0.3124\n", "industrycode.614 1.8029 0.3945 4.5707 0.0000 1.0290 2.5768\n", "industrycode.622 2.9239 0.2430 12.035 0.0000 2.4473 3.4006\n", "industrycode.633 0.0122 1.5690 0.0078 0.9938 -3.0661 3.0906\n", "industrycode.651 1.9170 0.8690 2.2060 0.0276 0.2121 3.6220\n", "industrycode.652 2.5495 0.2211 11.531 0.0000 2.1158 2.9833\n", "=====================================================================================\n", "\n", "\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": 
"_HivLLv1QFVJ" }, "source": [ "Just for exercise purpose, suppose that the unobserved factor $\\alpha_i$ is ignored. This assumption is called Random Effects (RE). In this case, $\\alpha_i$ will be inside the error term $v_{it}$ and potentially biased the results.\n", "\n", "$$Y_{it}=\\beta X_{it}+v_{it}$$\n", "\n", "$$v_{it}= \\alpha_i+\\epsilon_{it}$$\n", "\n", "In an experiment, the treatment variable is uncorrelated with the unobserved factor $\\alpha_i$. In this case, Random Effects (RE) model has the advantage of producing lower standard errors than the Fixed Effects models.\n", "\n", "Note that if we run a simple Random Effects (RE) regression, we might conclude wrongly that St. Louis Fed policy increased the firm revenue in 7%." ] }, { "cell_type": "code", "metadata": { "id": "dpZ95aJQEjTj", "outputId": "a6c755b5-baf7-4ef0-eff1-abdfdea380a0", "colab": { "base_uri": "https://localhost:8080/", "height": 537 } }, "source": [ "from linearmodels import RandomEffects\n", "re = RandomEffects(Y, df[['const', 'st_louis_fed']])\n", "print(re.fit(cov_type='clustered', cluster_entity=True))" ], "execution_count": 12, "outputs": [ { "output_type": "stream", "text": [ " RandomEffects Estimation Summary \n", "====================================================================================\n", "Dep. Variable: log_total_output_value R-squared: 0.3689\n", "Estimator: RandomEffects R-squared (Between): -0.0001\n", "No. Observations: 1227 R-squared (Within): 0.0025\n", "Date: Tue, Oct 13 2020 R-squared (Overall): -0.0027\n", "Time: 00:31:36 Log-likelihood -1260.9\n", "Cov. 
Estimator: Clustered \n", " F-statistic: 716.04\n", "Entities: 938 P-value 0.0000\n", "Avg Obs: 1.3081 Distribution: F(1,1225)\n", "Min Obs: 1.0000 \n", "Max Obs: 2.0000 F-statistic (robust): 0.5811\n", " P-value 0.4460\n", "Time periods: 2 Distribution: F(1,1225)\n", "Avg Obs: 613.50 \n", "Min Obs: 575.00 \n", "Max Obs: 652.00 \n", " \n", " Parameter Estimates \n", "================================================================================\n", " Parameter Std. Err. T-stat P-value Lower CI Upper CI\n", "--------------------------------------------------------------------------------\n", "const 10.518 0.0603 174.36 0.0000 10.399 10.636\n", "st_louis_fed 0.0721 0.0946 0.7623 0.4460 -0.1135 0.2577\n", "================================================================================\n" ], "name": "stdout" }, { "output_type": "stream", "text": [ "/usr/local/lib/python3.6/dist-packages/linearmodels/panel/data.py:98: FutureWarning:\n", "\n", "is_categorical is deprecated and will be removed in a future version. Use is_categorical_dtype instead\n", "\n" ], "name": "stderr" } ] }, { "cell_type": "markdown", "metadata": { "id": "_LhtUSOZE5s3" }, "source": [ "## Exercises" ] }, { "cell_type": "markdown", "metadata": { "id": "RDRjlLenk4cv" }, "source": [ "1| Suppose a non-experimental setting in which the control group differs from the treatment group. Is it reasonable to use Difference-in-Differences (DID) to estimate a causal effect? Justify. Should you modify or add something to the DID framework?\n" ] }, { "cell_type": "markdown", "metadata": { "id": "pMoxzoJqrmnA" }, "source": [ "2| Suppose a study claims, based on the Difference-in-Differences (DID) method, that the Fed avoided massive business failures via the bank bailout of 2008. Suppose another study, based on Regression Discontinuity (RD), claims the opposite or denies the impact of the Fed on business failures. Which empirical strategy do you think is more credible for estimating the causal impact of Fed policy, DID or RD? 
Justify your answer.\n" ] }, { "cell_type": "markdown", "metadata": { "id": "0b7GlpeOo-Eq" }, "source": [ "3| In panel data where the unit of analysis can be the firm or the county, which result is more credible: the one at the firm level or the one at the county level? Justify." ] }, { "cell_type": "markdown", "metadata": { "id": "us8Ii1dzk4n0" }, "source": [ "4| Use the data from Ziebarth (2013) to estimate the impact of the St. Louis Fed policy on firm revenue. Specifically, run Difference-in-Differences (DID) with Random Effects (RE). Interpret the result. What can be inferred about the unobserved factor $\\alpha_i$? " ] }, { "cell_type": "markdown", "metadata": { "id": "1nPGFpyul--6" }, "source": [ "5| Use the data from Ziebarth (2013) to estimate the impact of the St. Louis Fed policy on firm revenue. Specifically, run Difference-in-Differences (DID) with Firm Fixed Effects (FE) without using the command "entity_effects=True". Hint: You must use dummy variables for each firm." ] }, { "cell_type": "markdown", "metadata": { "id": "oF_8ckcyfmiP" }, "source": [ "## References" ] }, { "cell_type": "markdown", "metadata": { "id": "ZOwX1_TNpbMt" }, "source": [ "Keynes, John Maynard. (1936). [The General Theory of Employment, Interest and Money](https://www.marxists.org/reference/subject/economics/keynes/general-theory/). Harcourt, Brace and Company, and printed in the U.S.A. by the Polygraphic Company of America, New York. \n", "\n", "Richardson, Gary, and William Troost. (2009). [Monetary Intervention Mitigated Banking Panics during the Great Depression: Quasi-Experimental Evidence from a Federal Reserve District Border, 1929-1933](https://github.com/causal-methods/Papers/raw/master/richardson_troost_2009_jpe.pdf). Journal of Political Economy, 117 (6): 1031-73. \n", "\n", "Ziebarth, Nicolas L. (2013). 
[Identifying the Effects of Bank Failures from a Natural Experiment in Mississippi during the Great Depression](https://github.com/causal-methods/Papers/raw/master/Identifying%20the%20Effects%20of%20Bank%20Failures.pdf). American Economic Journal: Macroeconomics, 5 (1): 81-101. " ] } ] }