{ "nbformat": 4, "nbformat_minor": 0, "metadata": { "colab": { "name": "8) Do Hosts Discriminate against Black Guests in Airbnb?.ipynb", "provenance": [] }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.9" } }, "cells": [ { "cell_type": "markdown", "metadata": { "id": "mfxvKsxwpZjN" }, "source": [ "# 8) Do Hosts Discriminate against Black Guests in Airbnb?" ] }, { "cell_type": "markdown", "metadata": { "id": "kRSC3oJZQSQ2" }, "source": [ "[Vitor Kamada](https://www.linkedin.com/in/vitor-kamada-1b73a078)\n", "\n", "E-mail: econometrics.methods@gmail.com\n", "\n", "Last updated: 11-1-2020" ] }, { "cell_type": "markdown", "metadata": { "id": "nLMyEL1yszwC" }, "source": [ "Edelman et al. (2017) found that guests with distinctively Black-sounding names are 16% less likely to be accepted on Airbnb than otherwise identical guests with distinctively White-sounding names. This result is not a mere correlation: the race signaled by the guest's name was randomly assigned. The name is the only difference between the Black and White guest profiles; in every other respect they are the same." ] }, { "cell_type": "markdown", "metadata": { "id": "Z0l_mfBDpnVh" }, "source": [ "Let's open the dataset of Edelman et al. (2017). Each row is an Airbnb property contacted in July 2015. The sample is composed of all properties in Baltimore, Dallas, Los Angeles, St. Louis, and Washington, DC." ] }, { "cell_type": "code", "metadata": { "id": "3wG1YQm6B55d", "outputId": "7480241c-8bec-4e39-fb8f-0cd60265d825", "colab": { "base_uri": "https://localhost:8080/", "height": 425 } }, "source": [ "import numpy as np\n", "import pandas as pd\n", "# Use the full option name; the bare 'precision' key is\n", "# ambiguous in recent pandas versions\n", "pd.set_option('display.precision', 3)\n", "\n", "# Data from Edelman et al.
(2017)\n", "path = \"https://github.com/causal-methods/Data/raw/master/\" \n", "df = pd.read_csv(path + \"Airbnb.csv\")\n", "df.head(5)" ], "execution_count": 1, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " 
\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
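| | host_response | response_date | number_of_messages | automated_coding | latitude | longitude | bed_type | property_type | cancellation_policy | number_guests | bedrooms | bathrooms | cleaning_fee | price | apt_rating | property_setup | city | date_sent | listing_down | number_of_listings | number_of_reviews | member_since | verified_id | host_race | super_host | host_gender | host_age | host_gender_1 | host_gender_2 | host_gender_3 | host_race_1 | host_race_2 | host_race_3 | guest_first_name | guest_last_name | guest_race | guest_gender | guest_id | population | whites | ... | host_gender_FF | host_gender_M | host_gender_MM | host_gender_MF | host_gender_same_sex | host_age_cat | ten_reviews | five_star_property | multiple_listings | shared_property | shared_bathroom | has_cleaning_fee | strict_cancellation | young | middle | old | pricey | price_median | log_price | white_proportion | black_proportion | asian_proportion | hispanic_proportion | tract_listings | log_tract_listings | simplified_host_response | graph_bins | yes | baltimore | dallas | los_angeles | sl | dc | total_guests | raw_black | prop_black | any_black | past_guest_merge | filled_september | pr_filled |
| 0 | Yes | 2015-07-19 08:26:17 | 2.0 | 1.0 | 34.081 | -118.270 | Real Bed | House | Flexible | 3.0 | 3.0 | 3.0 | 30.0 | 99.0 | 5.0 | Private Room | Los-Angeles | 2015-07-19 01:34:00 | 0.0 | 1.0 | 8.0 | March 2008 | 1.0 | white | NaN | M | young/middle | M | M | . | white | white | . | Brad | Walsh | white | male | 6.0 | 3340.0 | 1789.0 | ... | 0 | 1 | 0 | 0 | 0 | 1.0 | 0 | 1 | 0 | 1 | 0 | 1 | 0 | 0 | 1 | 0 | 0 | 0 | 4.595 | 0.536 | 0.030 | 0.145 | 0.557 | 16 | 2.773 | Yes | Yes | 1.0 | 0 | 0 | 1 | 0 | 0 | 11.0 | 0.0 | 0.0 | 0.0 | matched (3) | 1 | 0.412 |
| 1 | No or unavailable | 2015-07-14 14:13:39 | NaN | 1.0 | 38.911 | -77.020 | NaN | House | Moderate | 2.0 | 2.0 | 2.0 | NaN | 125.0 | 5.0 | Private Room | Washington | 2015-07-14 09:53:00 | 0.0 | 3.0 | 185.0 | September 2008 | 1.0 | hisp | NaN | F | young | F | F | F | white | hisp | hisp | Brad | Walsh | white | male | 6.0 | 2143.0 | 847.0 | ... | 0 | 0 | 0 | 0 | 0 | 0.0 | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 | 4.828 | 0.395 | 0.448 | 0.057 | 0.089 | 19 | 2.944 | No | No | 0.0 | 0 | 0 | 0 | 0 | 1 | 167.0 | 0.0 | 0.0 | 0.0 | matched (3) | 1 | 0.686 |
| 2 | Request for more info (Can you verify? How man... | 2015-07-20 16:24:08 | 2.0 | 0.0 | 34.005 | -118.481 | Pull-out Sofa | Apartment | Strict | 1.0 | 1.0 | 1.0 | 100.0 | 135.0 | 5.0 | Private Room | Los-Angeles | 2015-07-20 11:25:00 | 0.0 | 2.0 | 20.0 | September 2008 | 0.0 | white | NaN | F | middle/young | F | F | . | white | white | . | Brad | Walsh | white | male | 6.0 | 5700.0 | 4648.0 | ... | 0 | 0 | 0 | 0 | 0 | 1.0 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 0 | 1 | 0 | 0 | 1 | 4.905 | 0.815 | 0.046 | 0.054 | 0.119 | 21 | 3.045 | Requests more information | Conditional No | 0.0 | 0 | 0 | 1 | 0 | 0 | 19.0 | 0.0 | 0.0 | 0.0 | matched (3) | 0 | 0.331 |
| 3 | I will get back to you | 2015-07-20 06:47:38 | NaN | 0.0 | 34.092 | -118.282 | NaN | House | Strict | 8.0 | 8.0 | 8.0 | 115.0 | 319.0 | 5.0 | Entire Place | Los-Angeles | 2015-07-20 02:44:00 | 0.0 | 1.0 | 42.0 | September 2008 | 1.0 | white | NaN | mix | middle | M | mix | mix | white | white | mult | Tanisha | Jackson | black | female | 15.0 | 2235.0 | 1393.0 | ... | 0 | 0 | 0 | 0 | 0 | 2.0 | 1 | 1 | 0 | 0 | 0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 5.765 | 0.623 | 0.043 | 0.109 | 0.381 | 11 | 2.398 | Not sure or check later | Conditional No | 0.0 | 0 | 0 | 1 | 0 | 0 | 41.0 | 0.0 | 0.0 | 0.0 | matched (3) | 0 | 0.536 |
| 4 | Message not sent | . | NaN | 1.0 | 38.830 | -76.897 | Real Bed | House | Strict | 2.0 | 2.0 | 2.0 | 35.0 | 40.0 | 5.0 | Private Room | Washington | . | 0.0 | 1.0 | 37.0 | October 2008 | 0.0 | mult | NaN | FF | middle/young | FF | FF | . | mult | mult | . | Lakisha | Jones | black | female | 11.0 | 4696.0 | 482.0 | ... | 1 | 0 | 0 | 0 | 1 | 1.0 | 1 | 1 | 0 | 1 | 0 | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 3.689 | 0.103 | 0.809 | 0.034 | 0.057 | 2 | 0.693 | NaN | NaN | NaN | 0 | 0 | 0 | 0 | 1 | 28.0 | 0.0 | 0.0 | 0.0 | matched (3) | 1 | 0.555 |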
\n", "

5 rows × 104 columns

\n", "
" ], "text/plain": [ " host_response ... pr_filled\n", "0 Yes ... 0.412\n", "1 No or unavailable ... 0.686\n", "2 Request for more info (Can you verify? How man... ... 0.331\n", "3 I will get back to you ... 0.536\n", "4 Message not sent ... 0.555\n", "\n", "[5 rows x 104 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 1 } ] }, { "cell_type": "markdown", "metadata": { "id": "Aji09URQpjLa" }, "source": [ "The chart below shows that Black guests receive fewer \"Yes\" responses from hosts than White guests. Somebody might argue that the results of Edelman et al. (2017) are driven by differences in intermediate host responses, such as conditional answers or non-response. For example, one could argue that profiles with Black-sounding names are more likely to be treated as fake accounts and categorized as spam. However, note that the discrimination results are driven by the outright \"Yes\" and \"No\" responses and not by the intermediate responses." ] }, { "cell_type": "code", "metadata": { "id": "UKNW_qjTm60G", "outputId": "9d41347d-d327-4eb8-8932-fe72c2ce635f", "colab": { "base_uri": "https://localhost:8080/", "height": 542 } }, "source": [ "# Data for bar chart\n", "count = pd.crosstab(df[\"graph_bins\"], df[\"guest_black\"])\n", "\n", "import plotly.graph_objects as go\n", "\n", "node = ['Conditional No', 'Conditional Yes', 'No',\n", " 'No Response', 'Yes']\n", "fig = go.Figure(data=[\n", " go.Bar(name='Guest is white', x=node, y=count[0]),\n", " go.Bar(name='Guest is African American', x=node, y=count[1]) ])\n", "\n", "fig.update_layout(barmode='group',\n", " title_text = 'Host Responses by Race',\n", " font=dict(size=18) )\n", "\n", "fig.show()" ], "execution_count": 2, "outputs": [ { "output_type": "display_data", "data": { "text/html": [ "\n", "\n", "\n", "
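[Figure: grouped bar chart 'Host Responses by Race'. x-axis: Conditional No, Conditional Yes, No, No Response, Yes; bar series: Guest is white, Guest is African American.]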
\n", "\n", "" ] }, "metadata": { "tags": [] } } ] }, { "cell_type": "markdown", "metadata": { "id": "PwsK4pm9N_LQ" }, "source": [ "Let's replicate the main results of Edelman et al. (2017)." ] }, { "cell_type": "code", "metadata": { "id": "NV30eUJCGpqX", "outputId": "707117c4-09e8-47a9-9153-5317fb3b3f8d", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "import statsmodels.api as sm\n", "\n", "df['const'] = 1 \n", "\n", "# Column 1\n", "# The default missing ='drop' of statsmodels doesn't apply\n", "# to the cluster variable. Therefore, it is necessary to drop\n", "# the missing values like below to get the clustered standard \n", "# errors.\n", "df1 = df.dropna(subset=['yes', 'guest_black', 'name_by_city'])\n", "reg1 = sm.OLS(df1['yes'], df1[['const', 'guest_black']])\n", "res1 = reg1.fit(cov_type='cluster',\n", " cov_kwds={'groups': df1['name_by_city']})\n", "\n", "# Column 2\n", "vars2 = ['yes', 'guest_black', 'name_by_city', \n", " 'host_race_black', 'host_gender_M']\n", "df2 = df.dropna(subset = vars2)\n", "reg2 = sm.OLS(df2['yes'], df2[['const', 'guest_black',\n", " 'host_race_black', 'host_gender_M']])\n", "res2 = reg2.fit(cov_type='cluster',\n", " cov_kwds={'groups': df2['name_by_city']})\n", "\n", "# Column 3\n", "vars3 = ['yes', 'guest_black', 'name_by_city', \n", " 'host_race_black', 'host_gender_M',\n", " 'multiple_listings', 'shared_property',\n", " 'ten_reviews', 'log_price']\n", "df3 = df.dropna(subset = vars3)\n", "reg3 = sm.OLS(df3['yes'], df3[['const', 'guest_black',\n", " 'host_race_black', 'host_gender_M',\n", " 'multiple_listings', 'shared_property',\n", " 'ten_reviews', 'log_price']])\n", "res3 = reg3.fit(cov_type='cluster',\n", " cov_kwds={'groups': df3['name_by_city']})\n", "\n", "columns =[res1, res2, res3]" ], "execution_count": 3, "outputs": [ { "output_type": "stream", "text": [ "/usr/local/lib/python3.6/dist-packages/statsmodels/tools/_testing.py:19: FutureWarning:\n", "\n", "pandas.util.testing is deprecated. 
Use the functions in the public API at pandas.testing instead.\n", "\n" ], "name": "stderr" } ] }, { "cell_type": "code", "metadata": { "id": "y8viPqNKct8h", "outputId": "554e40c7-bbdb-4f3b-ae43-2533de0bbce0", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "# Library to print publication-quality\n", "# tables in LaTeX, HTML, etc.\n", "!pip install stargazer" ], "execution_count": 4, "outputs": [ { "output_type": "stream", "text": [ "Requirement already satisfied: stargazer in /usr/local/lib/python3.6/dist-packages (0.0.5)\n" ], "name": "stdout" } ] }, { "cell_type": "markdown", "metadata": { "id": "8CTftYlrCdNP" }, "source": [ "In column 1, guests with White-sounding names are accepted about 49% of the time, whereas guests with Black-sounding names are accepted about 41% of the time. Therefore, a Black-sounding name carries a penalty of 8 percentage points, which is roughly 16% of the White acceptance rate (0.080 / 0.488). This result is remarkably robust to the control variables added in columns 2 and 3." ] }, { "cell_type": "code", "metadata": { "id": "dMHO77Flch3t", "outputId": "1dec8334-f8f0-4052-e918-8bd4d37ae357", "colab": { "base_uri": "https://localhost:8080/", "height": 587 } }, "source": [ "# Settings for a nice table\n", "from stargazer.stargazer import Stargazer\n", "stargazer = Stargazer(columns)\n", "stargazer.title('The Impact of Race on Likelihood of Acceptance')\n", "stargazer" ], "execution_count": 5, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "The Impact of Race on Likelihood of Acceptance
\n", "
Dependent variable: yes
| | (1) | (2) | (3) |
| const | 0.488*** | 0.497*** | 0.755*** |
| | (0.012) | (0.013) | (0.067) |
| guest_black | -0.080*** | -0.080*** | -0.087*** |
| | (0.017) | (0.017) | (0.017) |
| host_gender_M | | -0.050*** | -0.048*** |
| | | (0.014) | (0.014) |
| host_race_black | | 0.069*** | 0.093*** |
| | | (0.023) | (0.023) |
| log_price | | | -0.062*** |
| | | | (0.013) |
| multiple_listings | | | 0.062*** |
| | | | (0.015) |
| shared_property | | | -0.068*** |
| | | | (0.017) |
| ten_reviews | | | 0.120*** |
| | | | (0.013) |
| Observations | 6,235 | 6,235 | 6,168 |
| R2 | 0.006 | 0.010 | 0.040 |
| Adjusted R2 | 0.006 | 0.009 | 0.039 |
| Residual Std. Error | 0.496 (df=6233) | 0.495 (df=6231) | 0.488 (df=6160) |
| F Statistic | 21.879*** (df=1; 6233) | 15.899*** (df=3; 6231) | 35.523*** (df=7; 6160) |
Note:\n", " *p<0.1;\n", " **p<0.05;\n", " ***p<0.01\n", "
" ], "text/plain": [ "" ] }, "metadata": { "tags": [] }, "execution_count": 5 } ] }, { "cell_type": "markdown", "metadata": { "id": "i511M34aFziV" }, "source": [ "The table below presents summary statistics for the hosts and properties. In a randomized experiment, the mean of each control variable should be approximately the same in the treatment group and in the control group as it is in the full sample." ] }, { "cell_type": "code", "metadata": { "id": "Dw1KePxjihFt", "outputId": "153bb7dc-3953-49ab-c283-2074685de78e", "colab": { "base_uri": "https://localhost:8080/", "height": 417 } }, "source": [ "control = ['host_race_white', 'host_race_black', 'host_gender_F',\n", "           'host_gender_M', 'price', 'bedrooms', 'bathrooms', 'number_of_reviews',\n", "           'multiple_listings', 'any_black', 'tract_listings', 'black_proportion']\n", "\n", "# Summary statistics for the control variables only\n", "df[control].describe().T" ], "execution_count": 6, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
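| | count | mean | std | min | 25% | 50% | 75% | max |
| host_race_white | 6392.0 | 0.634 | 0.482 | 0.0 | 0.00 | 1.00 | 1.000 | 1.000 |
| host_race_black | 6392.0 | 0.078 | 0.269 | 0.0 | 0.00 | 0.00 | 0.000 | 1.000 |
| host_gender_F | 6392.0 | 0.376 | 0.485 | 0.0 | 0.00 | 0.00 | 1.000 | 1.000 |
| host_gender_M | 6392.0 | 0.298 | 0.457 | 0.0 | 0.00 | 0.00 | 1.000 | 1.000 |
| price | 6302.0 | 181.108 | 1280.228 | 10.0 | 75.00 | 109.00 | 175.000 | 100000.000 |
| bedrooms | 6242.0 | 3.177 | 2.265 | 1.0 | 2.00 | 2.00 | 4.000 | 16.000 |
| bathrooms | 6285.0 | 3.169 | 2.264 | 1.0 | 2.00 | 2.00 | 4.000 | 16.000 |
| number_of_reviews | 6390.0 | 30.869 | 72.505 | 0.0 | 2.00 | 9.00 | 29.000 | 1208.000 |
| multiple_listings | 6392.0 | 0.326 | 0.469 | 0.0 | 0.00 | 0.00 | 1.000 | 1.000 |
| any_black | 6390.0 | 0.282 | 0.450 | 0.0 | 0.00 | 0.00 | 1.000 | 1.000 |
| tract_listings | 6392.0 | 9.514 | 9.277 | 1.0 | 2.00 | 6.00 | 14.000 | 53.000 |
| black_proportion | 6378.0 | 0.140 | 0.203 | 0.0 | 0.03 | 0.05 | 0.142 | 0.984 |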
\n", "
" ], "text/plain": [ " count mean std ... 50% 75% max\n", "host_race_white 6392.0 0.634 0.482 ... 1.00 1.000 1.000\n", "host_race_black 6392.0 0.078 0.269 ... 0.00 0.000 1.000\n", "host_gender_F 6392.0 0.376 0.485 ... 0.00 1.000 1.000\n", "host_gender_M 6392.0 0.298 0.457 ... 0.00 1.000 1.000\n", "price 6302.0 181.108 1280.228 ... 109.00 175.000 100000.000\n", "bedrooms 6242.0 3.177 2.265 ... 2.00 4.000 16.000\n", "bathrooms 6285.0 3.169 2.264 ... 2.00 4.000 16.000\n", "number_of_reviews 6390.0 30.869 72.505 ... 9.00 29.000 1208.000\n", "multiple_listings 6392.0 0.326 0.469 ... 0.00 1.000 1.000\n", "any_black 6390.0 0.282 0.450 ... 0.00 1.000 1.000\n", "tract_listings 6392.0 9.514 9.277 ... 6.00 14.000 53.000\n", "black_proportion 6378.0 0.140 0.203 ... 0.05 0.142 0.984\n", "\n", "[12 rows x 8 columns]" ] }, "metadata": { "tags": [] }, "execution_count": 6 } ] }, { "cell_type": "markdown", "metadata": { "id": "jOGYLV2pGPiU" }, "source": [ "The balance tests (t-tests) below show that the hosts and properties contacted by Black and White guest profiles are statistically indistinguishable: no difference in means is significant at conventional levels (the smallest p-value is 0.154)." ] }, { "cell_type": "code", "metadata": { "id": "HMK0M1EiihId" }, "source": [ "result = []\n", "\n", "for var in control:\n", "    # Run the t-test as an OLS regression on the treatment\n", "    # dummy and save the p-value of guest_black\n", "    pvalue = sm.OLS(df[var], df[['const', 'guest_black']],\n", "                    missing='drop').fit().pvalues['guest_black']\n", "    result.append(pvalue)" ], "execution_count": 7, "outputs": [] }, { "cell_type": "code", "metadata": { "id": "Y1sFdeAkiqJ2", "outputId": "abc750b2-1196-4aad-90c2-61d4b7eda5dd", "colab": { "base_uri": "https://localhost:8080/", "height": 417 } }, "source": [ "# Mean of each control variable by treatment status\n", "ttest = df.groupby('guest_black')[control].agg(['mean']).T\n", "ttest['p-value'] = result\n", "ttest" ], "execution_count": 8, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
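| variable | statistic | guest_black = 0.0 | guest_black = 1.0 | p-value |
| host_race_white | mean | 0.643 | 0.626 | 0.154 |
| host_race_black | mean | 0.078 | 0.078 | 0.972 |
| host_gender_F | mean | 0.381 | 0.372 | 0.439 |
| host_gender_M | mean | 0.298 | 0.299 | 0.896 |
| price | mean | 166.429 | 195.815 | 0.362 |
| bedrooms | mean | 3.178 | 3.176 | 0.962 |
| bathrooms | mean | 3.172 | 3.167 | 0.927 |
| number_of_reviews | mean | 30.709 | 31.030 | 0.860 |
| multiple_listings | mean | 0.321 | 0.330 | 0.451 |
| any_black | mean | 0.287 | 0.277 | 0.382 |
| tract_listings | mean | 9.494 | 9.538 | 0.848 |
| black_proportion | mean | 0.141 | 0.140 | 0.919 |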
\n", "
" ], "text/plain": [ "guest_black 0.0 1.0 p-value\n", "host_race_white mean 0.643 0.626 0.154\n", "host_race_black mean 0.078 0.078 0.972\n", "host_gender_F mean 0.381 0.372 0.439\n", "host_gender_M mean 0.298 0.299 0.896\n", "price mean 166.429 195.815 0.362\n", "bedrooms mean 3.178 3.176 0.962\n", "bathrooms mean 3.172 3.167 0.927\n", "number_of_reviews mean 30.709 31.030 0.860\n", "multiple_listings mean 0.321 0.330 0.451\n", "any_black mean 0.287 0.277 0.382\n", "tract_listings mean 9.494 9.538 0.848\n", "black_proportion mean 0.141 0.140 0.919" ] }, "metadata": { "tags": [] }, "execution_count": 8 } ] }, { "cell_type": "markdown", "metadata": { "id": "_LhtUSOZE5s3" }, "source": [ "## Exercises" ] }, { "cell_type": "markdown", "metadata": { "id": "tx6eyoPl3yWU" }, "source": [ "1| To the best of my knowledge, the 3 most important empirical papers in the literature on racial discrimination are Bertrand & Mullainathan (2004), Oreopoulos (2011), and Edelman et al. (2017). These 3 papers use field experiments to identify causal effects and rule out confounding factors. Search the Internet and compile a reference list of experimental papers about racial discrimination." ] }, { "cell_type": "markdown", "metadata": { "id": "zk0rssc65YXj" }, "source": [ "2| Name a topic that you are passionate about. Compile a reference list of experimental papers on that topic." ] }, { "cell_type": "markdown", "metadata": { "id": "m9vmejzD1vSL" }, "source": [ "3| Suppose somebody argues that specific names drive the results of Edelman et al. (2017). In the tables below, you can see that only a few distinct names represent each race. How can this criticism be refuted? What can you do to show that the results are not driven by specific names?"
] }, { "cell_type": "code", "metadata": { "id": "o-cK7fW-SnA_", "outputId": "80c49260-be15-4fa3-9d9f-337a92bd6c67", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "female = df['guest_gender']=='female'\n", "df[female].groupby(['guest_race', 'guest_first_name'])['yes'].mean()" ], "execution_count": 9, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "guest_race guest_first_name\n", "black Lakisha 0.433\n", " Latonya 0.370\n", " Latoya 0.442\n", " Tamika 0.482\n", " Tanisha 0.413\n", "white Allison 0.500\n", " Anne 0.567\n", " Kristen 0.486\n", " Laurie 0.508\n", " Meredith 0.498\n", "Name: yes, dtype: float64" ] }, "metadata": { "tags": [] }, "execution_count": 9 } ] }, { "cell_type": "code", "metadata": { "id": "IuQ3O1-MUkkP", "outputId": "3e3755ca-144b-4638-9877-95f51f882782", "colab": { "base_uri": "https://localhost:8080/" } }, "source": [ "male = df['guest_gender']=='male'\n", "df[male].groupby(['guest_race', 'guest_first_name'])['yes'].mean()" ], "execution_count": 10, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "guest_race guest_first_name\n", "black Darnell 0.412\n", " Jamal 0.354\n", " Jermaine 0.379\n", " Kareem 0.436\n", " Leroy 0.371\n", " Rasheed 0.409\n", " Tyrone 0.377\n", "white Brad 0.419\n", " Brent 0.494\n", " Brett 0.466\n", " Greg 0.467\n", " Jay 0.581\n", " Todd 0.448\n", "Name: yes, dtype: float64" ] }, "metadata": { "tags": [] }, "execution_count": 10 } ] }, { "cell_type": "markdown", "metadata": { "id": "kyWJyokoyR-I" }, "source": [ "4| Is there any potential research question that can be explored based on the table below? Justify." 
] }, { "cell_type": "code", "metadata": { "id": "slexRu8m0O-M", "outputId": "b71d8c7e-2d1d-40fe-ab7f-36936bc1e0d5", "colab": { "base_uri": "https://localhost:8080/", "height": 570 } }, "source": [ "pd.crosstab(index= [df['host_gender_F'], df['host_race']],\n", " columns=[df['guest_gender'], df['guest_race']], \n", " values=df['yes'], aggfunc='mean')" ], "execution_count": 11, "outputs": [ { "output_type": "execute_result", "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
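| host_gender_F | host_race | female, black | female, white | male, black | male, white |
| 0 | UU | 0.400 | 0.542 | 0.158 | 0.381 |
| 0 | asian | 0.319 | 0.378 | 0.474 | 0.511 |
| 0 | black | 0.444 | 0.643 | 0.419 | 0.569 |
| 0 | hisp | 0.464 | 0.571 | 0.375 | 0.478 |
| 0 | mult | 0.568 | 0.727 | 0.408 | 0.357 |
| 0 | unclear | 0.444 | 0.500 | 0.444 | 0.333 |
| 0 | unclear_three votes | 0.476 | 0.392 | 0.368 | 0.367 |
| 0 | white | 0.383 | 0.514 | 0.386 | 0.449 |
| 1 | UU | 0.444 | 0.250 | 0.333 | 0.750 |
| 1 | asian | 0.429 | 0.607 | 0.436 | 0.460 |
| 1 | black | 0.603 | 0.537 | 0.397 | 0.446 |
| 1 | hisp | 0.391 | 0.667 | 0.292 | 0.389 |
| 1 | unclear | 0.600 | 0.556 | 0.125 | 0.400 |
| 1 | unclear_three votes | 0.387 | 0.583 | 0.312 | 0.657 |
| 1 | white | 0.450 | 0.494 | 0.370 | 0.476 |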
\n", "
" ], "text/plain": [ "guest_gender female male \n", "guest_race black white black white\n", "host_gender_F host_race \n", "0 UU 0.400 0.542 0.158 0.381\n", " asian 0.319 0.378 0.474 0.511\n", " black 0.444 0.643 0.419 0.569\n", " hisp 0.464 0.571 0.375 0.478\n", " mult 0.568 0.727 0.408 0.357\n", " unclear 0.444 0.500 0.444 0.333\n", " unclear_three votes 0.476 0.392 0.368 0.367\n", " white 0.383 0.514 0.386 0.449\n", "1 UU 0.444 0.250 0.333 0.750\n", " asian 0.429 0.607 0.436 0.460\n", " black 0.603 0.537 0.397 0.446\n", " hisp 0.391 0.667 0.292 0.389\n", " unclear 0.600 0.556 0.125 0.400\n", " unclear_three votes 0.387 0.583 0.312 0.657\n", " white 0.450 0.494 0.370 0.476" ] }, "metadata": { "tags": [] }, "execution_count": 11 } ] }, { "cell_type": "markdown", "metadata": { "id": "NZP-ULypPfqH" }, "source": [ "5| In Edelman et al. (2017), the variable \"name_by_city\" was used to cluster the standard errors. How was the variable \"name_by_city\" created based on other variables? Show the code.\n", "\n" ] }, { "cell_type": "markdown", "metadata": { "id": "kXfXnqgXgT9g" }, "source": [ "6| Use the data from Edelman et al. (2017) to test the homophily hypothesis that hosts might prefer guests of the same race. Produce a nice table using the library Stargazer. Interpret the results. " ] }, { "cell_type": "markdown", "metadata": { "id": "YaHJyQDpU_pu" }, "source": [ "7| Overall, people know that socioeconomic status is correlated with race. Fryer & Levitt (2004) showed that distinct/unique African American names are correlated with lower socioeconomic status. Edelman et al. (2017: 17) clearly state: \"Our findings cannot identify whether the discrimination is based on race, socioeconomic status, or a combination of these two.\"\n", "Propose an experimental design to disentangle the effect of race from socioeconomic status. Explain your assumptions and describe the procedures in detail." 
] }, { "cell_type": "markdown", "metadata": { "id": "i-MPIny2O8Om" }, "source": [ "## References" ] }, { "cell_type": "markdown", "metadata": { "id": "ZOwX1_TNpbMt" }, "source": [ "Bertrand, Marianne, and Sendhil Mullainathan. (2004). [Are Emily and Greg More Employable Than Lakisha and Jamal? A Field Experiment on Labor Market Discrimination](https://github.com/causal-methods/Papers/raw/master/Are%20Emily%20and%20Greg%20More%20Employable%20than%20Lakisha%20and%20Jamal.pdf). American Economic Review, 94 (4): 991-1013.\n", "\n", "Edelman, Benjamin, Michael Luca, and Dan Svirsky. (2017). [Racial Discrimination in the Sharing Economy: Evidence from a Field Experiment](https://github.com/causal-methods/Papers/raw/master/Racial%20Discrimination%20in%20the%20Sharing%20Economy.pdf). American Economic Journal: Applied Economics, 9 (2): 1-22.\n", "\n", "Fryer, Roland G., Jr., and Steven D. Levitt. (2004). The Causes and Consequences of Distinctively Black Names. Quarterly Journal of Economics, 119 (3): 767-805.\n", "\n", "Oreopoulos, Philip. (2011). [Why Do Skilled Immigrants Struggle in the Labor Market? A Field Experiment with Thirteen Thousand Resumes](https://github.com/causal-methods/Papers/raw/master/Oreopoulos/Why%20Do%20Skilled%20Immigrants%20Struggle%20in%20the%20Labor%20Market.pdf). American Economic Journal: Economic Policy, 3 (4): 148-71.\n" ] } ] }