#1 Hypothesis Testing: Effect Of Recession On Housing Prices

The following is part of my Data Science project log.

Every year, more and more high school graduates in the United States opt for tertiary education, some out of interest, some out of necessity. This has led to a rise in the number of university towns across the country.

University Town: A city, town, or district, that is dominated by its university population.


Though they are still sparse compared to the total number of cities (only about 3%), these so-called “University towns” make a good case study of how a shift in the dominant demographic (both age and occupation) affects the housing market. In most cases, local residents are employed by the university itself, which may be the largest employer in the community, and even local businesses may cater primarily to the university.

Since universities are far less likely to go bankrupt during a recession (for reasons I discuss at the end), they should, in theory, provide some buffer against outside market crashes.

Since housing is a universal market present in all regions throughout the country, I picked it as a common measurement.

The goal of this test is to see whether relying heavily on these “low-risk” educational institutions as your town’s economic backbone has any statistically significant benefit. A post-discussion will go over some of the reasons behind the findings and consider their long-term effects.


City_Zhvi_AllHomes.csv: contains the median home sale prices per month of all Regions & States in the United States.

University_towns.txt: contains the list of all university towns in the United States.

gdplev.xls: contains the GDP figures of the United States per quarter.

NOTE: The data here is from 2016, but since there was no major recession between 2016 and 2020 and only a few new university towns have been added, this shouldn’t affect the result to an appreciable degree.


Quarter: A specific three month period, Q1 is January through March, Q2 is April through June, Q3 is July through September, Q4 is October through December.

Recession: A time period starting with two consecutive quarters of GDP decline and ending with two consecutive quarters of GDP growth.

Recession Bottom: The quarter within a recession which had the lowest GDP.

University Town: A city that has a high percentage of university students compared to the total population of the city.

Non-University Town: A city that has a low to normal percentage of university students compared to the total population of the city. For the following analysis, I simplified it to any city that doesn’t fall under University Town.


University towns’ mean housing prices are equally likely to be affected by a recession as those of any other city.

Null Hypothesis


University towns’ mean housing prices are less likely to be affected by a recession than those of any other city.

Alternate Hypothesis


I began by importing all the relevant libraries.

First I cleaned the data present in the university_towns.txt file. The goal was to return a dataframe with the State and Region Names as Columns.
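As a rough sketch of this cleaning step, the snippet below turns raw lines into (State, RegionName) pairs. The file layout is an assumption on my part and is not stated above: I assume a state header line ends with "[edit]" and each town line may carry a parenthesised university name and footnote markers. The function name `parse_university_towns` and the sample lines are hypothetical.

```python
def parse_university_towns(lines):
    """Return (State, RegionName) pairs from raw lines of the towns file.

    Assumed layout (not confirmed by the post): a state header line ends
    with "[edit]"; every other non-empty line names a town, optionally
    followed by " (University name)" and footnote markers.
    """
    rows = []
    state = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        if line.endswith("[edit]"):
            # Header line: remember the current state.
            state = line[: -len("[edit]")].strip()
        else:
            # Town line: keep only the text before the first " (".
            region = line.split(" (")[0].strip()
            rows.append((state, region))
    return rows


# Hypothetical sample lines, written to match the assumed layout.
sample = [
    "Alabama[edit]",
    "Auburn (Auburn University)[1]",
    "Florence (University of North Alabama)",
    "Alaska[edit]",
    "Fairbanks (University of Alaska Fairbanks)[2]",
]
rows = parse_university_towns(sample)
```

The pairs can then be handed to `pandas.DataFrame(rows, columns=["State", "RegionName"])` to get the dataframe described above.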

Next, I loaded up the gdplev.xls file and created a function get_recession_start() that split the GDP per quarter figures into three columns, GDP-1 representing GDP of the previous quarter while GDP and GDP+1 representing figures for current and future quarters respectively. I then simply compared the values in each column looking for two consecutive quarterly declines.
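The three-column comparison can be sketched with pandas' `shift()`. This is a minimal illustration rather than the original function: the GDP figures and quarter labels below are made up, and I assume the input is a Series indexed by quarter label.

```python
import pandas as pd


def get_recession_start(gdp):
    """Return the first quarter of the first two-quarter GDP decline.

    `gdp` is assumed to be a pandas Series of GDP values indexed by
    quarter label, in chronological order.
    """
    frame = pd.DataFrame({
        "GDP-1": gdp.shift(1),   # previous quarter
        "GDP": gdp,              # current quarter
        "GDP+1": gdp.shift(-1),  # next quarter
    })
    # Current quarter is below the previous one, and the next is lower still.
    # Comparisons against the NaN values at the edges evaluate to False.
    declining = (frame["GDP"] < frame["GDP-1"]) & (frame["GDP+1"] < frame["GDP"])
    hits = frame.index[declining.to_numpy()]
    return hits[0] if len(hits) else None


# Invented figures for illustration only; these are not real GDP data.
toy_gdp = pd.Series(
    [100.0, 101.0, 100.5, 99.0, 98.0, 98.5, 99.5, 100.0],
    index=["2008q1", "2008q2", "2008q3", "2008q4",
           "2009q1", "2009q2", "2009q3", "2009q4"],
)
start = get_recession_start(toy_gdp)
```

In the toy series, the third quarter is the first one that falls below its predecessor and is followed by another fall, so it is reported as the start.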

I used the same method described above in function get_recession_end(), this time, looking for two consecutive quarters of growth.

Having identified the start and end points of the recession, I sorted the GDP figures within this timeframe in ascending order to find the quarter with the lowest GDP in get_recession_bottom().
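The end and bottom steps might look roughly like this. It is a sketch under assumptions: the recession start is passed in as a label, the end is taken to be the second of two consecutive growth quarters (my reading of the definition above), and the toy figures are invented.

```python
import pandas as pd


def get_recession_end(gdp, start):
    """Return the second of the first two consecutive growth quarters after `start`."""
    after = gdp.loc[start:]
    grew = after > after.shift(1)  # growth versus the previous quarter
    two_in_a_row = grew & grew.shift(1, fill_value=False)
    hits = after.index[two_in_a_row.to_numpy()]
    return hits[0] if len(hits) else None


def get_recession_bottom(gdp, start, end):
    """Return the quarter with the lowest GDP between `start` and `end` (inclusive)."""
    window = gdp.loc[start:end]
    # Sort ascending and take the first label, as described in the text.
    return window.sort_values().index[0]


# Invented figures for illustration only; these are not real GDP data.
toy_gdp = pd.Series(
    [100.0, 101.0, 100.5, 99.0, 98.0, 98.5, 99.5, 100.0],
    index=["2008q1", "2008q2", "2008q3", "2008q4",
           "2009q1", "2009q2", "2009q3", "2009q4"],
)
end = get_recession_end(toy_gdp, "2008q3")
bottom = get_recession_bottom(toy_gdp, "2008q3", end)
```

`window.idxmin()` would give the same label as the sort; the sort is kept here only to mirror the description.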

Since the housing data in City_Zhvi_AllHomes.csv was recorded on a monthly basis, it’s structurally incompatible with the quarterly data in the gdplev.xls file. To fix that, I relied on the naming convention used for the month columns:

  • 2001-01 –> January 2001
  • 2001-02 –> February 2001

I ran each column name through convert_to_qtr(), which uses simple logic: months 1 through 3 map to Q1, 4 through 6 to Q2, 7 through 9 to Q3, and 10 through 12 to Q4.

Using this, I was able to regroup the monthly columns into a quarterly format.
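A minimal sketch of that mapping is shown below. The output label format ("2001q1") is my assumption; the text only fixes the input pattern ("2001-01").

```python
def convert_to_qtr(column_name):
    """Map a monthly column label such as "2001-02" to a quarter label.

    The "YYYYqN" output format is an assumption made for this sketch.
    """
    year, month = column_name.split("-")
    # Months 1-3 -> 1, 4-6 -> 2, 7-9 -> 3, 10-12 -> 4.
    quarter = (int(month) - 1) // 3 + 1
    return f"{year}q{quarter}"


labels = [convert_to_qtr(m) for m in ["2001-01", "2001-02", "2001-03", "2001-04", "2001-12"]]
```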

Next, in convert_housing_data_to_quarters(), I compiled the resulting dataframe, pivoted around State and RegionName, applying NumPy’s mean() across the monthly columns that make up each quarter.
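One way to sketch this aggregation is to transpose the frame, group the month labels by quarter, and average. This is not necessarily how the original function was written: the toy prices are invented, pandas' built-in `mean()` stands in for the NumPy call, and the helper `to_quarter` is a hypothetical stand-in for convert_to_qtr().

```python
import pandas as pd


def to_quarter(column_name):
    """Hypothetical helper: "2001-02" -> "2001q1" (label format assumed)."""
    year, month = column_name.split("-")
    return f"{year}q{(int(month) - 1) // 3 + 1}"


def convert_housing_data_to_quarters(housing):
    """Average monthly price columns into quarterly columns.

    Assumes every column other than State and RegionName is a monthly
    label of the form "YYYY-MM".
    """
    indexed = housing.set_index(["State", "RegionName"])
    # Transpose so months become rows, group rows by quarter label,
    # average within each quarter, then transpose back.
    return indexed.T.groupby(to_quarter).mean().T


# Invented prices for illustration only.
toy = pd.DataFrame({
    "State": ["Alabama", "Alaska"],
    "RegionName": ["Auburn", "Fairbanks"],
    "2001-01": [100.0, 200.0],
    "2001-02": [110.0, 210.0],
    "2001-03": [120.0, 220.0],
    "2001-04": [130.0, 230.0],
})
quarterly = convert_housing_data_to_quarters(toy)
```

The result is indexed by (State, RegionName) with one column per quarter.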

Now that all the data was cleaned and arranged for examination, I set up the scenario proposed in my hypothesis, namely the mean price trend of houses between the start and the bottom of a recession.

I then separated university towns from non-university towns, calling the resulting dataframes uni_towns and non_uni_towns respectively.
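The split can be sketched as an index membership check. Note the assumptions: the text does not say exactly how the price trend was measured, so the ratio below (price at recession start divided by price at recession bottom) is only one plausible choice, and all prices, quarter labels, and town assignments are invented.

```python
import pandas as pd

# Invented quarterly prices, indexed by (State, RegionName).
prices = pd.DataFrame(
    {"2008q3": [200.0, 150.0, 300.0], "2009q1": [190.0, 120.0, 270.0]},
    index=pd.MultiIndex.from_tuples(
        [("Alabama", "Auburn"), ("Alabama", "Mobile"), ("Alaska", "Fairbanks")],
        names=["State", "RegionName"],
    ),
)
# Invented list of university towns.
towns = pd.DataFrame({"State": ["Alabama", "Alaska"],
                      "RegionName": ["Auburn", "Fairbanks"]})

start, bottom = "2008q3", "2009q1"
# Assumed measure: a ratio above 1 means prices fell between start and bottom.
prices["ratio"] = prices[start] / prices[bottom]

# Flag rows whose (State, RegionName) pair appears in the towns list.
town_pairs = list(zip(towns["State"], towns["RegionName"]))
is_uni = prices.index.isin(town_pairs)

uni_towns = prices[is_uni]
non_uni_towns = prices[~is_uni]
```

Matching on the (State, RegionName) pair rather than the town name alone avoids mixing up towns that share a name across states.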

At last, I ran a t-test on the two dataframes.
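A minimal sketch of such a test with `scipy.stats.ttest_ind` follows. The function name `run_ttest`, the 0.05 threshold, and the toy ratios are my assumptions, and the numbers are invented, so this does not reproduce the figures reported in this post. The return shape simply mirrors a (different, p-value, better group) tuple.

```python
from statistics import mean

from scipy import stats


def run_ttest(uni_ratios, non_uni_ratios, alpha=0.05):
    """Return (different, p_value, better) for two samples of price ratios.

    `alpha` is an assumed significance threshold. "Better" is taken to be
    the group with the lower mean start-to-bottom ratio, i.e. the smaller
    price drop; that reading is also an assumption.
    """
    _, p_value = stats.ttest_ind(uni_ratios, non_uni_ratios)
    different = bool(p_value < alpha)
    if mean(uni_ratios) < mean(non_uni_ratios):
        better = "university town"
    else:
        better = "non-university town"
    return different, float(p_value), better


# Invented ratios for illustration only.
toy_uni = [1.01, 1.03, 1.02, 1.04, 1.00]
toy_non_uni = [1.10, 1.12, 1.08, 1.15, 1.09]
different, p_value, better = run_ttest(toy_uni, toy_non_uni)
```

`ttest_ind` assumes equal variances by default; passing `equal_var=False` switches to Welch's t-test if that assumption looks doubtful.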

The result was:

(True, 0.0043252148534599624, 'university town')

With a p-value of about 0.004, well below the conventional 0.05 threshold, I can reject the Null Hypothesis in favor of the Alternate Hypothesis 😊

University towns’ mean housing prices are less likely to be affected by a recession than those of any other city.

Alternate Hypothesis

As always, the full code can be found here.


As promised, this post-discussion section covers some of the reasons behind the findings.

Universities, at least in the US, are quite similar to companies in the service sector: both provide a service in exchange for a fee. In this case, a university may provide a student access to the following services:

  • Classroom lectures.
  • Open campuses.
  • Access to libraries.
  • Guidance from professors.
  • Career counseling.

So why, then, are they at lower risk of going bankrupt in a recession than their service-sector analogues? To answer that, we need look no further than their target consumer base: students. Young, bright, passionate, and desperate, these consumers don’t make decisions based on their current financial condition but on the prospect of their future one.

Even if my current financial condition isn’t great, if I can get a degree, I may land a well-paying job!

-Every Student

This belief in a brighter future is the driving force behind the 1.6 trillion dollars of student-loan debt.

Truly, other markets can only dream of having such a lucrative consumer base that would throw money at you, even in the midst of a recession.

Do not be fooled, however. Relying solely on a single institution (or a small group of them) as the backbone of your entire town’s economy can be its own undoing. If, for whatever reason, the university has to shut down, the entire population will suffer a severe economic downturn. Diversifying your economy to have multiple fail-safes is almost always a better idea and will prove far more stable in the long term.

About Me!

An aspiring data scientist with a great interest in machine learning and its applications. I post my work here in the hope of improving over time.