How Can We Analyze Time Series Data in the Era of COVID-19?

Caleb Elgut
3 min read · Sep 25, 2020

I recently completed my 4th module project at The Flatiron School. This time, I analyzed time-series data from Zillow, which showed the median real estate values of zip codes around the United States from April 1996 to April 2018. It was an incredibly enlightening experience; however, I can’t help but wonder if time-series predictions will be able to handle global disasters as unpredictable as COVID-19.

The goal was to discover the best zip codes to invest in for the next few years. After narrowing the data down to rural zip codes within a reasonable range of risk and value, I analyzed the five zip codes that gave the best ROI based on the data I received.

These zip codes were 48894 (Westphalia, MI), 56360 (Osakis, MN), 40008 (Bloomfield, KY), 49339 (Pierson, MI), and 27019 (Germanton, NC).

The Initial Time Series Analysis

I used several tools to remove the trend and seasonality from this data. Time series models do not predict well unless the data is stationary, which is to say, free of trend and seasonality, with a mean and variance that stay roughly constant over time.
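To make the idea of removing trend and seasonality concrete, here is a minimal sketch of differencing, one common way to do it. This is not the code from my project: the series is made up, and the `difference` helper is a hypothetical name used only for illustration. In practice you would also run a stationarity check (such as an augmented Dickey-Fuller test) rather than eyeball the result.

```python
def difference(series, lag=1):
    """Return series[t] - series[t - lag] for every t where both values exist."""
    return [series[t] - series[t - lag] for t in range(lag, len(series))]

# Hypothetical monthly values: a steady upward trend plus a repeating 12-month pattern.
seasonal_pattern = [0, 2, 5, 7, 8, 6, 3, 1, -2, -5, -7, -4]
values = [100 + 3 * t + seasonal_pattern[t % 12] for t in range(48)]

detrended = difference(values, lag=1)       # first difference removes the linear trend
stationary = difference(detrended, lag=12)  # seasonal difference removes the 12-month cycle

print(stationary[:5])  # [0, 0, 0, 0, 0]
```

Because this toy series is nothing but trend plus seasonality, differencing leaves all zeros. Real housing data would leave behind noise, and that leftover is what the model actually learns from.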

Finally, I used time series models (particularly SARIMA, which stands for Seasonal AutoRegressive Integrated Moving Average) to analyze the change in real estate values over time and forecast the ROI over the next ten years. For those wondering, according to my assessment (which I may update after my code review!), Bloomfield, Kentucky, is set to return an ROI of over 1,000% over the next ten years! Osakis, Minnesota, comes in second with 575%. This experience showed me the power of time series predictions!
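For anyone unsure how to read those percentages, here is a small sketch of the ROI arithmetic. The dollar amounts are invented for illustration and are not values from the Zillow data or from my model's output, and `roi_percent` is a hypothetical helper name.

```python
def roi_percent(current_value, forecast_value):
    """Percentage gain from buying at current_value and holding to forecast_value."""
    return (forecast_value - current_value) / current_value * 100

# Hypothetical numbers: a home worth $100,000 today.
current = 100_000

print(roi_percent(current, 1_100_000))  # 1000.0
print(roi_percent(current, 675_000))    # 575.0
```

In other words, an ROI of 1,000% means the forecast value is eleven times the starting value, which is a good reason to treat a number that large with some suspicion.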

However, there was only one issue: this data stopped in 2018 — years before COVID-19 would hit.

I am almost sure that none of my predictions will come true, as time series models all over the world have had to reset entirely in the face of the pandemic. These models are incredibly valuable to anyone in the business of buying and selling, and to finance in general. They help determine how supply chains will work and how to plan new technology for the future. Think about Apple, for example. Why do you think their iPhone releases follow such a predictable schedule? They look at the data that shows them purchasing habits over time and determine when folks will start looking for a new phone, which tends to be around every two years (and, yes, it helps that their software tends to force their customers to upgrade).

All those models are rendered useless now! Everyone has to start over. While I understand that a pandemic like the one we have experienced this year is never predictable, I wonder if there is a way to build disasters of various grades into time series analysis. Perhaps it would allow companies to create multiple backup plans and prepare better for the next time something like this hits. COVID-19 may be a once-in-a-century pandemic, but we have another major event that may hit us in the next few years: a catastrophe due to climate change. One could argue we are already experiencing this, given the fires that have broken out across the West Coast of the United States. I would love to delve deeper into time series data in the future to see what, if any, measures are being taken to implement disaster prediction into time series models.
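One simple way to picture "multiple backup plans" is scenario analysis: take a baseline forecast and stress it with hypothetical shocks of different sizes. The sketch below is only my guess at what that could look like, not an established disaster-prediction method. The `apply_shock` helper, the drop percentages, and the recovery lengths are all assumptions made up for illustration.

```python
def apply_shock(baseline, start, drop_pct, recovery_steps):
    """Scale a baseline forecast down by drop_pct at index `start`,
    then let the shock fade linearly over `recovery_steps` periods."""
    shocked = []
    for t, value in enumerate(baseline):
        if t < start:
            factor = 1.0
        else:
            elapsed = t - start
            remaining = max(0.0, 1 - elapsed / recovery_steps)
            factor = 1 - (drop_pct / 100) * remaining
        shocked.append(value * factor)
    return shocked

# Hypothetical baseline forecast: values growing by 1,000 per period.
baseline = [100_000 + 1_000 * t for t in range(36)]

# Hypothetical disaster grades: (percent drop, periods needed to recover).
scenarios = {"mild": (10, 6), "moderate": (20, 12), "severe": (30, 24)}

for name, (drop, recovery) in scenarios.items():
    path = apply_shock(baseline, start=12, drop_pct=drop, recovery_steps=recovery)
    print(name, round(min(path[12:])))
```

This does not predict when a disaster will happen. It only shows how far off the baseline a plan might need to stretch for each grade of shock.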
