Everything is Seasonal

Zan Armstrong - @zanstrong

Everything* is Seasonal

*related to change over time

If your data includes change over time, take seasonality into account

#1 assume seasonality

Maybe?

 

 

Or maybe it's just November

Number of commuters ➔ traffic

So... if everybody takes a 1-week summer vacation during the 10 weeks of summer...

...that's ~10% fewer commuters per week
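
The arithmetic behind that estimate, as a rough sketch. It assumes vacations are spread evenly across the 10 weeks; the function name and the 1,000-commuter figure are made up for illustration.

```python
def weekly_commuter_drop(vacation_weeks: float, season_weeks: float) -> float:
    """Fraction of commuters away in a typical week of the season,
    assuming vacations are spread evenly across the season."""
    return vacation_weeks / season_weeks


drop = weekly_commuter_drop(vacation_weeks=1, season_weeks=10)
usual_commuters = 1000  # hypothetical figure, not from the talk

print(f"{drop:.0%} fewer commuters per week")
print(f"{usual_commuters * (1 - drop):.0f} of {usual_commuters} still commuting")
```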

Sunrise in Boston in early November: ~7:20am

Sunrise in Boston in summer: ~5:30am

Number of commuters per hour ➔ traffic

Length of rush hour

Time of Sunrise

 

assume that your metric has seasonality

#1

consider the seasonality of causal factors

What is Seasonality?

patterns that repeat over known, fixed periods of time

- Wikipedia

what is seasonality

Time is a Dimension: Fong Qi Wei

San Francisco Weather

CO2 concentration

Births

384 babies were born at 9:31am on Saturdays during 2014

Saturdays

Mondays

8am

12:45pm

5:30pm

Types of Seasonality

Minute/Hour of Day

Day of Week

Week of the Year

What about Monthly?

aggregate meaningfully

photo by Dominic Alves

 

$4000 per day in revenue

Daily Revenue from Restaurant

$4000 per day

Really boring line chart!

1.6% year over year growth

$4000 per day in revenue

Daily Revenue from Restaurant

$4001.20 on Fri Jan 11, 2013

$4064.10 on Fri Jan 10, 2014

Daily Revenue

1.6% y/y growth

Daily Year over Year Growth in Revenue
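
A small check of where the 1.6% comes from, using only the two data points shown on the slides. The variable names are illustrative.

```python
from datetime import date

# The two daily revenue figures quoted on the slides.
revenue = {
    date(2013, 1, 11): 4001.20,  # Fri Jan 11, 2013
    date(2014, 1, 10): 4064.10,  # Fri Jan 10, 2014
}

previous = date(2013, 1, 11)
current = date(2014, 1, 10)

gap_days = (current - previous).days          # distance between the two dates
growth = revenue[current] / revenue[previous] - 1

print("days apart:", gap_days)
print("same weekday:", previous.weekday() == current.weekday())
print(f"y/y growth: {growth:.1%}")
```

The two dates are exactly 52 weeks apart, so a Friday is being compared to a Friday.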

1.6% year over year growth

$4000 per day in revenue

week of year seasonality

Daily Revenue

$3619 on Sat May 11th

$4826 on Sun May 12th

Daily Revenue

still 1.6% y/y growth: comparing summer to summer 

Daily Year over Year Growth in Revenue

1.6% year over year growth

$4000 per day in revenue

week of year seasonality

day of week seasonality

Daily Revenue

Just one month!

Closed Mondays: no revenue

Big weekends! Dinner & Brunch

Growth (compared to previous year) -- Still Boring.

aggregate by week

Weekly Revenue

Weekly Growth (compared to previous year)

5% bump in March/April

aggregate by month

Monthly Revenue

Monthly Growth (compared to previous year)

What's going on???

April 2014

Mon   Tues  Wed   Thurs Fri   Sat   Sun
      1     2     3     4     5     6
7     8     9     10    11    12    13
14    15    16    17    18    19    20
21    22    23    24    25    26    27
28    29    30

Annotations: closed (Mon), weekend, extra days! (Tues 29, Wed 30)

April 2015

Mon   Tues  Wed   Thurs Fri   Sat   Sun
            1     2     3     4     5
6     7     8     9     10    11    12
13    14    15    16    17    18    19
20    21    22    23    24    25    26
27    28    29    30

Annotations: closed (Mon), weekend, extra days! (Wed 29, Thurs 30)

April 2016

Mon   Tues  Wed   Thurs Fri   Sat   Sun
                        1     2     3
4     5     6     7     8     9     10
11    12    13    14    15    16    17
18    19    20    21    22    23    24
25    26    27    28    29    30

Annotations: closed (Mon), weekend, extra days! (Fri 29, Sat 30)
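
The three calendars can be checked in a few lines. This is a sketch using only the Python standard library; `DAY_NAMES`, `weekday_counts`, and `extra_days` are names made up for this example.

```python
import calendar
from collections import Counter
from datetime import date

DAY_NAMES = ["Mon", "Tues", "Wed", "Thurs", "Fri", "Sat", "Sun"]


def weekday_counts(year: int, month: int) -> Counter:
    """How many times each weekday occurs in the given month."""
    days_in_month = calendar.monthrange(year, month)[1]
    return Counter(
        DAY_NAMES[date(year, month, day).weekday()]
        for day in range(1, days_in_month + 1)
    )


def extra_days(year: int, month: int) -> list:
    """Weekdays that occur five times instead of four."""
    counts = weekday_counts(year, month)
    return [name for name in DAY_NAMES if counts[name] == 5]


for year in (2014, 2015, 2016):
    print(f"April {year}: five of each of {extra_days(year, 4)}")
```

Every April has 30 days, but which two weekdays get a fifth occurrence shifts from year to year. When revenue depends heavily on the day of week, that shift alone moves the monthly total.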

Weekly Revenue

Weekly Growth (compared to previous year)

Monthly Revenue

Monthly Growth (compared to previous year)

Really simple time series

 

1.6% y/y growth

consistent day of week pattern

consistent week of year pattern

No random variation.

No holidays.

No variation due to weather.

No decreases due to a bad review.

No short or long-term variation due to marketing campaigns, good press, or a new chef.

No change in trend, day of week seasonality, or week of year seasonality.

Monthly aggregation is a bad idea unless...

...the data is inherently monthly.

#2

Aggregate to time periods that make sense for your data
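
One way to aggregate by week rather than by month, as a minimal sketch. It assumes ISO weeks (Monday to Sunday) are a sensible period for the data; `weekly_totals` and the flat $4000-per-open-day series are hypothetical stand-ins, not the talk's dataset.

```python
from collections import defaultdict
from datetime import date, timedelta
from typing import Dict, Tuple


def weekly_totals(daily: Dict[date, float]) -> Dict[Tuple[int, int], float]:
    """Sum daily values into ISO weeks, keyed by (ISO year, ISO week)."""
    totals = defaultdict(float)
    for day, value in daily.items():
        iso_year, iso_week, _ = day.isocalendar()
        totals[(iso_year, iso_week)] += value
    return dict(totals)


# Hypothetical data: closed Mondays, $4000 on every other day of April 2014.
start = date(2014, 4, 1)
daily = {}
for offset in range(30):
    day = start + timedelta(days=offset)
    daily[day] = 0.0 if day.weekday() == 0 else 4000.0

for week, total in sorted(weekly_totals(daily).items()):
    print(week, total)
```

Every full week contains the same mix of weekdays, so full weeks are directly comparable. The partial weeks at the edges of the date range are not and should be dropped or completed before comparing.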

Daily data can be hard to interpret

compare apples to apples

year over year growth (-364 days) can help

But not if it's a CALENDAR (-365 or -366 day) year!

Compares to a calendar year, instead of 364 days back

Just change it!

Compare apples to apples

If calculating daily or weekly year/year growth, compare 364 days back.
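
A short sketch of the difference between the two comparisons. Both function names are invented for this example.

```python
from datetime import date, timedelta


def same_weekday_last_year(day: date) -> date:
    """Comparison date 364 days (exactly 52 weeks) earlier."""
    return day - timedelta(days=364)


def same_calendar_date_last_year(day: date) -> date:
    """Same month and day one calendar year earlier (Feb 29 falls back to Feb 28)."""
    try:
        return day.replace(year=day.year - 1)
    except ValueError:
        return day.replace(year=day.year - 1, day=28)


day = date(2014, 1, 10)  # a Friday
for compare in (same_weekday_last_year, same_calendar_date_last_year):
    earlier = compare(day)
    print(compare.__name__, earlier, "same weekday:", earlier.weekday() == day.weekday())
```

364 is a multiple of 7, so the weekday always matches. A calendar year back lands on a different weekday, which mixes day of week seasonality into the growth number.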

#3

sometimes seasonality is the story

seasonality is the story

1985 study on deaths due to tractor accidents 

Deaths by Location

Deaths by Hour (annotated: 11am-noon, 4pm to 5pm)

Deaths by Month (annotated: planting, harvesting)

Deaths by Age

#4

Account for seasonality when estimating impact of an event ("causal" analysis)

Time Series Disruptions

Expected: Holidays, Sales, Events

Unexpected, but common: Weather

Unexpected and uncommon: Natural disaster, Terrorism, death of CEO, mergers

Effect? Short-term? Long-term?

account for seasonality

Sept 2001 - month of 9/11 

 

 

Gun Sales Increased by 28%

Jan 2013 - Obama's 2nd inauguration

Gun Sales Dropped by 21%

Feb 2011: nothing special 

 

Gun Sales Increased by 21%

Look at the Time Series

 

If we're aware that seasonality matters, what can we do to take it into account?

 

Gregor Aisch and Josh Keller

Gregor Aisch

Accounting for Seasonality

#1. Look at year over year growth (-364 days!)

#2. Isolate seasonal component: decomposing a time series with STL

Decomposed Time Series:

Long Term Trend

Month of Year Seasonality

Disruptions
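
To make the three components concrete, here is a deliberately simplified sketch. It is NOT STL: it is a classical additive decomposition (centered moving average for the trend, per-month averages for the seasonality), written with the standard library only so it runs anywhere. STL replaces these plain averages with loess smoothing and is available in libraries such as statsmodels. The function name, the synthetic series, and the injected spike are all made up for illustration, and the period is assumed to be even (12 months).

```python
def decompose_additive(values, period=12):
    """Classical additive split: value = trend + seasonal + remainder."""
    n, half = len(values), period // 2

    # Trend: centered moving average over one full period
    # (the two end points of each window get half weight).
    trend = [None] * n
    for t in range(half, n - half):
        window = values[t - half : t + half + 1]
        trend[t] = (sum(window) - 0.5 * (window[0] + window[-1])) / period

    # Seasonality: average detrended value for each position in the cycle,
    # shifted so the seasonal effects sum to zero.
    buckets = [[] for _ in range(period)]
    for t, tr in enumerate(trend):
        if tr is not None:
            buckets[t % period].append(values[t] - tr)
    means = [sum(b) / len(b) for b in buckets]
    center = sum(means) / period
    seasonal = [means[t % period] - center for t in range(n)]

    # Remainder: whatever trend and seasonality do not explain.
    remainder = [
        None if trend[t] is None else values[t] - trend[t] - seasonal[t]
        for t in range(n)
    ]
    return trend, seasonal, remainder


# Synthetic monthly series: linear trend + fixed monthly pattern + one disruption.
pattern = [-8, -6, -2, 1, 3, 4, 2, 0, -1, 1, 2, 4]  # sums to zero
series = [100 + 0.5 * t + pattern[t % 12] for t in range(60)]
series[40] += 50  # hypothetical one-off event

trend, seasonal, remainder = decompose_additive(series)
spike = max(
    (t for t in range(len(series)) if remainder[t] is not None),
    key=lambda t: abs(remainder[t]),
)
print("largest remainder at month index", spike, "value", round(remainder[spike], 2))
```

The remainder does flag the right month, but its size comes out smaller than the 50 that was injected, because plain averaging lets part of the spike leak into the trend and seasonal estimates. That leakage is one reason to prefer a more robust method for real data.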

Long Term Trend

Question: How have gun sales changed over the last 15 years?

Month of Year Seasonality

Question: During which months are the most guns sold? And the least?

 

 

Is this changing?

Remainder, Disruptions, Irregular, One-off Events

Question: When did gun sales spike or dip unusually? And by how much?

#5

The seasonality that matters might be in a subset of your data

Challenge

Describe a scenario in which ignoring the seasonality of a subset of the data would lead to misinterpreting the aggregate data.

seasonality is a subset

Hint 1: different seasonality? different growth rate?

 

 

Hint 2: seasonality in a component defined by Principal Component Analysis

#6

People and places are different. So are their seasonalities.

our lives, and our data, are full of seasonal patterns

different seasonalities

Why People Visit the Emergency Room: Nathan Yau

Football

Nails, Screws, Tacks, or Bolts

it doesn't have to be a line chart

Flickr Flow: Martin Wattenberg & Fernanda Viégas

Weather Circles: (me)

Visualizing MBTA Data: Mike Barry & Brian Card

Ville Vivante: Interactive Things

Traffic Accidents: Nadieh Bremer 

or even a chart at all

The doors Giorgia opens.

#1

Consider the seasonality of causal factors

#2

Aggregate to time periods that make sense for your data

#3

sometimes seasonality is the story

#4

Adjust for seasonality when estimating impact of an event ("causal" analysis)

#5

The seasonality that matters might be in a subset of your data

#6

Seasonal patterns vary by place, culture, and lifestyle

More info?

See github repo zanarmstrong/everything-is-seasonal 

 

If your data includes change over time, take seasonality into account
