Karl Ho
School of Economic, Political and Policy Sciences
University of Texas at Dallas
https://en.wikipedia.org/wiki/Nathan%27s_Hot_Dog_Eating_Contest
library(RColorBrewer)
hotdogs <-
read.csv("http://datasets.flowingdata.com/hot-dog-contest-winners.csv",
sep=",", header=TRUE)
barplot(hotdogs$Dogs.eaten, names.arg=hotdogs$Year, col="red",
border=NA, xlab="Year", ylab="Hot dogs and buns (HDB) eaten")
# Grey bars for U.S. winners, green bars for winners from other countries
fill_colors <- ifelse(hotdogs$Country == "United States", "grey", "#2ca25f")
barplot(hotdogs$Dogs.eaten, names.arg=hotdogs$Year, col=fill_colors,
border=NA, xlab="Year", ylab="Hot dogs and buns (HDB) eaten")
# Stacked bar chart data
hot_dog_places <-
  read.csv("http://datasets.flowingdata.com/hot-dog-places.csv", sep=",", header=TRUE)
# Label the columns with plain years
names(hot_dog_places) <- c("2000", "2001", "2002", "2003", "2004",
                           "2005", "2006", "2007", "2008", "2009", "2010")
hot_dog_matrix <- as.matrix(hot_dog_places)
barplot(hot_dog_matrix, border=NA, space=0.25, ylim=c(0, 200),
        xlab="Year", ylab="Hot dogs and buns (HDBs) eaten",
        main="Hot Dog Eating Contest Results, 2000-2010")
# Can this be plotted as a stacked bar chart in ggplot2?
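One possible answer, offered as a rough sketch rather than the author's solution: ggplot2 can draw stacked bars once the matrix is reshaped to long form, with one row per bar segment. The toy_matrix below is a made-up stand-in for hot_dog_matrix (its numbers are illustrative, not contest results), and the names toy_long, place, year, and hdb are assumptions introduced here.

```r
library(ggplot2)

# Hypothetical stand-in for hot_dog_matrix: rows are finishing places,
# columns are years. The values are invented for illustration only.
toy_matrix <- matrix(c(10, 8, 6,
                       12, 9, 7,
                       15, 11, 8),
                     nrow = 3,
                     dimnames = list(place = c("1st", "2nd", "3rd"),
                                     year = c("2000", "2001", "2002")))

# ggplot2 expects one row per bar segment, so reshape wide -> long.
toy_long <- as.data.frame(as.table(toy_matrix), responseName = "hdb")

# geom_col() stacks segments that share the same x value by default.
stacked <- ggplot(toy_long, aes(x = year, y = hdb, fill = place)) +
  geom_col() +
  labs(x = "Year", y = "Hot dogs and buns (HDBs) eaten")
stacked
```

To try this on the real data, give hot_dog_matrix named dimnames (place and year) first so that the reshaped columns carry the names used in aes().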
library(quantmod)
library(ggplot2)
library(magrittr)
library(broom)
# Get Apple stock prices
getSymbols("AAPL", src="yahoo")
chartSeries(get("AAPL"), subset='last 4 months')
## Plotting multiple series using ggplot2
# Setting time period
start = as.Date("2020-10-01")
end = as.Date("2020-11-10")
# Download stock prices from Yahoo Finance
# (Meta Platforms, formerly Facebook, has traded as META rather than FB since June 2022)
getSymbols(c("AAPL", "META", "TSM", "PFE"), src = "yahoo", from = start, to = end)
# Prepare data as xts (time series object), keeping the adjusted prices
stocks = as.xts(data.frame(AAPL = AAPL[, "AAPL.Adjusted"],
                           META = META[, "META.Adjusted"],
                           TSM = TSM[, "TSM.Adjusted"],
                           PFE = PFE[, "PFE.Adjusted"]))
# Label the series
names(stocks) = c("Apple", "Facebook", "Taiwan Semiconductor Manu.", "Pfizer")
# Index by date
index(stocks) = as.Date(index(stocks))
# Plot (linewidth needs ggplot2 >= 3.4.0; use size = 1 on older versions)
stocks_series = tidy(stocks) %>%
  ggplot(aes(x = index, y = value, color = series)) +
  geom_line(linewidth = 1) +
  theme_bw()
stocks_series
# Same plot with a title, axis labels, custom colors, and a custom font
stocks_series = tidy(stocks) %>%
  ggplot(aes(x = index, y = value, color = series)) +
  geom_line(linewidth = 1) +
  theme_bw() +
  labs(title = "Daily Stock Prices, 10/1/2020 - 11/10/2020",
       subtitle = "End of Day Adjusted Prices",
       caption = "Source: Yahoo Finance") +
  xlab("Date") + ylab("Price") +
  scale_color_manual(values = c("steelblue", "red", "brown", "purple")) +
  theme(text = element_text(family = "Apple Garamond"))  # font must be installed locally
stocks_series
GitHub: forecast
Author: Rob Hyndman
Current version:
Purpose: Foremost package for automatic time series forecasting.
auto.arima(): Automatically selects optimal ARIMA model.
ets(): Exponential smoothing state space model.
tbats(): Exponential smoothing state space model with Box-Cox transformation.
GitHub: fable
Author: Mitchell O'Hara-Wild
Current version: 0.3.3 (CRAN)
Purpose: Provides commonly used univariate and multivariate time series forecasting models, including exponential smoothing via state space models and automatic ARIMA modelling.
Current version: 0.10-54 (CRAN)
Purpose: toolkit for time series modeling and hypothesis testing.
Key Features:
Functions for stationarity tests like adf.test().
Time series regression capabilities.
ARCH and GARCH modeling functions.
Sample code:
library(tseries)  # provides adf.test()
adf.test(diff(AirPassengers))
Purpose: Managing ordered observations, especially essential for time series data.
Key Features:
Handles irregular time series with zoo.
xts extends zoo to add more powerful time series capabilities.
Time-based subsetting and alignment.
Sample code:
library(xts)  # also attaches zoo
z <- zoo(1:10, Sys.Date() + 1:10)  # ten daily observations starting tomorrow
x <- as.xts(z)
Current version: TSA 1.3.1 (CRAN)
Purpose: Primarily used for educational purposes. Offers utilities for time series analysis.
Key Features:
Time series decomposition.
Various hypothesis testing methods.
Simulation capabilities.
Sample code:
decomposed <- decompose(AirPassengers)
plot(decomposed)
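To make the decomposition less of a black box, the following sketch checks that the components returned by decompose() add back up to the original series. It assumes the default additive model; the names rebuilt and ok are introduced here for illustration. Note that decompose() itself comes from the stats package, which R attaches by default.

```r
# decompose() defaults to an additive model:
# observed = trend + seasonal + random
decomposed <- decompose(AirPassengers)

rebuilt <- decomposed$trend + decomposed$seasonal + decomposed$random

# The centered moving-average trend is undefined at both ends of the
# series, so compare only the months where every component exists.
ok <- !is.na(rebuilt)
all.equal(as.numeric(rebuilt[ok]), as.numeric(AirPassengers[ok]))

# Number of months lost to the moving-average window
sum(!ok)
```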
In S3, generic functions such as print() or plot() operate differently based on the class of the object passed to them. For example, when print() is called on an lm object (a linear model), R dispatches to the print.lm function because lm is the class of the object. The process involves:
Detecting the class of the object, say "classname".
Searching for a method named generic.classname, where "generic" is the generic function's name.
Executing the method found, or falling back to generic.default if no class-specific method exists.
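The dispatch steps above can be sketched with a small user-defined generic. The generic describe() and its methods are hypothetical names made up for this illustration; only UseMethod(), lm(), coef(), and the cars data set come from R itself.

```r
# Hypothetical generic: describe() is not part of base R.
describe <- function(x, ...) {
  UseMethod("describe")  # dispatch on class(x)
}

# Method for objects of class "lm", found under the name describe.lm
describe.lm <- function(x, ...) {
  paste("linear model with", length(coef(x)), "coefficients")
}

# Fallback used when no class-specific method exists
describe.default <- function(x, ...) {
  paste("object of class", class(x)[1])
}

fit <- lm(dist ~ speed, data = cars)  # cars ships with R
class(fit)     # the class that drives dispatch
describe(fit)  # dispatches to describe.lm
describe(1:3)  # no method for integers, so describe.default runs
```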
Advantages:
Flexibility: Easy to extend existing generic functions with new class methods.
Simplicity: Offers a straightforward approach to object-oriented programming without the need for formal class definitions.
Limitations:
Informality: Lacks the strict class definitions seen in formal systems like S4, which can lead to less rigorous class structures.
Absence of formal inheritance: Inheritance can be mimicked with a multi-element class vector and NextMethod(), but S3 has no formally declared or validated inheritance mechanism.
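As an illustration of how informal S3 inheritance is, the sketch below gives an object two classes by hand and lets the more specific method call the more general one. The classes animal and dog, the generic speak(), and the object rex are all hypothetical names used only for this example.

```r
# Hypothetical generic and classes, for illustration only.
speak <- function(x, ...) UseMethod("speak")

speak.animal <- function(x, ...) {
  paste(x$name, "makes a sound")
}

speak.dog <- function(x, ...) {
  # NextMethod() moves on to the next class in class(x), here "animal"
  paste0(NextMethod(), ": Woof")
}

# The class attribute is just a character vector set by hand; nothing
# checks that a "dog" actually has the fields an "animal" method needs.
rex <- structure(list(name = "Rex"), class = c("dog", "animal"))

speak(rex)
inherits(rex, "animal")
```

Because the class vector is ordinary data, any code can reorder or overwrite it, which is the lack of rigor the limitation above refers to.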
The S3 class system provides an intuitive mechanism for object-oriented programming in R, allowing for flexibility in function behaviors based on object classes.
Although it lacks the formality of systems like S4, its simplicity and adaptability have made it a staple in the R programming environment.
The S4 class system in R offers a structured and rigorous approach to object-oriented programming, with clear distinctions from the S3 system. While S4 enforces strict object definitions and robust method dispatch, its complexity makes it less widely used than the more accessible and flexible S3 system. The choice between S3 and S4 therefore hinges on the specific requirements of the task at hand.
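For comparison with S3, here is a minimal S4 sketch showing a formal class definition, a validity check, and a registered method. The class Person, the generic greet(), and the objects alice and bad are hypothetical names chosen for illustration; setClass(), setGeneric(), setMethod(), and new() are from the methods package.

```r
library(methods)  # attached by default in interactive R; explicit for Rscript

# Hypothetical class with typed slots and a validity check
setClass("Person",
         slots = c(name = "character", age = "numeric"),
         validity = function(object) {
           if (any(object@age < 0)) "age must be non-negative" else TRUE
         })

# A formal generic and a method registered for the Person class
setGeneric("greet", function(obj, ...) standardGeneric("greet"))
setMethod("greet", "Person", function(obj, ...) {
  paste("Hello,", obj@name)
})

alice <- new("Person", name = "Alice", age = 30)
greet(alice)

# Invalid objects are rejected when they are constructed
bad <- tryCatch(new("Person", name = "Bob", age = -1),
                error = function(e) "rejected")
bad
```

Unlike the S3 case, the slot types and the validity rule are checked by new(), so a malformed Person cannot be created by simply setting an attribute.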
Chambers, John M. 1998. Programming with Data. Springer.
Chambers, John M. 2008. Software for Data Analysis: Programming with R. Springer.
Wickham, Hadley. 2019. Advanced R. CRC Press.