A ways back I put up a post that uses R to plot the scoring trends of an NHL player. Given the recent chatter on sports talk radio around Boston, I used my script to plot the data for Michael Ryder. His scoring history plot is shown below.
That’s cool! I simplified the script somewhat and added an is.null() check, because the script failed whenever a season’s data were missing. I also plot the number of assists. Check this out (for Brad Richards): http://i51.tinypic.com/nv5umv.jpg
## libraries
library(XML)
library(ggplot2)
# Set the constants
BASE <- "http://www.hockey-reference.com/players/r/richabr01/gamelog/"
SEASON <- c(2001:2011)
# Loop and grab the data
ds <- data.frame()
for (S in SEASON) {
  URL <- paste(BASE, S, "/", sep="")
  tables <- readHTMLTable(URL)$stats
  # Skip seasons for which no stats table was found
  if (!is.null(tables)) {
    tables$season <- S
    ds <- rbind(ds, tables)
  }
}
ds$Date <- as.Date(ds$Date)
ds <- na.omit(ds[order(ds$Date),c(4,2,9,10,21)])
names(ds) <- c('Date', 'Game', 'Goals', 'Assists', 'Season')
for(i in 2:ncol(ds)){ds[,i] <- as.numeric(as.character(ds[,i]))}
# Get cumulative values by season for goals and assists
ds <- split(ds, ds$Season)
ds <- lapply(ds, transform, Goals = cumsum(Goals))
ds <- lapply(ds, transform, Assists = cumsum(Assists))
ds <- do.call('rbind', ds)
# Stack data frames to make it easier on ggplot2
ds.a <- ds
ds.a$Goals <- ds.a$Assists
ds.a$Type <- 'Assists'
ds$Type <- 'Goals'
ds <- rbind(ds, ds.a)
# Plot it
# Note: opts(title=...) has been removed from ggplot2; ggtitle() replaces it
ggplot(ds, aes(Game, Goals, color=Type)) + geom_line(size=1) +
  facet_wrap(~Season, ncol=2) +
  xlab('Games') + ylab('') + scale_colour_manual(values = c('blue', 'orange')) +
  ggtitle('Brad Richards career performance \n Cumulative goals and assists by season')
ggsave('BRichards.png')
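For anyone who wants to try the cumulative-by-season and stacking steps without hitting the website, here is a minimal sketch on a toy data frame. The numbers are made up (not real NHL data), and the names `toy`, `long`, and `Total` are just placeholders for illustration. It uses base R's `ave()` in place of the split/lapply/rbind approach in the script, which should give the same per-season running totals.

```r
# Toy game log: made-up numbers, not real NHL data
toy <- data.frame(
  Season  = c(2001, 2001, 2001, 2002, 2002),
  Game    = c(1, 2, 3, 1, 2),
  Goals   = c(1, 0, 2, 0, 1),
  Assists = c(0, 1, 1, 2, 0)
)

# Running totals that restart at the beginning of each season
toy$Goals   <- ave(toy$Goals,   toy$Season, FUN = cumsum)
toy$Assists <- ave(toy$Assists, toy$Season, FUN = cumsum)

# Stack into long format: one row per game per stat type,
# so ggplot2 can map Type to colour
long <- rbind(
  data.frame(toy[c("Season", "Game")], Type = "Goals",   Total = toy$Goals),
  data.frame(toy[c("Season", "Game")], Type = "Assists", Total = toy$Assists)
)
print(long)
```

The long data frame can then be passed to ggplot with `aes(Game, Total, color=Type)` and faceted by `Season`, the same way as in the script.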
I just saw this. Well done!