Fangraphs Seasonal Pitching Data, 2000 – 2019

This data set contains seasonal pitching data from the MLB from 2000 to 2019.

Download a Zip Archive of the Data and Script (includes CSV and RDS)

Click here for an explanation of the variables

Wrangling Operation

The operation requires the following packages, particularly Bill Petti's excellent baseballr package for wrangling MLB data:

# Download the Baseball R Package:
# devtools::install_github(repo = "BillPetti/baseballr")

library(baseballr)
library(rvest)
library(plyr)

First, I create a table of player identifiers from the Chadwick Baseball Bureau using get_chadwick_lu() in baseballr. These identifiers will help users merge this data table to other baseball data.

# Player Identifiers
dat.playerid <- get_chadwick_lu()
dat.playerid <- dat.playerid[c(1:7,13:15,19,25:28)]
saveRDS(dat.playerid, "Player Identifiers.RDS")
identifiers <- readRDS("Player Identifiers.RDS")

Next, I download seasonal MLB pitching performance data from Fangraphs, one season at a time, using fg_pitch_leaders():

# Scraping Pitching Data
for (i in 2000:2019){
  temp <- fg_pitch_leaders(i, i, league = "all", qual = "n", ind = 1)
  assign(paste0("fg_pitch_", i), temp)
}
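
The loop above creates one object per season with assign(). As a minimal sketch of an alternative, each season can be collected in a named list instead. Here fetch_season() is a hypothetical stand-in for fg_pitch_leaders() so that the example runs without a network connection, and its columns are invented for illustration only.

```r
# Hypothetical stand-in for fg_pitch_leaders(); returns a tiny fake table
# so the sketch runs offline. The columns here are illustrative only.
fetch_season <- function(year) {
  data.frame(
    playerid = c("1", "2"),
    Season   = as.character(year),
    Name     = c("Pitcher A", "Pitcher B"),
    stringsAsFactors = FALSE
  )
}

seasons <- 2000:2019

# One list element per season, named by year, instead of 20 loose objects
pitch_list <- lapply(seasons, fetch_season)
names(pitch_list) <- paste0("fg_pitch_", seasons)

length(pitch_list)                    # number of seasons collected
head(pitch_list[["fg_pitch_2005"]])   # inspect a single season
```

Keeping the seasons in one list means there is nothing to clean up with rm() afterwards, and a single season can still be pulled out by name.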

Because these tables all share the same structure and differ only in the season they cover, I can stack them together using rbind():

dat.pit <- fg_pitch_2000
# Start at 2001: the 2000 table already seeds dat.pit,
# so looping from 2000 would stack that season twice
for (i in 2001:2019){
  temp <- get(paste0("fg_pitch_", i))
  dat.pit <- rbind(dat.pit, temp)
}
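
The stacking step can also be written without a loop. The following is a sketch on toy data, not the script used to build the download: make_season() is a hypothetical helper that fabricates a two-row table per season in place of the real fg_pitch_<year> objects.

```r
# Toy per-season tables standing in for the fg_pitch_<year> objects
make_season <- function(year) {
  data.frame(playerid = c("1", "2"), Season = year, stringsAsFactors = FALSE)
}
season_tables <- lapply(2000:2019, make_season)

# Stack every season in one call; each table is included exactly once
dat.stacked <- do.call(rbind, season_tables)

nrow(dat.stacked)            # 2 rows per season x 20 seasons
table(dat.stacked$Season)    # row count contributed by each season
```

With the real objects in the workspace, the list could presumably be gathered with mget(paste0("fg_pitch_", 2000:2019)) before the do.call() step. Checking the per-season row counts with table() is a quick way to confirm that no season was stacked twice.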

Next, I rename the identifier in the Fangraphs table so that it matches the name used in the Chadwick Bureau identifier data. With a shared key, the pitching data can more readily be merged with the identifier table and, through it, with other sources.

names(dat.pit)[1] <- "key_fangraphs"

# Clean Up Data Types and Sort
dat.pit$key_fangraphs <- as.numeric(dat.pit$key_fangraphs)
dat.pit$Season <- as.numeric(dat.pit$Season)
dat.pit <- arrange(dat.pit, Name, Season)

# Write data
write.csv(dat.pit, "Fangraphs Pitching Leaders 2000 - 2019.csv")
saveRDS(dat.pit, "Fangraphs Pitching Leaders 2000 - 2019.RDS")

# Clean up memory 
rm(list=ls(pattern = "fg_pitch"))
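
Once both tables share the key_fangraphs column, the merge described above can be sketched as follows. This is a minimal example on invented values rather than the real data: pitching and identifiers_toy are hypothetical stand-ins for dat.pit and identifiers, and key_mlbam is assumed to be one of the columns kept from the Chadwick lookup.

```r
# Toy pitching table keyed by key_fangraphs (all values are invented)
pitching <- data.frame(
  key_fangraphs = c(101, 102, 103),
  Season        = c(2019, 2019, 2019),
  IP            = c(180.1, 65.0, 12.2),
  stringsAsFactors = FALSE
)

# Toy identifier table; key_mlbam is an assumed column name
identifiers_toy <- data.frame(
  key_fangraphs = c(101, 102),
  key_mlbam     = c(900001, 900002),
  stringsAsFactors = FALSE
)

# Left join: keep every pitching row, even when the lookup has no match
merged <- merge(pitching, identifiers_toy, by = "key_fangraphs", all.x = TRUE)

merged
sum(is.na(merged$key_mlbam))   # pitching rows without an identifier match
```

Setting all.x = TRUE keeps pitchers who are missing from the lookup table, so unmatched rows show up as NA rather than silently disappearing from the result.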

Photo Credit: Nolan_Ryan_in_Atlanta.jpg by Wahkeenah; derivative work by Amineshaker. Public Domain, https://commons.wikimedia.org/w/index.php?curid=5022538
