Baseball Data

Major League Baseball data from the 1986 and 1987 seasons.

  • AtBat: Number of times at bat in 1986

  • Hits: Number of hits in 1986

  • HmRun: Number of home runs in 1986

  • Runs: Number of runs in 1986

  • RBI: Number of runs batted in in 1986

  • Walks: Number of walks in 1986

  • Years: Number of years in the major leagues

  • CAtBat: Number of times at bat during his career

  • CHits: Number of hits during his career

  • CHmRun: Number of home runs during his career

  • CRuns: Number of runs during his career

  • CRBI: Number of runs batted in during his career

  • CWalks: Number of walks during his career

  • League: A factor with levels A and N indicating the player’s league at the end of 1986

  • Division: A factor with levels E and W indicating the player’s division at the end of 1986

  • PutOuts: Number of put outs in 1986

  • Assists: Number of assists in 1986

  • Errors: Number of errors in 1986

  • Salary: 1987 annual salary on opening day in thousands of dollars

  • NewLeague: A factor with levels A and N indicating the player’s league at the beginning of 1987
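The three factor variables (League, Division, NewLeague) and the count variables can be combined in a few lines of pandas. The sketch below is only an illustration: the frame `toy` and every value in it are made up to mimic the column layout described above, and the derived column `BattingAvg` is a hypothetical name that is not part of the dataset. Whether the loaded data already stores the factors as categoricals is not stated here, so the conversion is shown explicitly.

```python
import pandas as pd

# Hypothetical rows that only mimic the column layout described above;
# the values are made up for illustration and are not taken from the dataset.
toy = pd.DataFrame(
    {
        "AtBat": [300, 450, 520],
        "Hits": [75, 120, 156],
        "League": ["A", "N", "A"],
        "Division": ["E", "W", "W"],
        "NewLeague": ["A", "N", "N"],
    }
)

# Store the two-level factors as pandas categoricals with explicit levels.
toy["League"] = pd.Categorical(toy["League"], categories=["A", "N"])
toy["Division"] = pd.Categorical(toy["Division"], categories=["E", "W"])
toy["NewLeague"] = pd.Categorical(toy["NewLeague"], categories=["A", "N"])

# Derived quantity (hypothetical column name): hits per at bat in 1986.
toy["BattingAvg"] = toy["Hits"] / toy["AtBat"]

# Players whose league at the start of 1987 differs from the end of 1986.
changed_league = toy["League"].astype(str) != toy["NewLeague"].astype(str)
print(toy[changed_league])
```

The same steps should carry over to the real frame by replacing `toy` with the loaded data, provided its column names match the list above.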

Notes

This dataset was taken from the StatLib library, which is maintained at Carnegie Mellon University. It is part of the data used in the 1988 ASA Graphics Section Poster Session. The salary data were originally from Sports Illustrated, April 20, 1987. The 1986 and career statistics were obtained from The 1987 Baseball Encyclopedia Update, published by Collier Books, Macmillan Publishing Company, New York.

from ISLP import load_data
Hitters = load_data('Hitters')
Hitters.columns
Index(['AtBat', 'Hits', 'HmRun', 'Runs', 'RBI', 'Walks', 'Years', 'CAtBat',
       'CHits', 'CHmRun', 'CRuns', 'CRBI', 'CWalks', 'League', 'Division',
       'PutOuts', 'Assists', 'Errors', 'Salary', 'NewLeague'],
      dtype='object')
Hitters.shape
(322, 20)
Hitters.describe().iloc[:,:4]
            AtBat        Hits       HmRun        Runs
count  322.000000  322.000000  322.000000  322.000000
mean   380.928571  101.024845   10.770186   50.909938
std    153.404981   46.454741    8.709037   26.024095
min     16.000000    1.000000    0.000000    0.000000
25%    255.250000   64.000000    4.000000   30.250000
50%    379.500000   96.000000    8.000000   48.000000
75%    512.000000  137.000000   16.000000   69.000000
max    687.000000  238.000000   40.000000  130.000000
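The summary above selects its columns by position and covers only the first four. A natural follow-up is to select columns by name and to check for missing entries, since `describe()` leaves NaN values out of its counts. The sketch below uses a made-up frame `toy` whose values are purely illustrative; whether and where the real data has gaps should be checked on the loaded frame.

```python
import numpy as np
import pandas as pd

# Made-up values for illustration only; these are not rows from the dataset.
toy = pd.DataFrame(
    {
        "AtBat": [300, 450, 520, 610],
        "Hits": [75, 120, 156, 170],
        "Salary": [475.0, np.nan, 500.0, np.nan],
    }
)

# Count missing entries in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Keep only the rows that have a recorded Salary.
complete = toy.dropna(subset=["Salary"])
print(complete.shape)

# Summarize columns chosen by name rather than by position.
summary = toy[["AtBat", "Hits"]].describe()
print(summary)
```

On the real data, the equivalent check would be `Hitters.isna().sum()`, run on the frame loaded earlier.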