Monday, March 02, 2009

ERA as a Case for Sabermetrics



Statistics have always been one of the great draws of the game. They have been faithfully kept and pored over, from the backs of baseball cards and the back pages of the Sporting News in the old days to the fantastic online databases of today. The traditional way of looking at statistics has always been static. In other words, the numbers are compared the same way across generations, as if the game had been played in a vacuum. The only exception has been treating the "Dead Ball Era" and the "Live Ball Era" as a breaking point in relative statistics. This, of course, has changed dramatically with the advent of what we generally call "Sabermetrics." The basic underlying principle in all of these new statistics is that the game is not static and has not been played in a vacuum. Stadiums vary, years vary, rules change, wars interfere, expansion happens and so forth. While there are a million things to talk about here, let's just use one statistic, ERA, as an example of why these numbers are important.


In one generation, a pitcher might toil for sixteen years or so and compile a 3.80 ERA over that career. In the traditional way of looking at statistics, this pitcher is compared to another pitcher of a different generation who compiled a lifetime ERA of 2.80. The natural and traditional assumption is that the first pitcher wasn't nearly as good as the second. But what if the league ERA for the years the first pitcher toiled was 3.95 and the second pitcher pitched in a generation that averaged a 2.70 ERA? Doesn't that change things?
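One rough way to put a number on that question is to divide the league ERA by the pitcher's ERA and scale the result to 100. That is the general idea behind the ERA+ statistic, minus its ballpark adjustment. The sketch below is only an illustration using the two hypothetical pitchers above; the function name `relative_era` is made up for this example.

```python
def relative_era(pitcher_era: float, league_era: float) -> float:
    """League-relative ERA index: 100 is league average, higher is better.

    A simplified version of the idea behind ERA+, with no park adjustment.
    """
    if pitcher_era <= 0:
        raise ValueError("pitcher_era must be positive")
    return 100.0 * league_era / pitcher_era


# The two hypothetical pitchers from the paragraph above.
first = relative_era(3.80, 3.95)   # higher raw ERA, higher-scoring era
second = relative_era(2.80, 2.70)  # lower raw ERA, lower-scoring era

print(f"First pitcher:  {first:.1f}")
print(f"Second pitcher: {second:.1f}")
```

By this measure the first pitcher comes out a little better than his league and the second a little worse, which is the reverse of what the raw ERAs suggest.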

The traditional way of looking at things assumes that the game has always been played the same way under the same conditions and that the two pitchers' ERAs are apples to apples. That would require league ERA to have stayed roughly the same over time. But it hasn't. As the table below indicates, league ERA figures have varied greatly over the years. For example, the National League ERA went from 2.99 in 1968 to 4.05 in 1970. Scoring moved with it: the average National League team scored 4.52 runs per game in 1970, and by 2000 that figure was all the way up to 5.0 runs per game.

That may not seem like a whole lot of difference, but over the course of 162 games for each team, the difference is huge. It is just as huge as the jump from the Dead Ball Era to the Live Ball Era, when American League home run totals went from 96 in one season to 240 in 1919 and 525 in 1922.

The two leagues have also had many years where their league ERAs were not in the same ballpark. So, for example, how can you compare Greg Maddux to Roger Clemens over their careers without taking into account the differences in the league ERA figures? And the same has to hold true for the different ballparks they pitched in. To come close to any kind of comparison, you need a way to factor in differences in league ERA, ballpark factors, and rules differences (the DH, for example).

Okay, say you have a pitcher who pitched for a long time for the Houston Astros. During the first half of his career, he pitched in the Astrodome, which was a huge disadvantage for hitters, and then pitched the second half of his career in the Astros' new park, which is much more favorable to the hitter. Don't you have to factor in the ballpark differences to evaluate the pitcher's effectiveness over his entire career? If you simply look at ERA as a static number, the pitcher might have had an ERA of 3.30 in the Astrodome years and 3.75 in the new park. In actuality, those two figures might reflect similar effectiveness once you take the park into account.
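Here is a rough sketch of how a park adjustment could make those two ERAs line up. The park factors are invented purely for illustration (1.00 means a neutral park, lower means fewer runs score there), and the function name `park_neutral_era` is hypothetical. Real park factors are more involved, since a pitcher only throws about half his innings at home.

```python
def park_neutral_era(era: float, park_factor: float) -> float:
    """Scale an ERA to what it might look like in a neutral park.

    park_factor is 1.00 for a neutral park, below 1.00 for a park that
    suppresses scoring, and above 1.00 for a park that inflates it.
    """
    if park_factor <= 0:
        raise ValueError("park_factor must be positive")
    return era / park_factor


# ASSUMPTION: both park factors are made-up numbers, not measured values.
ASTRODOME_FACTOR = 0.93  # hypothetical pitcher-friendly park
NEW_PARK_FACTOR = 1.06   # hypothetical hitter-friendly park

dome_years = park_neutral_era(3.30, ASTRODOME_FACTOR)
new_park_years = park_neutral_era(3.75, NEW_PARK_FACTOR)

print(f"Astrodome years, park-neutral: {dome_years:.2f}")
print(f"New park years, park-neutral:  {new_park_years:.2f}")
```

With these made-up factors, the 3.30 and the 3.75 land within a few hundredths of a run of each other.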

The Fan isn't going to get into the math because he is still trying to figure it out. But the logic makes sense. In the mid-to-late 1960s, when the average team batting average was .235, a player like Yaz who led the league in batting one year by hitting .298 or something had just as good a year as a Tony Gwynn who batted .332 in a year when the average team batting average was .270. So these new numbers are very important for evaluating players, not only today but also in retrospect. That's why you have so many debates about the Hall of Fame now. We now have ways of putting careers into their proper perspective that we have never had before.
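For the curious, the batting comparison can be roughed out the same way, by dividing each hitter's average by the league-wide average. This is only a sketch using the loose figures from the paragraph above, not anyone's official formula, and the function name `relative_average` is made up.

```python
def relative_average(batting_avg: float, league_avg: float) -> float:
    """Batting average as a percentage of the league average (100 = average)."""
    if league_avg <= 0:
        raise ValueError("league_avg must be positive")
    return 100.0 * batting_avg / league_avg


# Approximate figures quoted in the paragraph above.
yaz_index = relative_average(0.298, 0.235)
gwynn_index = relative_average(0.332, 0.270)

print(f"Yaz:   {yaz_index:.0f}")
print(f"Gwynn: {gwynn_index:.0f}")
```

Both hitters come out well over 20 percent better than their leagues, with the .298 season rating at least as well as the .332 season.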

ERA is just one stat that has fluctuated greatly over the years and from league to league. The traditional way of looking at statistics cannot remain valid for comparing the value of players from year to year, stadium to stadium, and league to league when the conditions behind the numbers change so dramatically over time.



  • Year  NL ERA  AL ERA
    1901 3.32 3.66
    1902 2.78 3.57
    1903 3.26 2.96
    1904 2.73 2.60
    1905 2.99 2.65
    1906 2.62 2.69
    1907 2.46 2.54
    1908 2.34 2.39
    1909 2.59 2.47
    1910 3.02 2.52
    1911 3.39 3.34
    1912 3.40 3.34
    1913 3.19 2.93
    1914 2.79 2.73
    1915 2.75 2.93
    1916 2.61 2.82
    1917 2.71 2.66
    1918 2.76 2.77
    1919 2.91 3.22
    1920 3.13 3.79
    1921 3.78 4.28
    1922 4.10 4.03
    1923 3.99 3.98
    1924 3.87 4.23
    1925 4.26 4.39
    1926 3.83 4.02
    1927 3.91 4.14
    1928 3.99 4.04
    1929 4.71 4.24
    1930 4.97 4.64
    1931 3.86 4.38
    1932 3.88 4.48
    1933 3.33 4.28
    1934 4.06 4.50
    1935 4.02 4.46
    1936 4.02 5.04
    1937 3.91 4.62
    1938 3.78 4.79
    1939 3.92 4.62
    1940 3.85 4.38
    1941 3.63 4.15
    1942 3.31 3.66
    1943 3.38 3.30
    1944 3.61 3.43
    1945 3.80 3.36
    1946 3.41 3.50
    1947 4.06 3.71
    1948 3.95 4.29
    1949 4.04 4.20
    1950 4.14 4.58
    1951 3.96 4.12
    1952 3.73 3.67
    1953 4.29 3.99
    1954 4.06 3.72
    1955 4.04 3.96
    1956 3.77 4.16
    1957 3.88 3.79
    1958 3.95 3.77
    1959 3.95 3.86
    1960 3.76 3.87
    1961 4.03 4.02
    1962 3.94 3.97
    1963 3.29 3.63
    1964 3.53 3.63
    1965 3.54 3.46
    1966 3.60 3.43
    1967 3.37 3.23
    1968 2.99 2.98
    1969 3.60 3.62
    1970 4.05 3.71
    1971 3.46 3.46
    1972 3.45 3.06
    1973 3.66 3.82
    1974 3.62 3.62
    1975 3.62 3.78
    1976 3.50 3.52
    1977 3.91 4.06
    1978 3.57 3.76
    1979 3.73 4.21
    1980 3.60 4.03
    1981 3.49 3.66
    1982 3.60 4.07
    1983 3.63 4.06
    1984 3.59 3.99
    1985 3.59 4.15
    1986 3.72 4.17
    1987 4.08 4.46
    1988 3.45 3.97
    1989 3.49 3.88
    1990 3.79 3.90
    1991 3.68 4.09
    1992 3.50 3.94
    1993 4.04 4.32
    1994 4.22 4.80
    1995 4.18 4.71
    1996 4.22 5.00
    1997 4.21 4.57
    1998 4.24 4.65
    1999 4.56 4.86
    2000 4.63 4.91
    2001 4.36 4.47
    2002 4.11 4.46
    2003 4.28 4.52
    2004 4.30 4.63
    2005 4.22 4.35
    2006 4.49 4.56
    2007 4.43 4.50
    2008 4.30 4.36

