Nationals Arm Race

"… the reason you win or lose is darn near always the same – pitching.” — Earl Weaver

Archive for February, 2013

Caltech ends record losing streak

Logo via wikipedia

Today’s distraction:

The word came out over the weekend: The California Institute of Technology’s Division III baseball team snapped a 10-year, 228-game losing streak by beating fellow baseball powerhouse Pacifica.

Which of course begs the question: Caltech has a baseball team!?

Written by Todd Boss

February 7th, 2013 at 11:20 am

Posted in College/CWS

Ladson’s Inbox 2/5/13

Lots of questions about Gonzalez and Garcia this week. Photo unknown credit.

Hey, what great timing for another Bill Ladson inbox (posted 2/5/13).  Baseball news is light, pitchers and catchers report in a week or so, and I’m not quite ready to continue my Stats series.

As always, I write my answer before reading his, and sometimes edit questions for clarity:

Q: Do you think general manager Mike Rizzo will add starting pitching depth before Opening Day? Does the lingering possibility of a Gio Gonzalez suspension change whom the Nationals would consider acquiring?

A: In Ladson’s 1/22/13 mailbag, someone asked what could prevent the Nats as constructed from winning the World Series in 2013.  I answered Rotation Injuries and Luck.  Well, in the wake of the Miami PED scandal, I guess the third answer may be “PED scandal.”

This is a tough question to answer; Gio Gonzalez has denied the rumors, but the newspaper in question (the Miami New Times) clearly only named Gonzalez because they felt the evidence they had in hand was irrefutable.  Many other players have not been named.  So as a GM, how do you go about preparing for 2013 at this point?   If Mike Rizzo knows that Gonzalez is getting suspended, you have to think he’s on the horn to his buddy Scott Boras about possibly buying Kyle Lohse, who is clearly the best remaining FA starter.  But Lohse isn’t coming cheap, likely isn’t coming on a one year deal, and would cost another draft pick (I believe).  The Nats are already topping $120M in payroll; would they go to $135M?

If we think Gio at least gets a pass and the suspension is put off, maybe Rizzo’s recent activity of signing random starters to minor league contracts is going to be sufficient.

Ladson mentions Javier Vazquez and the ever-present rumors of Christian Garcia going to the rotation as possible Gonzalez replacements if he gets suspended quickly.  Probably fair; Vazquez may be a great, cheap alternative.

Q: Everyone is saying that it’s going to be a two-team race in the National League East between the Nationals and Braves. Do you think the Phillies have a shot to contend with both these teams, or is their time done?

A: Boy, it’s hard to look at the aging, expensive lineup the Phillies had in place in 2012, which suffered injuries and setbacks and creaked its way to a .500 record, and then look at the highly questionable slew of acquisitions and signings this off season (Ben Revere, John Lannan, Michael Young, Delmon Young and everyone’s favorite anti-gay advocate Yuniesky Betancourt) and not, well, giggle at where this team is going.  My favorite baseball joke from the off-season goes like this: “The Phillies wanted to get Younger this off-season, so they signed Michael Young and Delmon Young.”

The two Youngs were both negative WAR players last year, Lannan is a 5th starter, Revere was a backup centerfielder who the Phillies traded some decent assets for, and Betancourt is who he is (though admittedly he’s on a minor league deal and seems at best set to be a utility infielder behind starters Jimmy Rollins and Chase Utley).  I see the Phillies being a very bad defensive team with the two Youngs in the starting lineup, I see some serious questions in the back side of the rotation, and I see continued regression and louder complaints about Ryan Howard’s contract.  Fun times a-coming in Philadelphia.  Ladson actually says that the Phillies will “be improved with Michael Young.”  Bill!  Have you seen Young’s WAR figures from 2012??  He was a NEGATIVE WAR player at both major WAR sites.  That means he makes your team worse!    Now, he was completely serviceable in 2011 … so if you want to make the argument to me that 2012 was an aberration for an aging hitter playing in a hitter’s park, well I guess that’s a stance you can take.  But pretty much every other pundit in the blogosphere has loudly criticized the Philadelphia moves this off-season.

Q: What is the status of Lucas Giolito? When do you see him pitching in D.C.?

A: Tommy John surgery in late August (I can’t remember the exact date; it was 8/24/12 when I posted this highly-critical article about Lucas Giolito and the situation), so figuring a typical 12-month rehab session before he’s actively throwing again in pro games basically puts him at the end of the 2013 minor league season.  Which means he’ll be 20 before he really is ready to start his pro career in the spring of 2014.  Figure 4-5 years average case for typical high schoolers to work their way up the systems (perhaps fewer years given his talents and pedigree, as we’ve seen with someone like Dylan Bundy in 2012, who made his way from low-A to AA in his first pro season out of HS and got a late Sept callup to the majors) and we’re probably looking at 2016-2017 before seeing him in the majors.  If, of course, he recovers from surgery, hasn’t destroyed his mechanics, is effective, matures, doesn’t get re-injured, and avoids the million other pitfalls that typically befall high school arms drafted in the upper rounds.  Ladson thinks he’s pitching pro games “after the all-star break” and is in the majors in 3 years.  Wow.  That is optimistic.

Q: How do you think Henry Rodriguez will do? And what do you think his role in the bullpen will be?

A: I am, and always have been, pessimistic on Henry Rodriguez.  I hated the Willingham trade that got him here.  He’s forced the team to invent injuries to stash him on the DL coming out of spring training b/c he has no options.  He led the league in wild pitches in 2011 in just 65 innings.  He had a 69 ERA+ in 2012.  At what point does the team say, “OK, it’s nice that he throws 100mph.  But enough is enough; we need a reliable pitcher who can deliver when called upon.”  Perhaps Spring Training 2013 is that time.

What do I think his role will be?  I’m sure he’ll look great in Spring Training again, will break camp with the team, and very well may look halfway decent for a while.  But just like every other season, he’s going to have those 3-walk outings where he pitches 1/3 of an inning and gives up 4 runs, and then the manager will be afraid to use him unless the team has a 5-run lead.  And eventually we’ll call up Garcia to replace him and move on.  That’s my prediction for Rodriguez.  Ladson says the team should “attempt to trade him if he is not impressive this spring.”  Wow, that’s sage advice; if only every team could trade its under-performing players and actually get value back whenever it wanted.

Q: Can you predict Washington’s Opening Day lineup if all available players are healthy?

A: Easy.  I’ll even predict the batting order.  Span-Werth-Harper-Zimmerman-LaRoche-Desmond-Espinosa-Suzuki-Strasburg.  Ladson predicts the same names but in a lineup order that makes no sense from a lefty-righty balance perspective.

Q: After announcing his retirement, do you think Brian Schneider is a possible candidate to replace Johnson as manager of the Nationals?

A: Wow, yet another speculative question about the future Nationals Manager.   He took a question about the manager on 1/28/13, and on 1/22/13.  And on 1/14/13.   I guess people like speculating on the Nats’ next manager.  Not repeating what I’ve said on the topic before, is Brian Schneider a candidate?  Why would he possibly be a candidate to manage the major league team of a system he left 5 years ago?  Why would the Nats pick a manager who’s never managed a day in his life?   Ladson breathes some common sense on this one.

Q: I think Garcia has to be on the Opening Day roster, so is he in the bullpen or someplace else? Can the 25-man roster accommodate him and all the other pitchers?

A: “Someplace else?”  Like where?  In the outfield?   I like Garcia too, but the team has a numbers problem in the bullpen.  Storen, Clippard, Mattheus, and Stammen have all more than earned their spots.  Soriano is being paid a ton of money.  Duke is guaranteed a spot (he’s the only lefty and he’s got enough service time to refuse a demotion).  Oh, and Rodriguez has no options.  So there’s your 7-man bullpen.  Notice there’s only one left-hander out there; if you believe that you need left-handers to get left-handed batters out, then the bullpen needs to sacrifice one of the righties in order to have a second lefty (Bill Bray?) in there.

The only way I see Garcia making this bullpen is if the team runs out of patience with Rodriguez and DFAs/DLs him, or if the team trades away one of their closer-quality surplus guys, or if maybe someone like Mattheus/Stammen (both of whom do have options) struggles or gets hurt.  Otherwise look for Garcia to get stretched out and get looks as a starter in AAA.  Ladson says he’s confident Garcia is on the 25-man roster …. ok explain it to me then based on the above paragraph.  Who is he replacing?

Stats Overview Part II: Hitting stats on the rise: wRC, wOBA, etc

Trout's BABIP was very high in 2012; what does this mean for 2013? Photo Gary Vasquez/US Presswire via espn.com

(Part 2 in a series: Part 1 talked about What’s Wrong with Old School Baseball Stats).

More and more in modern baseball writing, you see relatively new statistical creations thrown into articles in order to prove or disprove an opinion, and more and more you almost need a glossary to properly read these articles and properly understand what the author is attempting to say.  I always want to understand that which I read, and at the same time I want to make sure I stay current and up-to-date on the stats out there, so I decided to do a little research (and pen my own post while I was at it) into some of these newer stats that are being used.

I’ll write about each stat, give links to its calculation, write about how it may be used, then put in some rules of thumb by which to consider the stat.

Pretty much every stat here is defined and available at either Baseball-reference.com or fangraphs.com.  BaseballProspectus.com also has some more obscure stats discussed further below.  I’ve always thought that B-R’s interface was so much easier to navigate that I tend to search there first, but a more complete set of stats is at fangraphs.

1. BABIP.  Batting Average on Balls in Play.  Most people know this one, but it is an important stat to consider in conjunction with other stats (especially the older Batting Average and Earned Run Average).   The calculation, as seen at Wikipedia, basically measures how many balls put into play (removing home runs from consideration) turned into hits.  Interestingly, sacrifice flies count against the hitter: a sac fly is a ball in play that became an out, so it is added to the denominator even though it doesn’t count as an at-bat.  This stat is kept for both individual hitters and for pitchers.
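For anyone who wants to check a BABIP figure by hand, here is a minimal sketch of the standard formula, (H - HR) / (AB - K - HR + SF).  The function name and the stat line are made up for illustration; they are not any real player’s numbers.

```python
def babip(hits, home_runs, at_bats, strikeouts, sac_flies):
    """Batting Average on Balls in Play.

    Numerator: hits that stayed in the park (H - HR).
    Denominator: at-bats that ended with a ball in play (AB - K - HR),
    plus sacrifice flies, which are balls in play but not at-bats.
    """
    balls_in_play = at_bats - strikeouts - home_runs + sac_flies
    if balls_in_play <= 0:
        raise ValueError("no balls in play to measure")
    return (hits - home_runs) / balls_in_play


# Hypothetical stat line (an assumption for this example only).
example = babip(hits=180, home_runs=30, at_bats=560, strikeouts=140, sac_flies=7)
print(f"BABIP: {example:.3f}")
```

Compare the result against the .290-.300 league baseline to decide whether a season looks lucky or unlucky.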

How is BABIP used? The measurement is essentially used as a checkpoint for fluky seasons.  If a pitcher has a very high ERA but also has a very high BABIP, one can argue that he’s been unlucky and that his true talent level is better than his posted performance on the year.  Coincidentally, the two leaders in pitcher BABIP in 2012 were both on the Tigers; Rick Porcello and Max Scherzer had BABIPs of .344 and .333 respectively; that gap over the league average will probably lead to both of these guys having better ERAs in 2013.  If a hitter has a decent hitting season but also has a high BABIP, one usually says that the hitter was “lucky” and is due to regress (Mike Trout in 2012 had a BABIP of .383.  That’s really high, probably unsustainably high, and he probably regresses statistically in 2013).

MLB Average/Rule of thumb: .290-.300 depending on the year.

When BABIP is high, a hitter is considered to be “lucky,” and future regression of more batted balls being turned into outs is expected.

When BABIP is low, a hitter is considered to be “unlucky,” and future improvement of more hits on batted balls is expected.

Caveats using BABIP: there are many arguments about whether some pitchers’ “baseline BABIP” should be modified based on their talent or capabilities.  For example, Mariano Rivera’s career BABIP is .262 while R.A. Dickey’s BABIP since he turned into a Mets knuckleballing starter is around .275.  Rivera’s lower baseline is probably attributable to his amazing cutter and his pure skill, while Dickey’s is most likely due to the difficulty of squaring up his knuckleball.  Meanwhile, some hitters maintain higher than average career BABIPs (two extreme examples that immediately come to mind are Ichiro Suzuki and Nyjer Morgan, with career BABIPs of .347 and .336 respectively).  Why so high?  Because both are skilled at bunting (or at least hitting choppy grounders) for base hits, artificially inflating their baseline BABIP.

2. ISO; Isolated Power.  As posted on Wiki, Isolated Power can be simply calculated by subtracting a hitter’s batting average from his slugging percentage, or as it is more eloquently defined at FanGraphs, ISO is essentially a measure of how many extra bases a batter averages per at-bat.  Slugging tells you how many bases per at-bat a hitter obtains, but ISO strips out singles to isolate a player’s capability of hitting doubles, triples and homers.  Here are a couple of decent examples from 2012; our own Bryce Harper hits a ton of extra base hits; he posted a .206 ISO for the 2012 season.  Meanwhile we know that the aforementioned Nyjer Morgan is not a very powerful hitter and ISO shows it; he was at .069 for the 2012 season.  The list of league leaders in ISO reads like a list of MLB’s best sluggers.  Giancarlo Stanton would have led the league in ISO had he qualified; he posted a fantastic .318 ISO in 2012.
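The two descriptions of ISO (slugging minus batting average, and extra bases per at-bat) are the same number, which a quick sketch can confirm.  The helper names and the stat line below are hypothetical, chosen only to show the arithmetic.

```python
def batting_average(singles, doubles, triples, home_runs, at_bats):
    """Hits per at-bat; every hit counts the same."""
    return (singles + doubles + triples + home_runs) / at_bats


def slugging(singles, doubles, triples, home_runs, at_bats):
    """Total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def iso_from_slash(singles, doubles, triples, home_runs, at_bats):
    """ISO as slugging percentage minus batting average."""
    args = (singles, doubles, triples, home_runs, at_bats)
    return slugging(*args) - batting_average(*args)


def iso_from_extra_bases(doubles, triples, home_runs, at_bats):
    """ISO as extra bases per at-bat: 1 per double, 2 per triple, 3 per homer."""
    return (doubles + 2 * triples + 3 * home_runs) / at_bats


# Hypothetical stat line (an assumption for this example only).
line = dict(singles=90, doubles=30, triples=5, home_runs=25, at_bats=550)
iso_a = iso_from_slash(**line)
iso_b = iso_from_extra_bases(line["doubles"], line["triples"],
                             line["home_runs"], line["at_bats"])
print(f"SLG - AVG: {iso_a:.3f}   extra bases per AB: {iso_b:.3f}")
```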

How is it used? ISO is used to measure how good a hitter is at getting extra-base hits.

MLB Average/Rule of thumb (from Fangraphs page) .145 is considered an “average” MLB ISO figure.  .200 is pretty good, .100 is poor.

Caveats using ISO: as with many sabermetric-tinged stats, small sample sizes greatly skew the figures.  Fangraphs says 550 ABs are needed before really drawing any judgements.

3. wOBA; Weighted On Base Average.  Created by Tom Tango, wOBA is a relatively newer statistic that attempts to improve upon the traditional batting statistics we use (Batting Average, Slugging and On Base Percentage) by measuring cumulative “weighted” hits that a batter may achieve.  It is based on the premise that the three traditional stats just mentioned all treat hitting events relatively the same.  Is a single equal to a double?  No, but in Batting Average it is.  Is a double worth half as much as a home-run?  No, but in the Slugging Percentage it is.  Each hitting event is weighted and added together, with increases/decreases for stolen bases/caught stealing thrown in, to arrive at a measurement that attempts to better quantify pure hitting.
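To make the “weighted hits” idea concrete, here is a rough sketch of the general shape of the wOBA calculation.  The weights below are illustrative round numbers I’m assuming for the example; the real weights are recalculated every season, and this sketch leaves out the stolen base adjustment mentioned above.

```python
# Illustrative linear weights only (an assumption for this sketch);
# the published weights change from season to season.
ILLUSTRATIVE_WEIGHTS = {
    "walk": 0.69,
    "hbp": 0.72,
    "single": 0.89,
    "double": 1.27,
    "triple": 1.62,
    "home_run": 2.10,
}


def woba(singles, doubles, triples, home_runs, walks, intentional_walks,
         hit_by_pitch, at_bats, sac_flies, weights=ILLUSTRATIVE_WEIGHTS):
    """Weighted On Base Average: each event is credited by its weight."""
    unintentional_walks = walks - intentional_walks
    numerator = (
        weights["walk"] * unintentional_walks
        + weights["hbp"] * hit_by_pitch
        + weights["single"] * singles
        + weights["double"] * doubles
        + weights["triple"] * triples
        + weights["home_run"] * home_runs
    )
    denominator = at_bats + unintentional_walks + sac_flies + hit_by_pitch
    return numerator / denominator


# Two made-up hitters who both go 10-for-10: batting average calls them
# equal (1.000), but wOBA credits the doubles hitter with more.
all_singles = woba(10, 0, 0, 0, 0, 0, 0, at_bats=10, sac_flies=0)
all_doubles = woba(0, 10, 0, 0, 0, 0, 0, at_bats=10, sac_flies=0)
print(f"all singles: {all_singles:.3f}   all doubles: {all_doubles:.3f}")
```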

How is it used?  wOBA is set to the same scale as the league-wide OBP, which seems to hover around .315-.320 year to year.

MLB Average/Rule of thumb (from Fangraphs page) .320 is a good “league average” number.  .370 is great, .300 is poor.

Caveats using wOBA: There are several to keep in mind; the weights change year to year, in order to normalize the stat across generations.  It is NOT normalized to park factors, so hitters in places like Boston and Colorado will have wOBAs that are artificially inflated relative to their true value.  Lastly, there’s zero context given to the game situation when measuring hits (i.e. was there a guy on third with one out?  Was it a close game in the 9th?)  I think the particular situation is nearly impossible to measure in any stat, but it is important.

4. RC/wRC: Runs Created and Weighted Runs Created.  Runs Created is a stat that Bill James invented in one of his earlier Baseball Abstracts (1985) in order to try to measure simply how many runs an individual player contributed to the team in a given season.  It was improved upon vastly in 2002 to be much more detailed and accurate; the original version over-emphasized some factors of hitting.  It is a complicated statistic (see its wiki page for the formula).  The aforementioned Tom Tango improved upon the basic RC by creating the Weighted version of the statistic based on his own Weighted OBA statistic (which he believed more closely measures the proper “value” of each hitting event).
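The later versions of Runs Created are indeed complicated, but the most basic form is short: (H + BB) × TB / (AB + BB).  Here is a minimal sketch of that basic version only; the function name and stat line are hypothetical, and none of the later refinements (steals, double plays and so on) are attempted.

```python
def basic_runs_created(hits, walks, total_bases, at_bats):
    """Basic Runs Created: (on-base events * bases advanced) / opportunities."""
    opportunities = at_bats + walks
    if opportunities <= 0:
        raise ValueError("no plate appearances to measure")
    return (hits + walks) * total_bases / opportunities


# Hypothetical full-season stat line (an assumption for this example only).
rc = basic_runs_created(hits=150, walks=60, total_bases=265, at_bats=550)
print(f"Runs Created: {rc:.1f}")
```

Because the result is a counting number, it only means something next to the playing time behind it.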

How is it used?  Individually, RC and wRC need to be understood in the context of an entire season.  It isn’t until we get to wRC+ (see below) that a side-by-side comparison is possible.  It’s like saying “Player X has 105 hits.”  If that’s through 75 games, that’s pretty good; if that’s for an entire season, well that’s pretty poor.

MLB Average/Rule of thumb (from Fangraphs page) RC and wRC both have roughly the same scales.  60 is average, 100 is great, 50 is poor for a full season.

Caveats using RC and wRC: They are basically full season counting numbers.  In 2012, Trout started in the minors, so his RC and wRC totals are lower than those of his MVP competitor Miguel Cabrera.

5. wRC+/wRAA: Weighted Runs Created Plus/Weighted Runs Above Average

wRAA and especially wRC+ are touted by fangraphs.com as being very good “single number” statistics to properly measure a player’s hitting ability.  I often use OPS+ as a singular number to measure a hitter; fangraphs specifically calls out that number and recommends using wRC+ instead.

How is it used?  Both numbers basically measure the same thing.  wRC+ is a bit easier to explain; 100 is the league baseline, and points above or below the average are expressed as “percentage points above or below the league average.”  So, a person with a 120 wRC+ is considered to be 20% better at creating runs than the average major leaguer.  Cabrera and Trout coincidentally tied for the MLB lead for 2012 in wRC+, each posting a 166 wRC+.  Meanwhile wRAA (per fangraphs.com) “measures the number of offensive runs a player contributes to their team compared to the average player” and is scaled to zero.  wRAA is essentially a direct calculation from wOBA, so if you’re using one you can likely ignore the other.

MLB Average/Rule of thumb (from Fangraphs page) for wRC+: 100 is average while for wRAA zero (0) is average.  20-25 percentage points above is great, while 15-20 percentage points below is bad.

Caveats for using: Unlike wOBA, wRC+ is park- and league-adjusted, indeed making it an excellent single number by which to measure players.  Otherwise the caveats for these weighted averages are all about the same; they seem to be based on a weighting of hits that you may or may not agree with.
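Since wRAA is a direct calculation from wOBA, the conversion can be sketched in a few lines: take the gap between the player’s wOBA and the league’s, divide by a scaling constant, and multiply by plate appearances.  The .320 league figure comes from the rule of thumb above; the 1.25 scale is an assumed round number for illustration, since the real scale varies by season.

```python
def wraa(player_woba, league_woba, woba_scale, plate_appearances):
    """Weighted Runs Above Average: zero is league average by construction."""
    return (player_woba - league_woba) / woba_scale * plate_appearances


def percent_vs_league(plus_stat):
    """Read a 'plus' stat such as wRC+: 120 means 20% better than average."""
    return plus_stat - 100


# Assumed inputs for illustration: a .370 wOBA hitter over 600 PA,
# a .320 league wOBA, and a made-up wOBA scale of 1.25.
runs_above_average = wraa(0.370, 0.320, 1.25, 600)
print(f"wRAA: {runs_above_average:.1f}")
print(f"A 166 wRC+ is {percent_vs_league(166)}% better than league average")
```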


What have I learned from looking into these hitting stats?  I need to keep BABIP in mind.  I like ISO but I don’t see it gaining real credence over slugging percentage.  And I should probably start using wRC+ more than OPS+.

Part 3 coming up on Pitching advanced stats.

Cal Ripken Burgers??

I'm sure they're really tasty. Photo unknown via baltimoresports.report.com

Because when I think of frozen burger patties, I think of Cal Ripken Jr.

I was shopping at Giant over the weekend and did a double-take in the frozen food aisle.  Did you guys know that Ripken is hawking a line of burgers?  I didn’t.  A bit of Google work shows that they’ve been on the market since May of 2012.

Is it just me, or is this a relatively random athlete-food connection?  This would be like Reggie Jackson having his own candy bar.  Oh wait… 🙂

Written by Todd Boss

February 3rd, 2013 at 2:34 pm

Stats Discussion Part I: What’s wrong with Wins and RBI?

Cabrera's MVP award was thought to be on the backs of "bad stats." Is this a bad thing in general? Photo AP via sportingnews.com

(First Article in a series discussing Baseball Statistics that I mostly wrote months ago and was waiting for downtime to post.  As it happens, the posts that I have in the can for months on end tend to get rather bloated; this one is > 3000 words.  Apologies in advance if you think that’s, well, excessive).

(Note: a good starting point/inspiration for this series was a post from February 2012 on ESPN-W by Amanda Rykoff, discussing some of the stats used in the movie Moneyball.  Some of the stats we’re discussing in the next few posts are covered in her article).

The more you read modern baseball writing, the more frequently you see the inclusion of “modern” baseball statistics interspersed in sentences, without definition or explanation, which are thus used to prove whatever point the writer is making.   Thus, more and more you need a glossary in order to read the more Sabr-tinged articles out there.  At the same time, these same writers are hounding the “conventional” statistics that have defined the sport for its first 100 years and openly ridiculing those writers that dare use statistics like the RBI and (especially as of late) the pitcher Win in order to state an opinion.  This is an important trend change in Baseball, since these modern statistics more and more are used by writers to vote upon year-end (and career defining) awards, and as these writers mature they pour into the BBWAA ranks who vote upon the ultimate “award” in the sport; enshrinement into the Hall of Fame.

This year’s AL MVP race largely came down to the issue of writers using “old-school” stats to value a player (favoring Miguel Cabrera and his triple-crown exploits) versus “new-school” stats to value a player (favoring Mike Trout, who may not have as many counting stats but has put in a historic season in terms of WAR).  And as we saw, the debate was loud, less-than-cordial, and is merely exacerbating a growing divide between older and newer writers.  This same argument is now seen in the Hall of Fame voting, and has gotten so divisive that there are now writers who are refusing to vote for anyone but their old-school stat driven pet candidates as a petulant reaction to new-school writers who can’t see the forest for the trees in some senses.

A good number of the stats that have defined baseball for the past 100 years are still considered “ok,” within context.  Any of the “counting stats” in the sport say what they say; how many X’s did player N hit in a season?  Adam Dunn hit 41 homers in 2012, good for 5th in the league.  That’s great; without context you’d say he’s having a good, powerful season.  However you look deeper and realize he hit .204, he didn’t even slug .500 with all these homers and he struck out in more than 40% of his at-bats.  And then you understand that perhaps home-runs by themselves aren’t the best indicators of a player’s value or the status of his season.

Let’s start this series of posts with this topic:

What’s wrong with the “old school” baseball stats?

Most old school stats are “counting” stats, and they are what they are.  So we won’t talk about things like R, H, 2B, 3B, HR, BB, K, SB/CS.  There’s context when you look at some of these numbers combined together, or if you look at these numbers divided by games or at-bats (to get a feel for how often a player hits a home run or steals a base or strikes out a guy).  In fact, K/9, BB/9 and K/BB ratios are some of my favorite quick evaluator statistics to use, especially when looking at minor league arms.  But there are some specific complaints about a few of the very well known stats out there.  Let’s discuss.
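The rate stats mentioned above are just counting stats divided by playing time.  A minimal sketch, with a made-up pitching line and hypothetical function names:

```python
def per_nine(count, innings_pitched):
    """Scale a counting stat to a per-9-innings rate (K/9, BB/9, ...)."""
    if innings_pitched <= 0:
        raise ValueError("innings pitched must be positive")
    return 9 * count / innings_pitched


def strikeout_to_walk(strikeouts, walks):
    """K/BB ratio; undefined for a pitcher with no walks."""
    if walks == 0:
        raise ValueError("K/BB is undefined with zero walks")
    return strikeouts / walks


# Hypothetical minor league arm (an assumption for this example only).
# Innings are given as a plain number, not in box-score "6.1" notation.
strikeouts, walks, innings = 120, 40, 110.0
k9 = per_nine(strikeouts, innings)
bb9 = per_nine(walks, innings)
kbb = strikeout_to_walk(strikeouts, walks)
print(f"K/9: {k9:.2f}   BB/9: {bb9:.2f}   K/BB: {kbb:.2f}")
```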

1. Runs Batted In (RBI).   Or as some Sabr-critics now say it, “Really Bad Stat.”   The criticism of the RBI is well summarized at its Wiki page; it is perceived more as a measure of the quality of the lineup directly preceding a hitter than it is a measure of the value of the hitter himself.  If you have a bunch of high OBP guys hitting in front of you, you’re going to get more RBIs no matter what you do yourself.  Another criticism of the stat is stated slightly differently; a hitter also benefits directly from his positioning in the lineup.  A #5 hitter hitting behind a powerful #4 hitter will have fewer RBI opportunities (in theory), since the #4 hitter should be cleaning up (no pun intended) the base-runners with power shots.  Likewise, a lead-off hitter absolutely has fewer RBI opportunities than anyone else on the team; he leads off games with nobody on base, and hits behind the weakest two hitters in the lineup every other time he bats.

I’m not going to vehemently argue for the RBI (the points above are inarguable).  But I will say this; statistical people may not place value on the RBI, but players absolutely do.  Buster Olney touched on this with an interesting piece in September that basically confirms this;  if you ask major leaguers whether RBIs are important you’ll get an across-the-board affirmative.  Guys get on base all the time; there’s absolutely skill and value involved in driving runners home.  Guy on 3rd with one out?  You hit a fly ball or a purposefully hit grounder to 2nd base and you drive in that run.  Players absolutely modify the way that they swing in these situations in order to drive in that run.  And thus RBI is really the only way you can account for such a situation.  The Runs Created statistics (the original RC plus the wRC stats) don’t account for this type of situation at all; they only measure based on hits and at-bats.

(As a side note, the statistic Grounded Into Double Plays has a similar limitation to RBI: it really just measures how many batters were ahead of you on base as opposed to a hitter’s ability to avoid hitting into them.  But thankfully GIDP isn’t widely used anywhere).

2. Batting Average (BA): The isolated Batting Average is considered a “limited” stat because it measures a very broad hitting capability without giving much context to what that hitter is contributing to the end goal (that being to score runs).  A single is treated the same as a home run in batting average, despite there being a huge difference between these two “hits” in terms of creating runs.  This is exemplified as follows: would you rather have a .330 hitter who had zero home runs on the season, or a .270 hitter who hit 30 home runs?   Absolutely the latter; he’s scoring more runs himself, he’s driving in more runs for the team, and most likely by virtue of his power-capability he’s drawing more walks than the slap hitting .330 hitter.  More properly stated, the latter hitter in this scenario is likely to be “creating more runs” for his team.

Statistical studiers of the game learned this limitation early on, and thus created two statistics that need to go hand in hand with the Batting Average; the On-Base Percentage (OBP) and the Slugging Percentage (SLG). This is why you almost always see the “slash line” represented for hitters; to provide this context.  But, be careful REPLACING the batting average with these two numbers (or the OPS figure, which represents On-Base Percentage + Slugging).  Why?  Because Batting Average usually comprises about 80% of a player’s On-Base Percentage.  Even the highest walk guys (guys like Adam Dunn or Joey Votto) only have their walk totals comprising 17-18% of their OBP.  If you sort the league by OBP and then sort it by BA, the league leaders are almost always the same (albeit slightly jumbled).  So the lesson is thus; if someone says that “Batting Average is a bad stat” but then says that “OBP is a good stat,” I’d question their logic.
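One way to see how much of OBP is really just batting average is to split a hitter’s times on base into hits versus everything else.  This sketch uses the standard OBP formula, (H + BB + HBP) / (AB + BB + HBP + SF); reading “comprises about 80%” as the share of times on base that come from hits is my interpretation, and the stat line is made up.

```python
def on_base_percentage(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base divided by the plate appearances that count."""
    times_on_base = hits + walks + hit_by_pitch
    return times_on_base / (at_bats + walks + hit_by_pitch + sac_flies)


def hit_share_of_on_base(hits, walks, hit_by_pitch):
    """Fraction of a hitter's times on base that came from hits."""
    return hits / (hits + walks + hit_by_pitch)


# Hypothetical stat line (an assumption for this example only).
line = dict(hits=160, walks=40, hit_by_pitch=3)
obp = on_base_percentage(at_bats=560, sac_flies=5, **line)
share = hit_share_of_on_base(**line)
print(f"OBP: {obp:.3f}   share of times on base from hits: {share:.1%}")
```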

Lots of people like to use the statistic OPS (OBP+SLG) as a quick, shorthand way of combining all of these stats.  The caveat to this is thus; is a “point” of on-base percentage equal to a “point” of slugging?  No, it is not; the On-Base Percentage point is worth more because of what it represents.  Per the correction provided in the comments, 1.7 times more.

Incidentally, wOBA attempts to fix all of these limitations of BA; we’ll discuss it in part 2 of this series.

3. ERA: Earned Run Average.  Most baseball fans know how to calculate ERA (earned runs times 9, divided by innings pitched), and regularly refer to it when talking about pitchers.  So what’s wrong with ERA?

Specifically, ERA has trouble with situations involving inherited runners.  If a starter leaves a couple guys on base and a reliever allows them to score, two things happen:

  • those runs are charged to the starter, artificially inflating his ERA after he’s left the game.
  • those runs are NOT charged to the reliever, which artificially lowers his ERA despite his giving up hits that lead to runs.

ERA is also very ball park and defense dependent; if you pitch in a hitter’s park (Coors, Fenway, etc) your ERA is inflated versus those who pitch in pitcher parks (Petco, AT&T).  Lastly, a poor defense will lead to higher ERAs just by virtue of balls that normally would be turned into outs becoming hits that lead to more runs.  Both these issues are addressed in “fielding independent” pitching stats (namely FIP), which are discussed in part III of this series.

A lesser issue with ERA is the fact that it is so era-dependent.  League Average ERAs started incredibly high in the game’s origin, then plummeted during the dead ball era, rose through the 40s and 50s, bottomed out in the late 60s, rose slightly and then sharply during the PED era and now are falling again as more emphasis is placed on power arms and small-ball.  So how do you compare pitchers of different eras?  The ERA+ statistic is great for this; it measures a pitcher’s ERA indexed to his peers; a pitcher with a 110 ERA+ had an ERA roughly 10% better than the league average that particular year.
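The indexing idea behind ERA+ can be sketched as league ERA divided by the pitcher’s ERA, times 100.  This is a simplified version that assumes a neutral ballpark; the published figures also adjust for park, which is not attempted here, and the ERAs below are invented for illustration.

```python
def simple_era_plus(pitcher_era, league_era):
    """Simplified ERA+ (no park adjustment): 100 is league average,
    higher is better because a LOWER ERA is better."""
    if pitcher_era <= 0:
        raise ValueError("ERA must be positive for this ratio")
    return 100 * league_era / pitcher_era


# Made-up ERAs in two made-up run environments: the same 3.60 ERA
# looks very different depending on the league around it.
high_offense = simple_era_plus(pitcher_era=3.60, league_era=4.50)
low_offense = simple_era_plus(pitcher_era=3.60, league_era=3.60)
print(f"high-offense era: {high_offense:.0f}   low-offense era: {low_offense:.0f}")
```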

4. Pitcher Wins.  The much maligned “Win” statistic’s limitations are pretty obvious to most baseball fans and can be stated relatively simply; the guy who gets the “Win” is not always the guy who most deserves it.  We’ve all seen games where a pitcher goes 7 strong innings but his offense gives him no runs, only to have some reliever throw 1/3 of an inning and get the Win.  Meanwhile, pitchers get wins all the time when they’ve pitched relatively poorly but their offense explodes and gives the starter a big lead that he can’t squander.

Those two sentences are the essence of the issue with Wins; to win a baseball game requires both pitching AND offense, and a pitcher can only control one of them (and his “control” of the game is lost as soon as the ball enters play; he is dependent on his defense to get a large majority of his outs, usually 60% or more even for a big strike out pitcher).  So what value does a statistic have that only measures less than 50% of a game’s outcome?

The caveat to Wins is that, over the long run of a player’s career, the lucky wins and unlucky losses usually average out.  One year a guy may have a .500 record but pitch great, the next year he may go 18-3 despite an ERA in the mid 4.00s.  I have to admit; I still think a “20-game winner” is exciting, and I still think 300 wins is a great hall-of-fame benchmark.  Why?  Because by and large wins do end up mirroring a pitcher’s performance over the course of a year or a career.  The downside is; with today’s advances in pitcher metrics (to be discussed in part III of this series), we no longer have to depend on such an inaccurate statistic to determine how “good” a pitcher is.

Luckily the de-emphasis of Wins has entered the mainstream, and writers (especially those who vote for the end-of-year awards) have begun to understand that a 20-game winner may not necessarily be the best pitcher that year.  This was completely evident in 2010, when Felix Hernandez won the Cy Young award despite going just 13-12 for his team.  His 2010 game log is amazing: Six times he pitched 7 or more innings and gave up 1 or fewer runs and got a No Decision, and in nearly half his starts he still had a “quality start” (which we’ll talk about below briefly).  A more recent example is Cliff Lee’s 2012 performance, where he didn’t get a win until July, getting 8 no-decisions and 5 losses in his first 13 starts.  For the year he finished 6-9 with a 3.16 ERA and a 127 ERA+.  Clearly Lee is a better pitcher than his W/L record indicates.

(Incidentally, I did a study to try to “fix” pitcher wins by assigning the Win to the pitcher who had the greatest Win Probability Added (WPA).  But about 10 games into this analysis I found a game in April of 2012 that made so little sense in terms of the WPA figures assigned that I gave up.  We’ll talk about WPA in part 4 of this series when talking about WAR, VORP and other player valuation stats.)

5. Quality Starts (QS). Quality Starts aren’t exactly a long standing traditional stat, but I bring them up because of the ubiquitous nature of the statistic.  It is defined simply as a start by a pitcher who pitches 6 or more innings and who gives up 3 or fewer earned runs.   But immediately we see some issues:

  1. 6 IP and 3 ER is a 4.50 ERA, not entirely a “quality” ERA for a starter.  In fact, a starter with a 4.50 ERA in 2012 would have ranked 74th out of 92 qualified starters.
  2. If a pitcher pitches 8 or 9 complete innings but gives up a 4th earned run, he does not get credit for the quality start by virtue of giving up the extra run.  Yet a 9 IP complete game with 4 earned runs works out to a 4.00 ERA for the game, which is BETTER than the 4.50 ERA of the minimum quality start.
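To make the arithmetic in those two points concrete, here is a minimal sketch in Python.  The function names are my own, and I count outs instead of innings (18 outs = 6 IP) purely to avoid the usual confusion over “6.1” meaning six and a third innings.

```python
def era(earned_runs: int, outs_recorded: int) -> float:
    """Earned run average: earned runs allowed per nine innings (27 outs)."""
    if outs_recorded <= 0:
        raise ValueError("need at least one out to compute an ERA")
    return earned_runs * 27 / outs_recorded


def is_quality_start(earned_runs: int, outs_recorded: int) -> bool:
    """Quality start as defined above: 6+ innings (18+ outs), 3 or fewer earned runs."""
    return outs_recorded >= 18 and earned_runs <= 3


# The bare-minimum quality start: 6 IP, 3 ER
print(era(3, 18), is_quality_start(3, 18))   # 4.5 True

# A complete game with a 4th earned run: 9 IP, 4 ER
print(era(4, 27), is_quality_start(4, 27))   # 4.0 False
```

The second line is the oddity from point 2: the lower single-game ERA is the one that misses out on the quality start.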

Why bring up QS at all?  Because ironically, despite the limitations of the statistic, a quality start is a pretty decent indicator of a pitcher’s performance in larger sample sizes.  Believe it or not, most of the time a quality start occurs, the pitcher (and the team) gets the win.  Take our own Gio Gonzalez in 2012; he had 32 starts and had 22 quality starts.  His record? 21-8.  Why does it work out this way?  Because most pitchers, when you look at their splits in Wins versus Losses, have lights out stuff in wins and get bombed during losses.  Gonzalez’s ERA in wins, losses and no-decisions (in order): 2.03, 5.00 and 4.32.  And, in the long run, most offenses, if they score 5 or more runs, get wins.  So your starter gives up 3 or fewer runs, hands things over to a bullpen that keeps the game close, your offense averages 4 runs and change … and it adds up to a win.

I used to keep track of what I called “Real Quality Starts (rQS)” which I defined as 6 or more IP with 2 or fewer earned runs, with allowances for a third earned run if the pitcher pitched anything beyond the 6 full innings.  But in the end, for all the reasons mentioned in the previous paragraph, this wasn’t worth the effort because by and large a QS and a rQS both usually ended up with a Win.
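For the curious, the rQS rule is easy enough to write down.  This is only a sketch of the definition as I described it above; the function names and the sample pitching lines are made up for illustration, and outs are used instead of innings (18 outs = 6 IP).

```python
def is_quality_start(earned_runs: int, outs_recorded: int) -> bool:
    """Standard QS: 6+ innings (18+ outs) and 3 or fewer earned runs."""
    return outs_recorded >= 18 and earned_runs <= 3


def is_real_quality_start(earned_runs: int, outs_recorded: int) -> bool:
    """rQS: 6+ innings with 2 or fewer earned runs, or exactly 3 earned
    runs if the pitcher went beyond 6 full innings."""
    if outs_recorded < 18:
        return False
    if earned_runs <= 2:
        return True
    return earned_runs == 3 and outs_recorded > 18


# Hypothetical pitching lines as (earned runs, outs recorded)
sample_lines = [(3, 18), (3, 21), (2, 18), (4, 27)]
for er, outs in sample_lines:
    print(er, outs, is_quality_start(er, outs), is_real_quality_start(er, outs))
```

The only line where the two stats disagree is the first one: exactly 6 IP and 3 ER is a QS but not an rQS.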

6. Holds: A “hold” has a definition very similar to that of the Save, and thus has the same limitations as the Save (discussed in a moment).  There was a game earlier this season that best highlights the issues with holds, as discussed in this 9/21/12 post on the blog Hardball Times.  Simply put; a reliever can pitch pretty poorly but still “earn” a hold.

Holds were created as a counting stat in the mid 1980s in order to have some way to measure the effectiveness of middle relievers.  Closers have saves, but middle-relief guys had nothing.  The problem is; the hold is a pretty bad statistic.  It has most of the issues of the Save, which we’ll dive into last.

7. Saves. I have “saved” the most preposterous statistic for last; the Save.  The definition of a Save includes 3 conditions that a reliever must meet; he finishes a game but is not the winning pitcher, gets at least one out, and meets certain criteria in terms of how close the game is or how long he pitches.  The problem is that the typical “save situation” is not really that taxing on the reliever; what pitcher can’t manage to protect a 3 run lead when given the ball at the top of the ninth inning?  You can give up 2 runs in that one inning, still finish the game, have an ERA of 18.00 for the outing and still get the save.  Ridiculous.  And that’s nothing compared to the odd situation where a reliever can pitch the final 3 innings of a game, irrespective of the score, and still earn a save.  In the biggest blow-out win of the last 30 years or so (the Texas 30-3 win over Baltimore in 2007), Texas reliever Wes Littleton got a save.  Check out the box score.
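Here is a rough sketch of the save rule in Python, mostly to show how loose it is.  The specific thresholds (a lead of three runs or fewer with at least one inning pitched, the tying run on base, at bat or on deck, or at least three innings pitched) are my paraphrase of the official scoring rule and should be treated as an assumption here, not a quotation of the rule book; the class and field names are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ReliefAppearance:
    finished_game: bool          # he recorded the final out for his team
    team_won: bool
    is_winning_pitcher: bool
    outs_recorded: int
    lead_on_entry: int           # runs his team led by when he entered
    tying_run_close: bool        # tying run on base, at bat or on deck on entry
    earned_runs: int = 0


def is_save(app: ReliefAppearance) -> bool:
    """Paraphrased save rule (assumed thresholds, see the note above)."""
    if not (app.finished_game and app.team_won):
        return False
    if app.is_winning_pitcher or app.outs_recorded < 1:
        return False
    short_lead = 1 <= app.lead_on_entry <= 3 and app.outs_recorded >= 3
    long_outing = app.outs_recorded >= 9   # three innings, any score
    return short_lead or app.tying_run_close or long_outing


def outing_era(app: ReliefAppearance) -> float:
    """ERA for this one outing: earned runs per 27 outs."""
    return app.earned_runs * 27 / app.outs_recorded


# Three-run lead, one inning, two runs allowed: still a save
shaky = ReliefAppearance(True, True, False, 3, 3, False, earned_runs=2)
print(is_save(shaky), outing_era(shaky))   # True 18.0

# Hypothetical blowout: enters up by 10, pitches the last three innings
blowout = ReliefAppearance(True, True, False, 9, 10, False)
print(is_save(blowout))                    # True
```

Neither of those outings involved anything resembling pressure, and both count the same as protecting a one-run lead with the bases loaded.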

I wrote at length about the issue with Saves in this space in March of 2011, and Joe Posnanski wrote the definitive piece criticizing Saves and the use of closers in November 2010.  Posnanski’s piece is fascinating; my biggest takeaway from it is that teams are historically winning games at the exact same rate now (with specialized setup men and closers) that they were winning in the 1950s (where you had starters and mop-up guys).

I think perhaps the most ridiculous side effect of the Save is how ingrained in baseball management it has become.  Relievers absolutely want Saves because they’re valued as counting numbers they can utilize at arbitration and free agency hearings to command more salary (I touched on this in a blog post about playing golf with Tyler Clippard this fall; he absolutely wanted to be the closer because it means more money for him in arbitration).  Meanwhile, there are managers out there who inexplicably leave their closer (often their best reliever, certainly their highest paid) out of tie games in late innings because … wait for it … it’s not a save situation.  How ridiculous is it that a statistic now alters the way some managers handle their bullpens?

What is the solution?  I think there’s absolutely value in trying to measure a high leverage relief situation, a “true save” or a “hard save.”  Just off the top of my head, I’d define the rules as this:

  • there can only be a one-run lead if the reliever enters at the top of an inning
  • if the reliever enters in the middle of an inning, the tying run has to at least be on base.
  • the reliever cannot give up a run or allow an inherited run to score.

Now THAT would be a save.  Per the Wikipedia page on the Save, Rolaids started tracking a “tough save” back in 2000, and uses it to help award its “Fireman of the Year” award, but searching online shows that the stats are out of date (they’re dated 9/29/11, indicating that either they only calculate the Tough Saves annually, or they’ve stopped doing it; most likely the former, frankly).
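The three “true save” rules I proposed above are simple enough to sketch in code.  This is just an off-the-top-of-my-head illustration: the function and parameter names are made up, and I am assuming the reliever also finishes the game and his team wins, which my bullets leave unstated.

```python
def is_true_save(
    entered_at_start_of_inning: bool,
    lead_on_entry: int,
    tying_run_on_base: bool,
    runs_allowed: int,
    inherited_runners_scored: int,
    finished_game_with_win: bool = True,  # assumed, not in the bullet list
) -> bool:
    """Proposed 'true save': a genuinely high-leverage entry, cleanly handled."""
    if not finished_game_with_win:
        return False
    if entered_at_start_of_inning:
        # Rule 1: only a one-run lead counts at the start of an inning
        tough_entry = lead_on_entry == 1
    else:
        # Rule 2: mid-inning, the tying run must at least be on base
        tough_entry = tying_run_on_base
    # Rule 3: no runs of his own and no inherited runners scoring
    clean = runs_allowed == 0 and inherited_runners_scored == 0
    return tough_entry and clean


# Typical "save situation": three-run lead to start the ninth, clean inning
print(is_true_save(True, 3, False, 0, 0))    # False

# One-run lead to start the ninth, clean inning
print(is_true_save(True, 1, False, 0, 0))    # True

# Enters mid-inning with the tying run on base, but an inherited runner scores
print(is_true_save(False, 2, True, 0, 1))    # False
```

Under rules like these, the garden-variety three-run ninth inning never qualifies, which is the whole point.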


Phew.  With so many limitations in the stats that have defined Baseball for more than a century, it’s no surprise that a stat-wave has occurred in our sport: smart people looking for better ways to measure pitchers and batters and players.

Next up is a look at some of the new-fangled hitting stats we see mentioned in a lot of modern baseball writing.