Saturday, January 11, 2014

Please, Ned Yost, don't bat Omar Infante Second


Bat him cleanup.

Hear me out. I'm not going to claim that Infante has what it takes to be a "run producer," as opposed to the "bat control guy" that Ned Yost wants. I'm saying he's neither of these things, and given what the Kansas City Royals have to work with, that's why he's a reasonable choice for the 4-spot.

In a previous post, I described some software tools that I created for simulating baseball games given a lineup of players with known statistics. These stats cover batting, baserunning, and base stealing. The simulator runs through every possible batting order and determines the average runs per game for each. There are 362,880 (9!) different orderings of nine players, and running through all of them takes significant computing time. After a few trial runs, my simulations yielded the common-sense outcome that Moustakas/Cain/Escobar should always be relegated to the 7/8/9 slots, preferably in that order[1]. So to make subsequent calculations run faster, I fixed those three players' slots in the lineup. This reduces the number to only 720 (6!) possible lineups from the orderings of the remaining six players.

[1] If you don't agree that this is a logical and obvious solution, I would tell you to stop reading... but your name is probably "Ned" and you're the whole reason I'm writing this post to begin with. So please keep reading, Ned. Humor me.
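The lineup counts above are just factorials, and the reduced search space is easy to enumerate. Here is a minimal sketch, not the simulator's actual code: the function name `candidate_lineups` and the variable names are made up for illustration, and only the player names come from this post.

```python
from itertools import permutations
from math import factorial

# Player names are from the post; everything else here is illustrative.
free_hitters = ["Gordon", "Hosmer", "Aoki", "Infante", "Perez", "Butler"]
fixed_tail = ("Moustakas", "Cain", "Escobar")  # pinned to slots 7, 8, 9

def candidate_lineups(free, tail):
    """Yield every nine-man batting order with `tail` pinned to the last slots."""
    for top in permutations(free):
        yield top + tuple(tail)

lineups = list(candidate_lineups(free_hitters, fixed_tail))

print(factorial(9))   # orderings of nine hitters with nothing fixed: 362880
print(len(lineups))   # orderings once slots 7-9 are fixed: 720
```

Pinning three slots cuts the work by a factor of 504 (362,880 / 720), which is why the later runs finish in reasonable time.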

Remember from my previous post that there is no one optimal lineup: my break point for "statistically indistinguishable" in the baseball sense is one run over a 162-game season. For each set of simulations, up to several dozen lineups will make that cut. Below I'll show results for several criteria on player performance. Each figure shows one panel for each player, with a histogram indicating the number of times that player hits in each slot across the set of optimal lineups. The order of the panels is the lineup order of the top-ranked lineup.
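The cut and the per-player histograms described above can be sketched as follows. This is an illustration under stated assumptions, not the code behind the figures: `near_optimal` and `slot_counts` are hypothetical helper names, and the three-hitter "results" are placeholder numbers invented only to show the mechanics.

```python
from collections import Counter

GAMES = 162
CUTOFF = 1.0 / GAMES  # one run per season, expressed in runs per game

def near_optimal(results, cutoff=CUTOFF):
    """Return lineups whose average runs/game is within `cutoff` of the best."""
    best = max(results.values())
    return [lineup for lineup, rpg in results.items() if best - rpg <= cutoff]

def slot_counts(lineups):
    """Map each player to a Counter of the (1-based) slots he occupies."""
    counts = {}
    for lineup in lineups:
        for slot, player in enumerate(lineup, start=1):
            counts.setdefault(player, Counter())[slot] += 1
    return counts

# Placeholder data: NOT simulation output, just made-up values for three hitters.
toy_results = {
    ("A", "B", "C"): 4.500,
    ("B", "A", "C"): 4.497,  # 0.003 runs/game behind the best: inside the cut
    ("C", "B", "A"): 4.480,  # 0.020 runs/game behind the best: outside the cut
}

kept = near_optimal(toy_results)
hist = slot_counts(kept)
print(len(kept), dict(hist["A"]), dict(hist["C"]))
```

One run per season works out to about 0.006 runs per game, so two lineups have to be very close in simulated scoring to count as indistinguishable.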

As inputs I take each player's statistics summed over the last three years, 2011-2013 (two years for Aoki). This assumption implies mild bounce-back years from both Butler and Gordon, and a slight uptick from Escobar's dismal 2013 (essentially he'd reproduce 2011). It also assumes that Moustakas and Cain are what they are after 1,500+ plate appearances. Aoki may only be a third-year MLB player, but he's also 32 and isn't likely to surpass what he's done previously. The only exception I make for significant change is Eric Hosmer. Perhaps it betrays optimistic homerism on my part, but it did seem that he turned the corner after the first two months of the season, and his monthly splits thereafter showed consistent above-average production. I assume that 2014 Eric Hosmer is second-half 2013 Hosmer, which would make him the best hitter on the team. In KC there's optimism that Perez is another candidate for improvement. His early numbers were off the charts, so a 3-year average also represents a step above his 2013. Each player's wOBA is shown in each figure.

It should be noted that this analysis doesn't take into account the hitters' lefty/righty platoon splits, so these results might be better thought of as optimization against right-handed pitchers, who account for most of the plate appearances behind these stats. I also don't implement any hit-and-runs. It is, I think, impossible to collect the proper data for this: what fraction of caught-stealings are blown hit-and-runs? What fraction of hit-and-runs are actually just straight steals? What about hit-and-runs that just didn't work? Also, the code doesn't care about putting multiple left-handers in a row, something I know Yost didn't like about the in-house lineup analysis done last season. I think a strong argument can be made that this shouldn't be a consideration. Yes, a LOOGY can come in and get three straight outs, but why would you implement a sub-optimal lineup for the first 6-7 innings just to get a slight advantage in the 8th? (Most closers are right-handed, so it's just one or two innings.)

It is interesting to look at the results when adding each aspect of offensive play to the simulator. For simplicity, start by considering only what each player does at the plate. Everything after ball-makes-contact is considered to be league average for each player in the lineup. In that case, we have the straightforward task of trying to maximize the number of times your best players bat, as well as the number of times they bat in the same inning. Results are shown in the first figure below.

Results when considering only plate performance. All baserunning is MLB average. The number below each player's name is their weighted on-base average (wOBA), a mainstay in FanGraphs WAR calculations. Although each player can fit into many slots in the lineup, the ordering of the panels is the order of the best overall lineup.

Now, before you start laughing, this isn't the last time you're going to see Butler in the leadoff spot in this post. But taking into consideration only plate performance, sticking Butler/Gordon/Hosmer at the top of the order will produce a lot of runs. Infante, having the lowest wOBA, gets relegated to the six-spot.
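For readers who want a feel for what a plate-only simulation looks like, here is a deliberately simplified Monte Carlo sketch. It is not my simulator: it swaps the league-average baserunning described above for a cruder station-to-station rule (every runner advances exactly as many bases as the batter, and walks only force runners), and the outcome probabilities in `EXAMPLE_HITTER` are hypothetical, not any Royal's real numbers. All function names are made up for the example.

```python
import random

# Hypothetical per-plate-appearance probabilities (NOT real player data).
# Whatever probability is left over is an out.
EXAMPLE_HITTER = {"walk": 0.08, "single": 0.16, "double": 0.05,
                  "triple": 0.005, "homer": 0.03}
BASES_GAINED = {"single": 1, "double": 2, "triple": 3, "homer": 4}

def draw_outcome(probs, rng):
    """Sample one plate-appearance outcome; the leftover probability is an out."""
    r = rng.random()
    cumulative = 0.0
    for outcome, p in probs.items():
        cumulative += p
        if r < cumulative:
            return outcome
    return "out"

def advance(bases, outcome):
    """bases = [first, second, third] occupancy. Returns (new_bases, runs)."""
    first, second, third = bases
    if outcome == "walk":  # only forced runners move
        runs = 1 if (first and second and third) else 0
        return [True, second or first, third or (first and second)], runs
    n = BASES_GAINED[outcome]
    new, runs = [False, False, False], 0
    # Runner positions 1-3, plus the batter starting from 0 (home plate).
    for pos in [i + 1 for i, on in enumerate(bases) if on] + [0]:
        if pos + n >= 4:
            runs += 1
        else:
            new[pos + n - 1] = True
    return new, runs

def simulate_game(lineup, rng, innings=9):
    """Play `innings` half-innings; the batting order carries over between them."""
    runs, batter = 0, 0
    for _ in range(innings):
        outs, bases = 0, [False, False, False]
        while outs < 3:
            outcome = draw_outcome(lineup[batter % len(lineup)], rng)
            batter += 1
            if outcome == "out":
                outs += 1
            else:
                bases, scored = advance(bases, outcome)
                runs += scored
    return runs

def average_runs(lineup, games=2000, seed=0):
    """Average runs per game over `games` simulated games."""
    rng = random.Random(seed)
    return sum(simulate_game(lineup, rng) for _ in range(games)) / games

print(average_runs([EXAMPLE_HITTER] * 9))
```

To compare batting orders, you would give each of the nine slots its own probability table and call `average_runs` on each ordering. With differences as small as the ones discussed here, it takes far more than 2,000 games per lineup to tell orderings apart.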

Now let's add baserunning to the equation. Here, baserunning means taking the extra base (including scoring from second on a single, which happens ~60% of the time) and base stealing, but no double plays.


Lineup results when including baserunning, with the exception of grounding into double plays.

Hosmer is an above-average base runner, and combined with his hitting he jumps to the top of the order. Gordon, as before, can fit in almost any slot. But adding baserunning pushes Butler down in many of the best lineups. You can still get good production with him at #1 and #3, if you order the other guys properly. Now let's put double plays into the equation and see what happens.
 

Lineup results when including baserunning and double plays.

Gordon/Hosmer/Aoki/Infante is not a surprising top four, but their order is the opposite of a conventional lineup construction. Perez and Butler are the worst baserunners on the team, with Perez having a GIDP rate nearly as high as Butler's (though he is better at taking the extra base). The results for Butler are quite intriguing: either bat him leadoff or bat him sixth. Why? The leadoff spot encounters the fewest double play opportunities of any spot in the lineup, usually by a wide margin. He also has the highest on-base percentage on the team. It'll take two hits or a home run to push him across the plate once he reaches first, but he'll be on first (or second) more often than any other player. If you don't put him there, you don't have much choice but to put him in the sixth spot, where double play opportunities are around league-average and his baserunning won't matter because no one behind him is going to drive him in anyway.


Rotochamp has offered their best guess at the opening-day lineup for the Royals (see chart above), putting Aoki/Infante one-two. The general consensus in the media seems to be that the Royals think this as well. There are three problems with this lineup: (1) The Royals' best hitters aren't reached until the 3rd spot. (2) Batting Hosmer fifth reduces both his plate appearances and RBI opportunities. This slotting alone is a major knock on this batting order. (3) Butler in the four spot yields double plays and exposes his deficiencies on the base paths. In the table I've also listed the top lineup as well as the best Butler-leadoff lineup for comparison. Taking all this into consideration, the consensus lineup costs the Royals 11 runs over the course of the season, according to simulations I ran with this lineup.
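To put the 11-run figure in context, here is the back-of-the-envelope arithmetic. The conversion of roughly 10 runs per win is a common sabermetric rule of thumb that I am assuming for illustration; it is not an output of the simulator, and `season_gap` is just a made-up helper name.

```python
GAMES = 162
RUNS_PER_WIN = 10.0  # assumption: common rule of thumb, not a simulator result

def season_gap(runs_lost, games=GAMES, runs_per_win=RUNS_PER_WIN):
    """Convert a season-long run deficit into runs per game and approximate wins."""
    return runs_lost / games, runs_lost / runs_per_win

per_game, wins = season_gap(11)
print(f"{per_game:.3f} runs per game, about {wins:.1f} wins")
```

That per-game gap is roughly eleven times the one-run-per-season threshold I use for calling two lineups indistinguishable.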

Eleven runs. More than one win. One win the Royals can't afford to lose.
