People care about batting orders. When I hear fans discussing their favorite teams, two of the most common types of comments I hear are "Why are they playing that bum?" and "Why are they batting that bum first (or second or third... )?" In the 1988 Baseball Abstract, Bill James summed up the view of the baseball research community on this subject:
"Several people, maybe a dozen, have done simulation studies of lineups, and have all (as far as I know) reported that it really doesn't make any difference, that one lineup is as good as another. I still don't buy it."
More studies have been done since then, but what Bill James described still seems to be the most common approach: take a team, pick out several potentially good or bad batting orders, simulate a bunch of games and compare the number of runs scored using each one. One notable exception was an article by Mark Pankin in a 1991 issue of SABR's By The Numbers, which used a Markov process model to evaluate lineups.
In this article, I will attempt to follow in Mark's footsteps. I will be analyzing two lineups: the composite NL and AL lineups used from 1993 to 2004. I will use a form of the Markov process model to evaluate lineup strength, and will attempt to evaluate, not several, or a couple, or 100, or a few thousand, but all possible lineup combinations - all 362,880 (9!) of them. I will be looking at a few things. First, what alterations should be made to our current thinking about lineup construction? Second, how much variation do we see in the expected runs from different batting orders? I'll also look at one variation of the lineup that takes into account the handedness of the batters and see how that affects optimal lineup construction. Finally, I'll look at where the Giants should have been batting Barry Bonds these last few years.
As mentioned in the introduction, I will be analyzing the composite lineups used in each league from 1993 to 2004. I picked those years because they gave us a large sample size and because the offense has been at a relative high point for the entire period.
The composite starting lineup for the NL during those years:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1   120334  32827  5831  1189  2075  11617   363  18400  1279  1018   662  5531  2210  .341  .393
2   117432  32312  5989   897  2605  10563   157  18568  1129  1822   750  2920  1151  .339  .408
3   111311  32687  6556   605  5253  14733  1510  19151  1142   193  1188  2485   902  .378  .505
4   109833  31322  6355   493  5474  13479  1693  21226  1149    70  1187  1714   803  .366  .502
5   109160  29844  6111   558  4379  11339  1024  20738  1003   262  1016  1680   918  .344  .460
6   107152  28567  5661   600  3714   9992   877  21022  1073   421   976  1683   917  .333  .435
7   104685  26559  5200   583  2826   9134   791  20719  1033   662   881  1244   713  .317  .396
8   100280  25044  4772   621  1948  10042  2217  19004  1000   957   842  1034   509  .322  .368
9    95204  17187  3076   247  1096   6276   262  30783   508  7312   489   504   273  .234  .253
The American League lineup:
AL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1   111052  30708  5562  1018  2232  11210   426  16929  1102   991   792  4798  1776  .346  .405
2   108386  29886  5685   762  2538  10461   203  17142  1019  1447   924  2958  1087  .342  .412
3   104126  30201  5958   495  4569  12681  1147  17799  1072   191  1242  1856   651  .369  .488
4   101653  28796  5880   329  5156  12735  1419  19293  1035    60  1119   980   506  .365  .500
5   100621  27514  5683   372  4125  11176  1015  18808   924   172   987  1073   616  .348  .460
6    99132  26561  5327   440  3607   9849   759  19320   947   358   792  1184   698  .337  .440
7    96691  25119  4939   483  2944   9004   630  18378   980   562   885  1225   717  .326  .412
8    94541  24157  4779   436  2358   7878   413  17757   939   982   795  1148   752  .317  .390
9    91926  23098  4398   539  1580   6809   106  16505   888  1682   753  1628   925  .307  .362
One pattern is clear here. The top three hitters in the lineup, both in terms of on-base and slugging percentage, hit 3rd, 4th and 5th, respectively. The players with the next two highest on-base percentages hit 1st and 2nd, followed by the rest of the hitters, in declining order of their on-base plus slugging percentages in the 6th through 9th slots.
Note that there are more plate appearances in the National League because they have had two more teams than the American League since 1998.
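The OBP and SLG columns in the two composite tables can be recomputed from the counting columns. The sketch below uses the standard definitions and assumes that the BB column already includes intentional walks; the function names are mine, not part of the original study. For the NL lead-off row it reproduces the .341 and .393 shown in the table.

```python
def on_base_pct(ab, h, bb, hbp, sf):
    """OBP = times on base / (AB + BB + HBP + SF)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slugging_pct(ab, h, doubles, triples, hr):
    """SLG = total bases / AB."""
    singles = h - doubles - triples - hr
    total_bases = singles + 2 * doubles + 3 * triples + 4 * hr
    return total_bases / ab


# NL lead-off row from the composite table (assumes BB includes IBB)
leadoff_obp = on_base_pct(ab=120334, h=32827, bb=11617, hbp=1279, sf=662)
leadoff_slg = slugging_pct(ab=120334, h=32827, doubles=5831, triples=1189, hr=2075)
print(f"{leadoff_obp:.3f} {leadoff_slg:.3f}")
```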
In general, Markovian approaches to baseball research analyze changes in game states. Readers of my Value Added article will already have seen something similar to this, as will readers familiar with Mark Pankin's work. At the beginning of each play, we are in one of 24 game states (with outs going from 0 to 2 and men on going from bases empty to full). At the end of each play we are in one of 25 states (the 24 mentioned above plus the three-outs state). Each player's offensive contribution can be seen, then, as a 24 x 25 matrix of state transitions caused by his at-bats. This might be best seen by example.
Here is the row of the transition matrix for the NL lead-off hitter corresponding to the no outs and no one on starting state:
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0      927  15407   2798      0    561      0      0      0
1    36316      0      0      0      0      0      0      0
2        0      0      0      0      0      0      0      0
3        0      0      0      0      0      0      0      0
Lead-off hitters got up in these situations 56009 times (the sum of all the values in the table). Since there are only 5 transitions possible from this state, the table is rather sparse. 927 times, he left the state the same as he found it. In other words, he hit a home run or found some other way to circle the bases. He ended up on first, second and third, 15407, 2798 and 561 times, respectively. All the other times, 36316 of them, he made an out.
Note that we can determine the number of runs scored during each transition except ones that end in the final (three outs) state. In general, all runners (including the batter) that are not still on the bases or accounted for by an out have scored. In order to eliminate the exception with the final state, I'm going to expand that state to include the men left on base at the end of the inning.
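The bookkeeping rule above can be written down directly. This is a minimal sketch, not the code used for the study: the state encoding (an out count plus a three-character men-on string such as "F-T") and the function name are my own choices. It assumes, as described above, that the final state keeps track of the men left on base, so the same rule applies to every transition.

```python
BASES = ["---", "F--", "-S-", "FS-", "--T", "F-T", "-ST", "FST"]

# 24 starting states: 0-2 outs crossed with the 8 men-on configurations
start_states = [(outs, men) for outs in range(3) for men in BASES]
# 25 ending states in the basic model: the 24 above plus "three outs"
end_states = start_states + [(3, None)]


def runs_scored(outs_before, men_before, outs_after, men_after):
    """Runners (plus the batter) who are neither on base nor out have scored."""
    runners_before = sum(c != "-" for c in men_before)
    runners_after = sum(c != "-" for c in men_after)
    new_outs = outs_after - outs_before
    return runners_before + 1 - runners_after - new_outs


print(runs_scored(0, "---", 0, "---"))  # solo home run
print(runs_scored(0, "F--", 2, "---"))  # double play
print(runs_scored(1, "FST", 1, "---"))  # grand slam
```

Under these rules the three calls give 1, 0 and 4 runs respectively.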
Starting at a specific spot in the batting order, we will determine the number of runs scored in that inning by successively evaluating each state transition, pruning the list of transitions whenever it reaches a final state or drops below some probability threshold. To see how this works in action, here's the table above expressed in percentages:
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     1.66  27.51   5.00   0.00   1.00   0.00   0.00   0.00
1    64.84   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
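Going from the raw counts to the percentage table is a simple normalization. A small sketch, using only the five non-zero cells of the NL lead-off row quoted earlier (the dictionary layout is an illustrative choice of mine):

```python
# Non-zero cells of the NL lead-off row for the "none out, none on" state,
# keyed by (outs after the play, men on after the play)
counts = {
    (0, "---"): 927,
    (0, "F--"): 15407,
    (0, "-S-"): 2798,
    (0, "--T"): 561,
    (1, "---"): 36316,
}

total = sum(counts.values())
percent = {state: 100 * n / total for state, n in counts.items()}

print(total)
for state, pct in percent.items():
    print(state, f"{pct:.2f}")
```

The total comes to 56009 plate appearances, and the rounded percentages match the row shown above.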
As we analyze each inning, we need to determine two things: the number of runs scored in the inning and the probability that each batter leads off the next inning. To determine these we need to maintain a list (with probabilities) of our next states, a running count of runs scored, and a list (with probabilities) of the lead-off batters in the next inning.
To demonstrate this, let's look at the first inning. We start with a very small list of possible states: the lead-off batter at the plate with no one on and none out. The probability of that state is 1.00, no runs have been scored and we have an empty list of next-inning lead-off hitters. We run these states through the transition matrix for the batter hitting and update our data. Doing this for the first batter yields the following list of next states:
Out  MenOn   Prob
0    ---     1.66
0    F--    27.51
0    -S-     5.00
0    --T     1.00
1    ---    64.84
We now have scored .0166 runs in the inning (the probability of staying in the same state, since that yielded a single run), and we still have an empty list of the batters due up next inning.
This information is then used in evaluating the next batter's matrix. Here is the second hitter's matrix (expressed in percentages) for the five possible states:
Out=0 MenOn=---
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     2.19  26.41   5.28   0.00   0.77   0.00   0.00   0.00
1    65.35   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00

Out=0 MenOn=F--
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     1.93   0.03   1.94  19.89   0.90   6.35   3.30   0.00
1     0.08  38.12  15.43   0.00   0.34   0.00   0.00   0.00
2    11.70   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00

Out=0 MenOn=-S-
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     1.22   5.17   4.06  11.20   0.77  10.81   0.36   0.00
1     0.28   1.28  30.75   0.00  33.34   0.00   0.00   0.00
2     0.75   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00

Out=0 MenOn=--T
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     1.65  17.42   4.52   0.00   0.77  13.56   0.00   0.00
1    26.90   0.44   0.00   0.00  34.40   0.00   0.00   0.00
2     0.33   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00

Out=1 MenOn=---
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
1     1.99  26.32   4.76   0.00   0.74   0.00   0.00   0.00
2    66.20   0.00   0.00   0.00   0.00   0.00   0.00   0.00
3     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
So for each of these five states, we compute the probability of reaching the next state, the number of runs scored during this at-bat and (although it's a little too soon for that) a list of the lead-off batters in the next inning along with the probabilities.
As you can imagine, this approach becomes pretty unwieldy for anything but a computer rather quickly, but to follow only one sequence in the inning, let's expand what happens when we evaluate the one out, no one on situation with the second batter at the plate. Remember, we had a probability of reaching this state of .6484. So the probability of reaching the two outs and no one on state at the conclusion of his at-bat would be .4292 (.6484 * .6620).
The third hitter's matrix for that situation:
Out    ---    F--    -S-    FS-    --T    F-T    -ST    FST
0     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
1     0.00   0.00   0.00   0.00   0.00   0.00   0.00   0.00
2     3.92  27.25   5.09   0.00   0.47   0.00   0.00   0.00
3    63.27   0.00   0.00   0.00   0.00   0.00   0.00   0.00
This is the first time we have encountered a transition to the inning-ending state. Since we had a probability of .4292 of reaching this state to begin with, the probability of a 1-2-3 inning resulting in the clean-up hitter batting first the next inning would be .2716 (.4292 * .6327). So we add that information to our list of ending states.
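The 1-2-3 inning figure is just the product of the three out probabilities along that one path. A quick check of the arithmetic, with variable names of my own choosing:

```python
# Probability that each of the first three NL hitters makes an out
# with the bases empty, taken from the percentage tables above
p_out_first = 0.6484   # lead-off hitter, none out
p_out_second = 0.6620  # second hitter, one out
p_out_third = 0.6327   # third hitter, two outs

p_two_out_none_on = p_out_first * p_out_second
p_one_two_three = p_two_out_none_on * p_out_third

print(f"{p_two_out_none_on:.4f} {p_one_two_three:.4f}")
```

This gives .4292 and .2716, matching the figures in the text.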
We continue this analysis until all the threads have either reached a final state or dropped below a probability threshold. We need such a threshold because innings never have to end. It is possible for the first nine hitters in an inning to hit home runs, or for an inning to begin with 27 straight hits. So we will weed out any sequence with less than a one in ten million chance of occurring.
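To make the procedure concrete, here is a rough sketch of an inning evaluator with the same pruning threshold. It is not the program used for this article. In place of the empirical 24 x 25 matrices it assumes a toy model in which each batter is a dictionary of event probabilities (out, walk, single, double, triple, hr) and runners advance by fixed rules, and it merges identical states at each step instead of following separate threads. All names and rules here are illustrative assumptions.

```python
from collections import defaultdict

THRESHOLD = 1e-7  # one in ten million, as in the text


def advance(bases, event):
    """Toy advancement rules (an assumption): returns (new_bases, runs)."""
    b1, b2, b3 = bases
    if event == "walk":  # forced advances only
        if b1 and b2 and b3:
            return (1, 1, 1), 1
        if b1 and b2:
            return (1, 1, 1), 0
        if b1:
            return (1, 1, b3), 0
        return (1, b2, b3), 0
    if event == "single":  # every runner moves up one base
        return (1, b1, b2), b3
    if event == "double":  # every runner moves up two bases
        return (0, 1, b1), b2 + b3
    if event == "triple":
        return (0, 0, 1), b1 + b2 + b3
    if event == "hr":
        return (0, 0, 0), 1 + b1 + b2 + b3
    raise ValueError(event)


def evaluate_inning(lineup, leadoff):
    """Return (expected runs, next-inning lead-off probabilities)."""
    states = {(leadoff, 0, (0, 0, 0)): 1.0}
    expected_runs = 0.0
    next_leadoff = [0.0] * 9
    while states:
        new_states = defaultdict(float)
        for (batter, outs, bases), p in states.items():
            if p < THRESHOLD:
                continue  # prune very unlikely sequences
            on_deck = (batter + 1) % 9
            for event, q in lineup[batter].items():
                if event == "out":
                    if outs == 2:
                        next_leadoff[on_deck] += p * q  # inning over
                    else:
                        new_states[(on_deck, outs + 1, bases)] += p * q
                else:
                    new_bases, runs = advance(bases, event)
                    expected_runs += p * q * runs
                    new_states[(on_deck, outs, new_bases)] += p * q
        states = new_states
    return expected_runs, next_leadoff


# Hypothetical identical hitters, purely for illustration
toy_hitter = {"out": 0.68, "walk": 0.08, "single": 0.16,
              "double": 0.05, "triple": 0.005, "hr": 0.025}
toy_runs, toy_next = evaluate_inning([toy_hitter] * 9, leadoff=0)
print(round(toy_runs, 3), [round(p, 3) for p in toy_next])
```

Because of the pruning, the lead-off probabilities sum to very slightly less than one; the missing mass is what was weeded out.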
Completing this analysis for the inning yields the following results:
RUNS      1     2     3     4     5     6     7     8     9
.6046   1.4   0.4   0.2  32.8  26.9  18.9  11.1   5.7   2.6
So this lineup would be expected to score .6046 runs an inning and have the cleanup hitter leading off the next inning 32.8% of the time, the 5th place hitter leading off 26.9% of the time, and so on. Notice that earlier we had determined the probability of a 1-2-3 inning with the cleanup hitter due up first in the following inning was 27.16%. This doesn't match the 32.8% of the time that the fourth hitter leads off the next frame because the 1-2-3 innings do not include cases where a batter reaches base but is removed as part of a double-play.
So in order to analyze a batting order, we need to do this analysis for each of the nine batting order positions leading off an inning. Here are the results for the National League:
ST    RUNS     1     2     3     4     5     6     7     8     9
1    .6046   1.4   0.4   0.2  32.8  26.9  18.9  11.1   5.7   2.6
2    .6107   3.1   1.0   0.4   0.2  31.8  27.7  19.4  11.1   5.3
3    .5806   6.3   2.1   0.9   0.4   0.2  32.2  28.7  19.0  10.2
4    .5128  11.9   4.3   1.9   0.8   0.4   0.2  34.5  28.5  17.6
5    .4518  20.7   8.5   4.0   1.6   0.8   0.4   0.2  36.6  27.3
6    .4146  31.4  16.3   8.2   3.5   1.7   0.8   0.4   0.2  37.5
7    .4110  40.4  28.9  16.2   7.4   3.8   1.9   0.9   0.4   0.1
8    .4499   0.2  38.6  29.3  15.5   8.6   4.4   2.1   0.9   0.4
9    .5148   0.5   0.1  37.8  25.8  17.3   9.9   5.1   2.4   1.0

Where: ST - the lead-off batter in the inning
Hopefully, this chart makes sense. The expected runs is actually highest with the second batter leading off and lowest when the bottom of the order (7-8-9) is due up. Notice that the highest probability in the chart is the 40.4% chance that the first batter in the lineup will be up first in the innings after the 7th-place batter leads off.
The chart for the American League:
ST    RUNS     1     2     3     4     5     6     7     8     9
1    .6104   1.2   0.5   0.2  32.9  27.0  18.8  11.0   5.7   2.8
2    .6109   2.6   1.1   0.5   0.2  32.1  27.8  19.1  10.9   5.7
3    .5932   5.3   2.3   1.1   0.5   0.2  32.8  28.2  18.8  10.8
4    .5468  10.0   4.5   2.2   1.0   0.5   0.2  34.1  28.8  18.7
5    .5198  18.4   9.0   4.6   2.1   1.0   0.5   0.2  35.1  29.2
6    .5041  29.4  16.7   9.2   4.3   2.2   1.1   0.5   0.2  36.5
7    .4994  37.6  27.5  17.4   8.8   4.6   2.3   1.1   0.5   0.2
8    .5316   0.2  35.4  28.5  17.1   9.6   5.1   2.5   1.1   0.5
9    .5766   0.5   0.2  34.6  26.9  17.8  10.7   5.5   2.6   1.2
How does this match up with what we see in real life? Here are the real life charts for the National League:
ST    RUNS     1     2     3     4     5     6     7     8     9
1    .6115   1.3   0.4   0.2  33.3  27.3  18.8  10.9   5.3   2.4
2    .6195   3.1   1.1   0.5   0.3  31.9  27.8  19.4  10.8   5.1
3    .5861   6.4   2.1   0.9   0.4   0.3  32.6  28.5  18.9   9.9
4    .5067  12.0   3.9   1.8   0.7   0.3   0.1  34.9  28.8  17.4
5    .4607  20.6   8.5   4.0   1.5   0.8   0.4   0.2  37.1  27.0
6    .4158  31.7  16.1   7.9   3.5   1.5   0.8   0.4   0.2  37.9
7    .4218  40.9  28.8  16.5   7.1   3.6   1.9   0.8   0.4   0.2
8    .4610   0.2  39.1  29.5  15.3   8.5   4.1   2.0   1.0   0.4
9    .5161   0.5   0.2  38.4  25.9  17.2   9.6   4.8   2.3   1.0
And the AL:
ST    RUNS     1     2     3     4     5     6     7     8     9
1    .6170   1.2   0.5   0.2  33.3  27.4  18.6  10.9   5.2   2.7
2    .6198   2.4   1.0   0.5   0.3  32.2  28.1  19.1  10.8   5.7
3    .6011   5.3   2.2   1.1   0.5   0.3  33.0  28.4  19.1  10.2
4    .5527   9.9   4.3   2.0   0.9   0.5   0.3  34.2  29.0  18.8
5    .5294  18.2   8.6   4.5   2.0   1.0   0.6   0.2  35.5  29.4
6    .5070  29.6  16.7   8.9   4.0   2.0   1.0   0.5   0.3  37.0
7    .4963  38.2  28.1  17.2   8.3   4.4   2.1   1.0   0.5   0.3
8    .5248   0.2  35.8  29.1  16.9   9.1   5.0   2.4   1.0   0.5
9    .5845   0.6   0.3  35.1  27.0  17.6  10.3   5.3   2.6   1.2
Both leagues slightly out-performed the model. National League teams scored 1.06% more runs than predicted (4.5992 to 4.5508) while the American League bettered the model by 0.8% (5.0326 to 4.9928).
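The two percentages follow directly from the run totals quoted in parentheses. A short check, with a helper name of my own:

```python
def pct_over(actual, predicted):
    """Percentage by which the actual figure exceeds the predicted one."""
    return 100 * (actual / predicted - 1)


nl_gap = pct_over(4.5992, 4.5508)
al_gap = pct_over(5.0326, 4.9928)
print(f"NL {nl_gap:.2f}%  AL {al_gap:.2f}%")
```

This gives roughly 1.06% for the NL and 0.80% for the AL.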
Before going much further, we should point out two problems with the model as well as two (possibly faulty) assumptions. The first problem is that we are not including stolen bases, caught stealing, wild pitches, passed balls and so on. My assumption is that the benefit or cost of these events is not overly dependent upon batting order. In other words, batting orders do not materially affect the number of these events. Now there are some lineups where this would not be true. For example, if we bat our best base-stealer eighth in the NL, this could decrease his stolen base attempts as managers opt to sacrifice the runner over rather than risk a caught stealing.
The second problem is that the transitions also contain information about how the runners on ahead of each batter did during his at-bats. Since a third-place hitter generally has faster runners on base during his at-bats than a leadoff hitter, it is not realistic to assume that a batter's transition matrix will be the same if we move him in the batting order. One way of dealing with this is to credit each batter with a generic set of transitions based upon what he did in each situation. In other words, when he hits a single with a man on first and one out, we won't credit him with what actually happened on the play (since the chance of the runner reaching second or third is at least partly due to the speed of the runner), but we will credit him with what typically happens in these situations. There are problems with this approach, however, since you'd almost have to take into account hit locations (since singles to right produce more first-to-third transitions than singles to left) and that data is not available for all of our hits. We will live with the problem for now, but this is something we'll need to keep in mind.
As for the assumptions, the first is that we are assuming batters don't care where they hit in the lineup, that if we took (for example) Barry Bonds and hit him leadoff, he would hit as well as he did hitting cleanup, and wouldn't go into a sulk and have his performance suffer as a result. Baseball players are not simply numbers in a transition matrix and a theoretically great batting order is not going to work if it causes a player revolt. This is probably more of a problem when proposing novel relief pitching strategies, but it's also a potential problem here.
And finally, we are ignoring "batter protection." In other words, when Barry Bonds is up in a particular situation, our model doesn't care who is in the on-deck circle. In general, this might make sense, but there are cases (in particular when dealing in the NL with the batter hitting in front of the pitcher) when this assumption will clearly not be correct.
Despite these problems and assumptions, I think this method could give us some insight into some excellent as well as some poor lineup choices, as well as information on how beneficial or costly some of these choices are.
So for each of our traditional lineups, here are the predicted runs scored in the first eight innings. First the NL:
INN   RUNS     1     2     3     4     5     6     7     8     9
1     .605   1.4   0.4   0.2  32.8  26.9  18.9  11.1   5.7   2.6
2     .465  20.0  12.2   7.7   4.2   2.5   1.5  12.1  19.5  20.3
3     .521   7.7  12.1  15.8  15.9  15.0  12.8   9.8   6.8   4.2
4     .507  14.5  10.0   7.3   6.3   8.0  11.0  13.7  15.0  14.2
5     .502  12.5  12.8  13.5  12.4  11.6  10.3   9.0   8.6   9.3
6     .514  12.3   9.6   9.2   9.2  10.5  11.9  12.8  12.8  11.7
7     .501  13.3  12.2  12.0  10.7  10.3  10.1  10.1  10.4  10.9
8     .512  12.1  10.4  10.4  10.2  11.0  11.6  11.9  11.6  10.8
TOT  4.127
The values under 1 through 9 are the percentages of times that that lineup spot led off the next inning.
And now the AL:
INN   RUNS     1     2     3     4     5     6     7     8     9
1     .610   1.2   0.5   0.2  32.9  27.0  18.8  11.0   5.7   2.8
2     .527  18.0  12.1   8.2   4.8   2.9   1.7  12.1  19.3  21.1
3     .566   7.2  11.4  15.5  16.2  15.0  13.0  10.0   7.1   4.8
4     .552  13.2  10.0   7.9   6.8   8.1  10.9  13.5  14.7  14.8
5     .554  11.4  12.2  13.3  12.8  11.8  10.7   9.3   8.8   9.7
6     .557  11.3   9.6   9.5   9.6  10.4  11.8  12.6  12.7  12.4
7     .552  12.1  11.7  12.0  11.2  10.6  10.4  10.3  10.4  11.3
8     .557  11.2  10.2  10.6  10.5  10.9  11.6  11.9  11.6  11.5
TOT  4.477
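The inning-by-inning figures can be approximately rebuilt from the per-lead-off chart shown earlier: start with the first hitter leading off, take the probability-weighted run expectancy, then push the lead-off distribution forward one inning. The sketch below does this for the NL model chart. It works from the rounded published values (renormalizing each row to sum to one), so it only matches the tables to within rounding; the function name is mine.

```python
# NL model chart: expected runs and next-inning lead-off percentages,
# indexed by the slot leading off (slot 1 is index 0)
RUNS = [.6046, .6107, .5806, .5128, .4518, .4146, .4110, .4499, .5148]
NEXT = [
    [1.4, 0.4, 0.2, 32.8, 26.9, 18.9, 11.1, 5.7, 2.6],
    [3.1, 1.0, 0.4, 0.2, 31.8, 27.7, 19.4, 11.1, 5.3],
    [6.3, 2.1, 0.9, 0.4, 0.2, 32.2, 28.7, 19.0, 10.2],
    [11.9, 4.3, 1.9, 0.8, 0.4, 0.2, 34.5, 28.5, 17.6],
    [20.7, 8.5, 4.0, 1.6, 0.8, 0.4, 0.2, 36.6, 27.3],
    [31.4, 16.3, 8.2, 3.5, 1.7, 0.8, 0.4, 0.2, 37.5],
    [40.4, 28.9, 16.2, 7.4, 3.8, 1.9, 0.9, 0.4, 0.1],
    [0.2, 38.6, 29.3, 15.5, 8.6, 4.4, 2.1, 0.9, 0.4],
    [0.5, 0.1, 37.8, 25.8, 17.3, 9.9, 5.1, 2.4, 1.0],
]


def runs_by_inning(runs, nxt, innings=8):
    """Chain the per-lead-off results through a number of innings."""
    rows = [[v / sum(row) for v in row] for row in nxt]  # undo rounding drift
    dist = [1.0] + [0.0] * 8  # slot 1 leads off the first inning
    per_inning = []
    for _ in range(innings):
        per_inning.append(sum(d * r for d, r in zip(dist, runs)))
        dist = [sum(dist[i] * rows[i][j] for i in range(9)) for j in range(9)]
    return per_inning


nl_innings = runs_by_inning(RUNS, NEXT)
print([round(x, 3) for x in nl_innings], round(sum(nl_innings), 3))
```

The second-inning value comes out near .465 and the eight-inning total near 4.127, in line with the NL table above.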
As I mentioned in the introduction, I wanted to look at all possible lineups for this study. My feeling was that, since all the likely combinations have been tried in simulation, a lineup that was significantly better than our traditional one would almost have to be one we wouldn't ordinarily consider.
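Enumerating every batting order is straightforward; the expensive part is scoring each one. The sketch below shows only the enumeration and ranking. The `score` argument is a placeholder for whatever evaluator is used (here, a made-up toy function on a four-man lineup so the example runs instantly), not the Markov evaluation from this article.

```python
from itertools import permutations
from math import factorial


def rank_lineups(score, slots):
    """Score every ordering of `slots` and sort from best to worst."""
    scored = [(score(order), order) for order in permutations(slots)]
    scored.sort(reverse=True)
    return scored


def toy_score(order):
    """Hypothetical stand-in: rewards putting higher-numbered slots early."""
    return sum(slot / (pos + 1) for pos, slot in enumerate(order))


ranked = rank_lineups(toy_score, slots=range(1, 5))
print(len(ranked), ranked[0][1], ranked[-1][1])
print(factorial(9))  # number of nine-man batting orders
```

With nine slots the same loop visits all 362,880 orderings.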
So I ran the tests on the lineups mentioned above for the NL and AL from 1993 to 2004. Let's look at the NL first. As you will recall from the previous section, the traditional lineup, when evaluated using our method, produced 4.127 runs over 8 innings. When I looked at all possible combinations, the lineups went from a low of 3.967 runs to a high of 4.143. So the range was rather small. Here are how some of our likely and unlikely candidates ranked:
RUNS   ----- LINEUP ----    RANK
4.127  1 2 3 4 5 6 7 8 9     216  the traditional one
4.125  3 4 1 5 2 6 8 7 9     324  sorted by OBP, highest to lowest
4.109  3 4 5 6 2 1 7 8 9    4757  sorted by OPS, highest to lowest
3.999  9 8 7 6 5 4 3 2 1  339298  reverse traditional
3.983  9 7 8 6 2 5 1 4 3  360764  sorted by OBP, lowest to highest
3.984  9 8 7 1 2 6 5 4 3  360269  sorted by OPS, lowest to highest
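The OPS-sorted orders in the table above can be reproduced from the OBP and SLG columns of the NL composite lineup. A small sketch, using the rounded published values (so it assumes no ties are hidden by rounding):

```python
# (OBP, SLG) for each NL lineup slot, from the composite table
nl = {
    1: (.341, .393), 2: (.339, .408), 3: (.378, .505),
    4: (.366, .502), 5: (.344, .460), 6: (.333, .435),
    7: (.317, .396), 8: (.322, .368), 9: (.234, .253),
}

ops_high_to_low = sorted(nl, key=lambda slot: sum(nl[slot]), reverse=True)
ops_low_to_high = sorted(nl, key=lambda slot: sum(nl[slot]))
print(ops_high_to_low)
print(ops_low_to_high)
```

The two lists come out as 3 4 5 6 2 1 7 8 9 and 9 8 7 1 2 6 5 4 3, the same orders listed in the table.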
Nothing too surprising here. The most obviously good ones are pretty near the top and the perversely bad ones are near the bottom. So what were the best and worst lineups in the National League? Here are the 10 best:
RUNS   ----- LINEUP ----
4.143  1 3 2 5 4 6 7 9 8
4.142  1 3 2 5 4 6 8 7 9
4.142  1 3 4 2 5 6 7 9 8
4.142  1 3 4 5 2 6 7 9 8
4.140  1 3 4 2 5 6 8 7 9
4.140  1 3 4 5 2 6 8 7 9
4.139  1 3 2 5 4 6 7 8 9
4.138  1 2 4 3 5 6 8 7 9
4.138  1 2 5 3 4 6 8 7 9
4.138  1 3 4 5 6 2 7 9 8
Notice that these lineups are very similar to each other, and very similar to the typical ordering. The top one has no hitter more than one position removed from his "normal" place.
The ten worst:
RUNS   ----- LINEUP ----
3.967  8 9 2 6 1 7 5 4 3
3.967  7 8 1 9 6 5 2 4 3
3.967  7 8 1 9 2 5 6 4 3
3.967  2 8 1 9 6 7 5 4 3
3.968  8 9 2 6 1 7 4 5 3
3.968  8 9 2 1 7 5 6 4 3
3.968  8 9 2 1 6 7 5 4 3
3.968  8 7 1 9 6 5 2 4 3
3.968  8 7 1 9 2 6 5 4 3
3.968  8 7 1 9 2 5 6 4 3
What about the AL? Let's start once more with some likely candidates:
RUNS   ----- LINEUP ----    RANK
4.477  1 2 3 4 5 6 7 8 9     516  the traditional one
4.466  3 4 5 1 2 6 7 8 9    8653  sorted by OBP, highest to lowest
4.476  4 3 5 6 2 1 7 8 9     730  sorted by OPS, highest to lowest
4.408  9 8 7 6 5 4 3 2 1  340055  reverse traditional
4.398  9 8 7 6 2 1 5 4 3  359066  sorted by OBP, lowest to highest
4.390  9 8 7 1 2 6 5 3 4  362546  sorted by OPS, lowest to highest
Very similar results here. The best lineups:
RUNS   ----- LINEUP ----
4.488  5 2 4 3 1 6 7 8 9
4.488  5 2 4 3 1 7 6 8 9
4.487  5 2 4 3 1 6 8 7 9
4.486  1 2 4 3 5 7 6 8 9
4.486  5 1 4 3 6 2 7 8 9
4.485  1 2 4 3 5 6 7 8 9
4.485  1 2 4 3 5 6 8 7 9
4.485  5 1 4 3 2 7 6 8 9
4.485  5 1 4 3 6 2 8 7 9
4.485  5 1 4 3 6 7 2 8 9
There are some weird ones here, but also some that are extremely similar to the traditional one.
And the worst:
RUNS   ----- LINEUP ----
4.380  2 9 1 7 8 6 5 4 3
4.380  2 9 1 7 8 6 5 3 4
4.381  9 8 1 7 2 6 5 4 3
4.381  2 9 1 7 8 6 3 5 4
4.381  2 9 1 7 8 5 6 4 3
4.381  2 9 1 7 8 5 6 3 4
4.382  9 8 1 7 2 6 5 3 4
4.382  2 9 1 7 8 6 4 5 3
4.382  2 8 1 9 7 6 5 4 3
4.382  2 8 1 7 9 6 5 4 3
Note that most of the "worst" lineups feature hitters 3 through 5 at the bottom of the order. This makes sense since they are the best hitters.
I must confess that this wasn't what I hoped to find when I began this investigation. I suppose the best-case scenario would have been to find a handful of counter-intuitive lineups that were significantly better than the traditional ones. As it was, the best lineup in the NL scored only 4.4% more runs than the worst, and in the AL the range was even narrower, as the best lineup scored only 2.4% more runs than the worst. And the difference between the best and the traditional lineup is negligible: in the NL it amounted to 0.38% more runs (or about 3 runs a season) and in the AL it was 0.24% more runs. These results seem to agree with the long-held belief that the ordering makes little difference.
In addition to the traditional model, one of the things I wanted to look at was the impact of right- and left-handed pitchers on our lineup. Let's start by taking another look at the composite lineups in the NL and AL from 1993 to 2004, this time broken down by the handedness of the hitters.
NL right-handed hitters:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    46840  12652  2407   373   955   4319   122   7278   645   390   259  3167  1240  .338  .399
2    64394  17773  3393   429  1630   5560    72  10458   752  1010   437  1616   584  .339  .418
3    50931  14881  2873   240  2565   6241   593   9282   603    81   539  1460   549  .373  .509
4    64521  18433  3627   282  3245   6801   734  12377   751    38   691  1008   465  .357  .502
5    59166  16199  3214   289  2441   5334   379  11043   653   168   571  1128   588  .338  .462
6    65773  17499  3492   360  2369   5550   363  12922   745   254   623  1205   643  .327  .438
7    72331  18420  3644   361  2030   5813   384  14297   756   451   626   866   520  .314  .399
8    68274  16978  3309   367  1411   6314  1414  13121   734   674   554   577   294  .317  .370
9    60365  10101  1765   124   574   3268    89  20974   343  5429   284   244   123  .213  .229
NL left-handed hitters:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    42937  11891  1997   517   688   4147   161   6205   394   349   232  1305   527  .344  .396
2    26394   7212  1340   245   554   2536    47   4245   199   288   148   960   419  .340  .406
3    46306  13773  2903   306  2161   6781   772   7648   475    70   510   706   249  .389  .513
4    32010   9166  1974   156  1606   4796   735   6234   301    23   351   507   231  .381  .508
5    37247  10185  2174   218  1516   4567   491   7241   239    63   309   377   206  .354  .466
6    28371   7601  1470   155   974   2991   371   5578   205    97   229   279   154  .340  .434
7    19773   5022   917   145   501   1993   297   3992   151   118   144   226   102  .325  .391
8    14814   3714   698   112   292   1734   406   2858   129   120   150   310   147  .331  .372
9    26402   5224   958    85   367   2143   130   7631   102  1651   149   150    93  .259  .282
NL switch-hitters:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    30557   8284  1427   299   432   3151    80   4917   240   279   171  1059   443  .342  .380
2    26644   7327  1256   223   421   2467    38   3865   178   524   165   344   148  .339  .386
3    14074   4033   780    59   527   1711   145   2221    64    42   139   319   104  .363  .463
4    13302   3723   754    55   623   1882   224   2615    97     9   145   199   107  .370  .485
5    12747   3460   723    51   422   1438   154   2454   111    31   136   175   124  .347  .435
6    13008   3467   699    85   371   1451   143   2522   123    70   124   199   120  .343  .419
7    12581   3117   639    77   295   1328   110   2430   126    93   111   152    91  .323  .381
8    17192   4352   765   142   245   1994   397   3025   137   163   138   147    68  .333  .357
9     8437   1862   353    38   155    865    43   2178    63   232    56   110    57  .296  .327
AL right-handed hitters:
AL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    39789  10867  2082   282   862   3864    86   6278   560   349   270  2312   800  .344  .405
2    49623  13668  2646   287  1254   4420    68   8086   583   661   407  1536   525  .339  .416
3    46666  13550  2658   203  2187   5480   475   8209   529    72   564  1064   336  .367  .497
4    57844  16456  3360   184  3046   6923   657  11089   660    32   677   493   259  .364  .507
5    46617  12607  2570   160  1905   4601   285   9057   501    86   475   550   339  .339  .455
6    52233  13855  2814   203  1909   4618   248  10343   574   188   428   698   411  .329  .437
7    52517  13526  2676   218  1636   4342   200  10095   604   323   479   775   464  .319  .410
8    59530  15058  2993   246  1477   4397   133  11072   665   681   496   676   431  .309  .386
9    61231  15382  2997   318  1108   4195    32  11012   637  1163   494   675   369  .304  .365
AL left-handed hitters:
AL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    38944  11040  1892   407   817   3761   219   5426   351   314   294  1163   481  .350  .416
2    25900   7264  1408   193   675   2659    84   3962   215   237   203  1034   407  .350  .428
3    43843  12696  2552   180  1936   5738   561   7363   438    54   507   604   230  .374  .488
4    31711   8903  1849    93  1588   4281   561   5889   293    16   313   367   170  .368  .495
5    40219  11143  2377   151  1708   4898   564   7058   300    57   394   356   172  .357  .471
6    32035   8760  1755   157  1176   3450   345   5995   252   102   251   273   172  .346  .448
7    27065   7249  1348   175   853   2822   301   5015   243   133   244   218   122  .340  .425
8    19319   5083  1020    96   525   1849   184   3786   159   143   163   227   138  .330  .407
9    13986   3561   654    97   238   1131    47   2368   118   207   125   472   285  .313  .366
AL switch-hitters:
AL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    32319   8801  1588   329   553   3585   121   5225   191   328   228  1323   495  .346  .393
2    32863   8954  1631   282   609   3382    51   5094   221   549   314   388   155  .341  .395
3    13617   3955   748   112   446   1463   111   2227   105    65   171   188    85  .360  .460
4    12098   3437   671    52   522   1531   201   2315    82    12   129   120    77  .365  .478
5    13785   3764   736    61   512   1677   166   2693   123    29   118   167   105  .354  .447
6    14864   3946   758    80   522   1781   166   2982   121    68   113   213   115  .346  .433
7    17109   4344   915    90   455   1840   129   3268   133   106   162   232   131  .328  .398
8    15692   4016   766    94   356   1632    96   2899   115   158   136   245   183  .328  .385
9    16709   4155   747   124   234   1483    27   3125   133   312   134   481   271  .313  .350
The pattern in both leagues is similar. Lefties appear most frequently in the first, third and fifth slots, while switch-hitters are most likely to bat first or second. One of the reasons for alternating righty and lefty hitters is to minimize the effectiveness of short relievers in late innings, since these pitchers will often lose the platoon advantage after one batter.
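The claim about where lefties and switch-hitters bat can be checked against the at-bat columns of the three NL handedness tables. The sketch below computes each group's share of at-bats by lineup slot; the list names are mine, and at-bats are used as a stand-in for playing time.

```python
# NL at-bats by lineup slot (slots 1-9), from the three handedness tables
ab_right = [46840, 64394, 50931, 64521, 59166, 65773, 72331, 68274, 60365]
ab_left = [42937, 26394, 46306, 32010, 37247, 28371, 19773, 14814, 26402]
ab_switch = [30557, 26644, 14074, 13302, 12747, 13008, 12581, 17192, 8437]

totals = [r + l + s for r, l, s in zip(ab_right, ab_left, ab_switch)]
left_share = {slot + 1: ab_left[slot] / totals[slot] for slot in range(9)}
switch_share = {slot + 1: ab_switch[slot] / totals[slot] for slot in range(9)}

top_left_slots = sorted(left_share, key=left_share.get, reverse=True)[:3]
top_switch_slots = sorted(switch_share, key=switch_share.get, reverse=True)[:2]
print(sorted(top_left_slots), sorted(top_switch_slots))
```

For the NL, the three slots with the largest left-handed share are 1, 3 and 5, and the two with the largest switch-hitter share are 1 and 2.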
What I would like to model is the traditional lineup in both leagues, with lefties batting first, third and fifth and right-handed hitters in the other slots. I want to look at the best and worst lineups against both right- and left-handed pitching. But before I do that, here are a bunch more charts:
NL right-handed hitter vs right-handed pitchers:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    33460   8980  1671   280   617   2916    52   5261   513   298   192  2364   845  .335  .390
2    45990  12527  2338   300  1106   3704    24   7667   633   753   312  1288   412  .333  .408
3    37770  10962  2118   174  1910   4326   365   7078   501    59   395  1126   391  .367  .507
4    46943  13231  2556   196  2300   4487   363   9315   636    31   519   751   322  .349  .492
5    41441  11210  2216   205  1688   3442   174   8023   558   144   404   866   410  .332  .456
6    47974  12630  2473   263  1674   3706   206   9704   626   204   472   944   483  .321  .430
7    54249  13622  2630   255  1497   4102   222  10976   642   359   483   708   389  .309  .392
8    51283  12607  2383   284  1027   4498   997  10056   627   541   400   449   218  .312  .363
9    43936   7180  1234    82   371   2140    19  15367   249  4021   178   205   102  .206  .221
NL right-handed hitter vs left-handed pitchers:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    13380   3672   736    93   338   1403    70   2017   132    92    67   803   395  .348  .419
2    18404   5246  1055   129   524   1856    48   2791   119   257   125   328   172  .352  .442
3    13161   3919   755    66   655   1915   228   2204   102    22   144   334   158  .387  .514
4    17578   5202  1071    86   945   2314   371   3062   115     7   172   257   143  .378  .528
5    17725   4989   998    84   753   1892   205   3020    95    24   167   262   178  .351  .475
6    17799   4869  1019    97   695   1844   157   3218   119    50   151   261   160  .343  .459
7    18082   4798  1014   106   533   1711   162   3321   114    92   143   158   131  .330  .422
8    16991   4371   926    83   384   1816   417   3065   107   133   154   128    76  .330  .389
9    16429   2921   531    42   203   1128    70   5607    94  1408   106    39    21  .233  .252
NL left-handed hitter vs right-handed pitchers:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    33686   9398  1622   419   587   3304   154   4754   243   235   171  1027   425  .346  .404
2    21158   5873  1107   199   470   2048    45   3301   134   200   118   742   322  .343  .415
3    33467  10152  2175   234  1673   5227   703   5296   263    36   355   563   185  .398  .532
4    24415   7147  1594   118  1283   3833   674   4509   183    10   268   401   181  .389  .525
5    30041   8349  1820   176  1267   3732   474   5606   153    35   240   308   157  .358  .477
6    22945   6293  1224   133   828   2501   369   4336   136    68   175   227   124  .347  .447
7    16077   4116   759   116   427   1714   290   3131   106    71   112   186    81  .330  .397
8    12096   3102   585    89   258   1510   372   2282    97    92   131   260   126  .340  .384
9    21506   4419   840    76   330   1843   127   5954    82  1226   125   104    72  .269  .298
NL left-handed hitter vs left-handed pitchers:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1     9251   2493   375    98   101    843     7   1451   151   114    61   278   102  .338  .364
2     5236   1339   233    46    84    488     2    944    65    88    30   218    97  .325  .366
3    12839   3621   728    72   488   1554    69   2352   212    34   155   143    64  .365  .464
4     7595   2019   380    38   323    963    61   1725   118    13    83   106    50  .354  .453
5     7206   1836   354    42   249    835    17   1635    86    28    69    69    49  .336  .419
6     5426   1308   246    22   146    490     2   1242    69    29    54    52    30  .309  .375
7     3696    906   158    29    74    279     7    861    45    47    32    40    21  .304  .364
8     2718    612   113    23    34    224    34    576    32    28    19    50    21  .290  .321
9     4896    805   118     9    37    300     3   1677    20   425    24    46    21  .215  .215
NL switch-hitters vs right-handed pitchers:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    22316   6018   980   225   292   2369    67   3650   184   203   136   850   333  .343  .373
2    20040   5497   927   182   309   1859    30   2872   151   367   122   257   108  .339  .385
3    10255   2963   575    44   408   1300   119   1608    49    31   103   242    72  .368  .473
4     9854   2754   575    36   481   1438   164   1999    69     5   112   151    68  .371  .492
5     9618   2599   544    40   317   1138   128   1882    75    19   120   146    90  .348  .434
6     9938   2661   528    74   290   1138   118   1924    95    39    88   169    90  .346  .423
7     9460   2368   487    64   223   1018    89   1816    97    69    78   119    67  .327  .386
8    13097   3331   572   121   184   1536   318   2310   101   120   115   115    45  .335  .359
9     6461   1429   279    30   125    690    35   1662    52   159    45    86    42  .300  .332
NL switch-hitters vs left-handed pitchers:
NL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1     8241   2266   447    74   140    782    13   1267    56    76    35   209   110  .341  .398
2     6604   1830   329    41   112    608     8    993    27   157    43    87    40  .339  .390
3     3819   1070   205    15   119    411    26    613    15    11    36    77    32  .349  .435
4     3448    969   179    19   142    444    60    616    28     4    33    48    39  .365  .468
5     3129    861   179    11   105    300    26    572    36    12    16    29    34  .344  .440
6     3070    806   171    11    81    313    25    598    28    31    36    30    30  .333  .405
7     3121    749   152    13    72    310    21    614    29    24    33    33    24  .311  .366
8     4095   1021   193    21    61    458    79    715    36    43    23    32    23  .328  .351
9     1976    433    74     8    30    175     8    516    11    73    11    24    15  .285  .310
AL right-handed hitter vs right-handed pitchers:
AL      AB      H    2B    3B    HR     BB   IBB     SO   HBP    SH    SF    SB    CS   OBP   SLG
1    27339   7441  1371   198   567   2594    48   4382   441   273   204  1711   546  .343  .399
2    34110   9289  1781   195   831   2879    17   5776   473   465   293  1150   348  .335  .409
3    33523   9694  1872   145  1520   3658   255   5998   441    55   414   796   218  .363  .490
4    41605  11606  2372   143  2146   4566   299   8180   543    27   482   342   166  .354  .498
5    31383   8383  1682   107  1240   2852    99   6223   395    61   341   419   241  .333  .446
6    36552   9555  1932   142  1322   3057   102   7475   470   142   318   509   283  .324  .431
7    37251   9409  1817   159  1114   2871    90   7389   501   250   358   591   322  .312  .400
8    43071  10736  2113   165  1024   3069    67   8138   555   537   350   534   298  .305  .377
9    44837  11168  2146   220   782   2945    11   8285   535   916   346   508   240  .301  .359
AL right-handed hitter vs left-handed pitchers:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1  12450  3426  711  84  295 1270  38  1896 119   76  66  601 254 .346 .417
2  15513  4379  865  92  423 1541  51  2310 110  196 114  386 177 .349 .432
3  13143  3856  786  58  667 1822 220  2211  88   17 150  268 118 .379 .514
4  16239  4850  988  41  900 2357 358  2909 117    5 195  151  93 .387 .531
5  15234  4224  888  53  665 1749 186  2834 106   25 134  131  98 .353 .473
6  15681  4300  882  61  587 1561 146  2868 104   46 110  189 128 .342 .451
7  15266  4117  859  59  522 1471 110  2706 103   73 121  184 142 .336 .436
8  16459  4322  880  81  453 1328  66  2934 110  144 146  142 133 .319 .408
9  16394  4214  851  98  326 1250  21  2727 102  247 148  167 129 .311 .381
AL left-handed hitter vs right-handed pitchers:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1  29762  8553 1493 338  682 2889 205  3885 242  195 214  925 343 .353 .429
2  20295  5817 1153 158  578 2143  81  2952 140  154 155  774 293 .356 .444
3  31387  9343 1932 146 1494 4398 539  4890 229   25 353  464 168 .384 .511
4  23468  6711 1384  75 1248 3392 528  4116 183    4 228  275 126 .377 .511
5  31119  8740 1916 124 1388 4032 551  5151 180   30 293  280 121 .364 .484
6  24744  6842 1380 128  957 2758 336  4392 142   57 186  207 129 .350 .459
7  21333  5827 1089 144  702 2302 297  3786 149   77 193  164  94 .345 .436
8  15447  4145  851  77  457 1504 184  2877  98   94 126  165  99 .335 .422
9  11090  2864  545  81  196  932  44  1796  78  133 106  342 188 .317 .375
AL left-handed hitter vs left-handed pitchers:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1   9182  2487  399  69  135  872  14  1541 109  119  80  238 138 .339 .373
2   5605  1447  255  35   97  516   3  1010  75   83  48  260 114 .326 .368
3  12456  3353  620  34  442 1340  22  2473 209   29 154  140  62 .346 .431
4   8243  2192  465  18  340  889  33  1773 110   12  85   92  44 .342 .450
5   9100  2403  461  27  320  866  13  1907 120   27 101   76  51 .333 .426
6   7291  1918  375  29  219  692   9  1603 110   45  65   66  43 .333 .413
7   5732  1422  259  31  151  520   4  1229  94   56  51   54  28 .318 .383
8   3872   938  169  19   68  345   0   909  61   49  37   62  39 .311 .348
9   2896   697  109  16   42  199   3   572  40   74  19  130  97 .297 .333
AL switch-hitters vs right-handed pitchers:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1  23600  6444 1146 240  409 2672  97  3896 135  226 173 1027 375 .348 .394
2  23964  6611 1161 223  460 2578  40  3679 158  370 223  277 104 .347 .401
3   9672  2816  541  89  319 1036  78  1542  70   40 135  139  63 .359 .464
4   8519  2406  470  43  359 1107 131  1634  54    6  85   94  54 .365 .474
5   9900  2708  531  47  393 1231 118  1970  89   21  86  122  77 .356 .456
6  10925  2937  543  69  410 1333 123  2248  97   50  86  174  87 .351 .444
7  12622  3202  681  67  350 1421 111  2431 102   55 120  186  93 .331 .401
8  11704  2970  565  80  259 1291  78  2185  91  111 101  187 118 .330 .382
9  12389  3089  545 107  172 1126  22  2376 102  214  97  357 194 .315 .352
AL switch-hitters vs left-handed pitchers:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1   8719  2357  442  89  144  913  24  1329  56  102  55  296 120 .341 .391
2   8899  2343  470  59  149  804  11  1415  63  179  91  111  51 .326 .380
3   3945  1139  207  23  127  427  33   685  35   25  36   49  22 .360 .449
4   3579  1031  201   9  163  424  70   681  28    6  44   26  23 .364 .486
5   3885  1056  205  14  119  446  48   723  34    8  32   45  28 .349 .424
6   3939  1009  215  11  112  448  43   734  24   18  27   39  28 .334 .402
7   4487  1142  234  23  105  419  18   837  31   51  42   46  38 .320 .387
8   3988  1046  201  14   97  341  18   714  24   47  35   58  65 .322 .393
9   4320  1066  202  17   62  357   5   749  31   98  37  124  77 .306 .344
The switch-hitting charts aren't really needed for what I'll be doing but I wanted to include them for completeness.
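For readers who want to check the tables, the OBP and SLG columns can be reproduced from the counting stats. The sketch below assumes the standard definitions, OBP = (H + BB + HBP) / (AB + BB + HBP + SF) and SLG = total bases / AB; the article does not state its formulas, so treat this as a consistency check rather than the author's code. The function names are my own.

```python
def obp(ab, h, bb, hbp, sf):
    """On-base percentage under the standard definition (assumed, not stated in the article)."""
    return (h + bb + hbp) / (ab + bb + hbp + sf)


def slg(ab, h, doubles, triples, hr):
    """Slugging percentage: total bases divided by at-bats."""
    # H already counts every hit once, so extra-base hits add 1, 2 and 3 more bases.
    total_bases = h + doubles + 2 * triples + 3 * hr
    return total_bases / ab


# Slot 1 of the "NL switch-hitters vs right-handed pitchers" table.
row_obp = obp(ab=22316, h=6018, bb=2369, hbp=184, sf=136)
row_slg = slg(ab=22316, h=6018, doubles=980, triples=225, hr=292)
print(f"OBP {row_obp:.3f}  SLG {row_slg:.3f}")
```

For this row the computed values round to the .343 and .373 printed in the table.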
So given the handedness of my lineup (first, third and fifth lefties and the rest righties), here's how my lineup will hit against right and left-handed pitchers.
NL versus righties:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1  33686  9398 1622 419  587 3304 154  4754 243  235 171 1027 425 .346 .404
2  45990 12527 2338 300 1106 3704  24  7667 633  753 312 1288 412 .333 .408
3  33467 10152 2175 234 1673 5227 703  5296 263   36 355  563 185 .398 .532
4  46943 13231 2556 196 2300 4487 363  9315 636   31 519  751 322 .349 .492
5  30041  8349 1820 176 1267 3732 474  5606 153   35 240  308 157 .358 .477
6  47974 12630 2473 263 1674 3706 206  9704 626  204 472  944 483 .321 .430
7  54249 13622 2630 255 1497 4102 222 10976 642  359 483  708 389 .309 .392
8  51283 12607 2383 284 1027 4498 997 10056 627  541 400  449 218 .312 .363
9  43936  7180 1234  82  371 2140  19 15367 249 4021 178  205 102 .206 .221
NL versus lefties:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1   9251  2493  375  98  101  843   7  1451 151  114  61  278 102 .338 .364
2  18404  5246 1055 129  524 1856  48  2791 119  257 125  328 172 .352 .442
3  12839  3621  728  72  488 1554  69  2352 212   34 155  143  64 .365 .464
4  17578  5202 1071  86  945 2314 371  3062 115    7 172  257 143 .378 .528
5   7206  1836  354  42  249  835  17  1635  86   28  69   69  49 .336 .419
6  17799  4869 1019  97  695 1844 157  3218 119   50 151  261 160 .343 .459
7  18082  4798 1014 106  533 1711 162  3321 114   92 143  158 131 .330 .422
8  16991  4371  926  83  384 1816 417  3065 107  133 154  128  76 .330 .389
9  16429  2921  531  42  203 1128  70  5607  94 1408 106   39  21 .233 .252
AL versus righties:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1  29762  8553 1493 338  682 2889 205  3885 242  195 214  925 343 .353 .429
2  34110  9289 1781 195  831 2879  17  5776 473  465 293 1150 348 .335 .409
3  31387  9343 1932 146 1494 4398 539  4890 229   25 353  464 168 .384 .511
4  41605 11606 2372 143 2146 4566 299  8180 543   27 482  342 166 .354 .498
5  31119  8740 1916 124 1388 4032 551  5151 180   30 293  280 121 .364 .484
6  36552  9555 1932 142 1322 3057 102  7475 470  142 318  509 283 .324 .431
7  37251  9409 1817 159 1114 2871  90  7389 501  250 358  591 322 .312 .400
8  43071 10736 2113 165 1024 3069  67  8138 555  537 350  534 298 .305 .377
9  44837 11168 2146 220  782 2945  11  8285 535  916 346  508 240 .301 .359
AL versus lefties:
PL    AB     H   2B  3B   HR   BB IBB    SO HBP   SH  SF   SB  CS  OBP  SLG
1   9182  2487  399  69  135  872  14  1541 109  119  80  238 138 .339 .373
2  15513  4379  865  92  423 1541  51  2310 110  196 114  386 177 .349 .432
3  12456  3353  620  34  442 1340  22  2473 209   29 154  140  62 .346 .431
4  16239  4850  988  41  900 2357 358  2909 117    5 195  151  93 .387 .531
5   9100  2403  461  27  320  866  13  1907 120   27 101   76  51 .333 .426
6  15681  4300  882  61  587 1561 146  2868 104   46 110  189 128 .342 .451
7  15266  4117  859  59  522 1471 110  2706 103   73 121  184 142 .336 .436
8  16459  4322  880  81  453 1328  66  2934 110  144 146  142 133 .319 .408
9  16394  4214  851  98  326 1250  21  2727 102  247 148  167 129 .311 .381
Now, my model is designed to predict scoring over an eight-inning stretch and, while it won't give us much insight into how each lineup will perform against a parade of short relievers in the late innings of a game, it could give us an idea whether or not such a lineup is designed to work well during the earlier innings of a game.
The best lineups against right-handed pitchers in the National League (I've added the traditional one at the end):
RUNS   ----- LINEUP ----  V/L
4.103  1 3 5 4 2 6 7 9 8  4.198
4.100  1 3 5 2 4 6 7 9 8  4.228
4.099  1 3 5 4 2 6 8 7 9  4.194
4.098  1 3 5 4 2 6 9 8 7  4.193
4.097  1 2 5 3 4 6 7 9 8  4.228
4.097  1 3 5 4 2 7 6 9 8  4.197
4.096  1 2 5 3 4 6 8 7 9  4.227
4.096  1 3 4 5 2 6 7 9 8  4.205
4.096  2 3 1 5 4 6 7 9 8  4.210
4.095  1 3 2 5 4 6 7 9 8  4.230
4.077  1 2 3 4 5 6 7 8 9  4.195
Where: V/L - how the same lineup did against left-handed pitchers.
Notice that the very best lineups bunch all the lefties at the top of the order. Note also that all of these lineups actually hit lefties better than righties. That's because we have more right-handed hitters in the lineup.
The worst in the NL:
RUNS   ----- LINEUP ----
3.861  8 7 2 1 9 5 6 4 3
3.863  8 7 2 1 9 5 4 6 3
3.863  8 7 2 1 9 4 5 6 3
3.863  2 8 7 1 9 5 6 4 3
3.864  8 9 5 1 7 2 6 4 3
3.865  8 9 5 1 7 6 2 4 3
3.865  8 9 4 1 7 2 6 5 3
3.865  8 7 2 1 9 6 5 4 3
3.865  8 7 2 1 9 4 6 5 3
3.865  2 9 4 1 8 7 6 5 3
The best in the AL (again, with the traditional one at the end):
RUNS   ----- LINEUP ----  V/L
4.463  4 3 2 5 1 6 8 7 9  4.506
4.462  4 3 2 5 1 6 7 8 9  4.505
4.461  4 1 2 5 3 6 7 8 9  4.516
4.461  4 1 2 5 3 6 8 7 9  4.517
4.461  4 2 5 3 1 6 8 7 9  4.503
4.460  4 2 5 1 3 6 7 8 9  4.512
4.460  4 2 5 1 3 6 8 7 9  4.514
4.460  4 2 5 3 1 6 7 8 9  4.501
4.460  4 3 2 5 1 6 7 9 8  4.503
4.460  4 3 2 5 1 6 8 9 7  4.505
4.441  1 2 3 4 5 6 7 8 9  4.499
I found it strange that all of these lineups had the cleanup hitter leading off and several of them bunched the lefties together, but this time in the middle of the order.
And the worst in the AL:
RUNS   ----- LINEUP ----
4.323  9 8 1 7 6 5 4 2 3
4.323  9 8 1 7 6 2 3 4 5
4.323  9 8 1 6 2 7 5 4 3
4.323  8 9 1 7 6 5 4 2 3
4.323  6 9 1 7 8 5 4 2 3
4.323  2 9 7 6 8 1 4 3 5
4.324  9 8 1 7 6 4 2 3 5
4.324  9 8 1 7 6 2 5 4 3
4.324  9 8 1 7 6 2 4 3 5
4.324  9 8 1 6 2 7 4 3 5
There is more of a difference between the very best and worst lineups this time around, but these values are still in a rather narrow band, and the best lineup against right-handed pitchers is still not noticeably better than the traditional one.
The best lineups in the NL against lefties (with the traditional one at the end):
RUNS   ----- LINEUP ----  V/R
4.252  8 4 6 2 3 5 1 7 9  4.037
4.251  6 1 2 3 4 5 8 7 9  4.062
4.251  6 1 2 3 4 5 8 9 7  4.050
4.251  7 1 6 3 4 2 5 8 9  4.037
4.251  8 4 6 1 3 2 5 7 9  4.030
4.249  6 1 2 3 4 8 5 9 7  4.038
4.249  6 1 3 2 4 5 8 9 7  4.039
4.249  7 1 6 2 3 4 5 8 9  4.040
4.248  6 1 3 2 4 5 8 7 9  4.051
4.248  7 1 2 3 4 6 5 8 9  4.052
4.195  1 2 3 4 5 6 7 8 9  4.077
Where: V/R - how the same lineup did against right-handed pitchers.
Notice how the best lineup bunches all the lefty hitters together toward the bottom of the order, although this is not true of the others on the list.
The worst:
RUNS   ----- LINEUP ----
4.016  5 8 1 9 2 3 7 4 6
4.019  5 8 1 9 2 3 7 6 4
4.020  5 8 1 9 2 7 4 6 3
4.024  5 8 1 9 2 7 3 6 4
4.025  5 8 1 9 2 7 4 3 6
4.025  5 8 1 9 2 6 7 4 3
4.026  5 8 1 9 2 3 4 7 6
4.027  3 8 1 9 2 7 4 6 5
4.028  5 8 1 9 2 7 3 4 6
4.028  5 8 1 9 2 3 4 6 7
And the same for the AL. The best:
RUNS   ----- LINEUP ----  V/R
4.546  6 7 1 4 3 2 5 9 8  4.415
4.545  6 7 1 4 3 2 5 8 9  4.416
4.545  6 7 2 4 3 1 5 9 8  4.420
4.544  6 7 2 4 3 1 5 8 9  4.421
4.543  6 7 1 4 2 3 5 8 9  4.400
4.543  6 7 1 4 2 3 5 9 8  4.398
4.541  6 7 5 4 3 1 2 9 8  4.428
4.540  6 7 1 4 3 2 9 5 8  4.407
4.540  6 7 2 4 3 1 9 5 8  4.410
4.539  5 7 1 4 3 2 6 8 9  4.435
4.499  1 2 3 4 5 6 7 8 9  4.441
Again, we see a few of the lineups clustering the lefties toward the bottom of the order. Notice that in both the NL and AL, the difference between the best and the normal lineup is about .05 runs, which is much greater than the gaps we've seen before. Still, since that amounts to about a run every three weeks, it's certainly not a dramatic difference.
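The "run every three weeks" figure is easy to reproduce. The sketch below takes the best and traditional runs-per-game values against lefties from the tables above and converts each gap into games per extra run. The 6.5 games per week is my own assumption about a typical schedule, and the calculation treats every game as if it were played against a left-handed pitcher, so it is an upper bound on the real effect.

```python
GAMES_PER_WEEK = 6.5  # assumption: a typical major-league schedule


def games_per_extra_run(best, baseline):
    """Number of games needed for a per-game edge to add up to one run."""
    return 1.0 / (best - baseline)


# (best lineup, traditional lineup) runs per game against lefties, from the tables.
against_lefties = {"NL": (4.252, 4.195), "AL": (4.546, 4.499)}

for league, (best, traditional) in against_lefties.items():
    games = games_per_extra_run(best, traditional)
    weeks = games / GAMES_PER_WEEK
    print(f"{league}: gap {best - traditional:.3f} runs, "
          f"one run every {games:.1f} games (~{weeks:.1f} weeks)")
```

Both leagues come out at roughly 18 to 21 games per extra run, which is where the three-week estimate comes from.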
The worst lineups:
RUNS   ----- LINEUP ----
4.404  9 1 8 7 5 6 3 2 4
4.405  3 2 8 9 1 6 7 5 4
4.408  9 1 8 7 6 5 3 2 4
4.408  7 5 8 9 1 6 3 2 4
4.408  3 1 8 9 5 2 7 4 6
4.409  9 2 8 3 1 6 7 5 4
4.409  3 2 8 9 1 7 5 4 6
4.409  3 2 8 9 1 5 7 4 6
4.410  9 1 8 7 5 3 2 6 4
4.410  7 1 8 9 5 6 3 2 4
I wanted to look at one last thing in this article and that had to do with Barry Bonds. During the last few seasons, there has been quite a bit of discussion on how best to utilize him in a lineup. Do you bat him third or fourth or do something even more creative? I've heard suggestions that he should lead off, since opposing teams would be less likely to intentionally walk him. So I thought it might be interesting to analyze the composite lineup for the San Francisco Giants from 1999 to 2004. Instead of looking at their composite batting order, however, I looked at how they hit by position (and when doing this, I credited pinch-hitters to the position of the player they were hitting for). So here's how the Giants hit by position from 1999 to 2004:
POS   AB    H  2B 3B  HR  BB IBB  SO HBP  SH SF  SB CS  OBP  SLG
P   2882  536 105  5  36 176  12 787  20 282 20  10  7 .236 .263
C   3440  919 202 25  86 304  40 629  40  27 28  17  9 .331 .415
1B  3487  941 189 16 121 462  33 749  52   4 45  14 18 .360 .437
2B  3717 1078 239 34 139 386  24 604  36  25 45  62 42 .359 .485
3B  3637  964 190 14  92 349  17 480  34  32 36  20 13 .332 .401
SS  3706 1041 196 21 122 280  21 528  16  30 31  10 18 .332 .444
LF  3099  984 201 20 292 968 290 488  53  10 29  84 18 .483 .678
CF  3834 1030 200 29 106 347  11 632  36  19 17 105 44 .334 .419
RF  3551  956 180 31 149 475  28 766  28  17 32  67 35 .357 .463
Note that not all of the left-fielder's batting line is the responsibility of Barry Bonds, but enough of it is to make the line instantly recognizable as his. So what should the Giants have done with Barry Bonds? Here are the top lineups:
RUNS   --------- LINEUP ---------
4.699  CF RF LF 1B 2B C  3B SS P
4.698  CF RF LF 1B 2B 3B SS P  C
4.694  CF RF LF 1B 2B C  P  3B SS
4.694  CF RF LF 1B 2B C  SS 3B P
4.692  3B RF LF 1B 2B CF SS P  C
4.690  CF RF LF 1B 2B C  SS P  3B
4.690  CF RF LF 1B 2B SS C  3B P
4.689  CF RF LF 1B 2B C  3B P  SS
4.688  3B RF LF 1B 2B CF SS C  P
4.688  CF RF LF 1B 2B SS C  P  3B
The worst:
RUNS   --------- LINEUP ---------
4.332  P  CF 1B C  SS 3B RF 2B LF
4.332  C  SS 1B P  2B CF 3B LF RF
4.332  3B CF 1B P  2B RF C  LF SS
4.331  P  SS C  2B CF 3B RF 1B LF
4.331  P  SS CF 2B RF 3B C  1B LF
4.327  P  SS C  2B CF 1B 3B RF LF
4.327  P  SS CF 2B 3B 1B C  RF LF
4.325  P  SS CF 2B 1B 3B C  RF LF
4.324  P  SS 3B 2B CF 1B C  RF LF
4.322  P  SS 1B 2B CF 3B C  RF LF
Oh well, it turns out that my method thinks the best position for Bonds was batting third, which is where he batted most of those years (he hit cleanup the last two and a half years of the period). Once again, I was hoping for something a little more creative. A lineup featuring him leading off would have been nice, but it didn't happen. Actually, at this point in the article, I would have settled for a lineup with him in the second slot. The most representative lineup they used during those years?
RUNS   --------- LINEUP ---------
4.600  CF 3B LF 2B 1B RF SS C  P
Which turns out not to be such a bad choice.
I did want to point out a potential problem with this method and that is: beware of small sample sizes. Notice that all of these studies made use of a rather large number of games. The smallest was the six seasons that went into the Giants study. I also ran my method against the 2004 Giants (since Bonds' performance seemed most extreme that season) and found a great deal of variation. Here are the best, traditional and worst lineups I found.
RUNS   ----- LINEUP ----
5.145  1 4 2 8 6 5 7 3 9
4.696  1 2 3 4 5 6 7 8 9
4.088  7 1 6 8 3 2 4 9 5
Now, these are the sorts of differences I was looking for!
In order to explain these results, I generated the statistical totals (scaled to be about a season's worth of games) for each of the three lineups. Let's start with the normal one.
RUNS   ----- LINEUP ----
4.696  1 2 3 4 5 6 7 8 9

PL  AB   H 2B 3B HR  BB  IB  SO HBP SH SF  OBP  SLG
1  679 189 39 12 26  83   3 102   8  5  7 .360 .486
2  661 169 29  3 15  75   1 111   5 17  6 .333 .377
3  690 208 52  5 24  41   2  91   5  1  9 .341 .496
4  487 175 40  3 47 226 107  58   9  0  4 .565 .743
5  648 166 27  2 18  51   2  84   8  3  3 .317 .387
6  608 166 41  1 16  57   6  83  15  2  8 .346 .423
7  603 162 29  3 20  48   4  78   6  5  7 .325 .426
8  565 150 33  3 10  63  14  97   6 10  4 .343 .388
9  528  99 20  2  4  36   2 151   9 51  3 .250 .256
Not too surprisingly, this looks a lot like the 2004 Giants batting splits by batting order position. There are small differences. For example, the 4th spot in the order had 233 walks (and 115 of them were intentional) in real life. These are explained by differences in the situational mix due to the removal of base-running events. Many of these events cause a runner to move from first to second, a transition that dramatically increases the chance of Barry Bonds getting a free pass. So this is our base line.
Next up: the best lineup:
RUNS   ----- LINEUP ----
5.145  1 4 2 8 6 5 7 3 9

PL  AB   H 2B 3B HR  BB  IB  SO HBP SH SF  OBP  SLG
1  689 191 39 11 27  82   1 103   9  6  5 .359 .483
4  531 204 49  2 58 228  82  66   9  0  3 .572 .812
2  647 176 35  5 15  77   1 122   5 19 12 .348 .411
8  632 180 36  3 11  85  24 102   8  8  7 .373 .403
6  627 176 43  1 19  59   8  83  18  2 12 .353 .443
5  633 164 27  2 21  55   1  84   5  2  2 .322 .408
7  610 169 28  3 22  54   6  81   7  4  4 .341 .441
3  601 183 46  3 23  34   2  83   3  1 18 .335 .506
9  537 103 23  2  4  37   2 150   8 50  3 .253 .264
Moving the cleanup hitter (Bonds) to the second spot increased his plate appearances, dramatically decreased his number of intentional walks, and caused his slugging percentage to jump 69 points. It caused the other hitters to improve as well. Except for the leadoff hitter, who stayed where he was in the lineup, every other regular improved his OPS in the new lineup. After Bonds, the largest jump in performance belongs to the second-place hitter (most often Michael Tucker and Deivi Cruz), who raised his on-base percentage 15 points and his slugging percentage 34 points.
We'll look at reasons for these in a moment, but first, the worst lineup:
RUNS   ----- LINEUP ----
4.088  7 1 6 8 3 2 4 9 5

PL  AB   H 2B 3B HR  BB  IB  SO HBP SH SF  OBP  SLG
7  700 192 34  5 25  53   5  84   5  3  4 .328 .444
1  652 163 34  9 19  76   1 110   5 11  4 .331 .417
6  647 165 44  1 12  61   5  90  13  2  6 .329 .382
8  613 165 33  3 11  74  19 102   7  9  3 .353 .387
3  640 192 49  4 24  32   2  85   3  1 12 .330 .502
2  577 141 24  3 11  70   1  97   4 11  5 .328 .354
4  445 161 36  2 45 188  78  54  10  0  3 .556 .755
9  516  91 17  2  3  33   2 152   9 72  3 .237 .234
5  543 132 21  2 16  57   3  78   7  1  3 .321 .378
Earlier in the article, I talked about how this method ignores batter protection, the fact that pitchers may alter the way certain batters are pitched to (or not pitched to, for that matter) depending upon the on-deck hitter. Well, here is an example where that assumption clearly causes problems. Note that Bonds is now hitting immediately in front of the pitcher, and that his walks and intentional walks are the lowest of the three scenarios. I'm pretty confident that this isn't even remotely realistic. Fortunately, you're probably not going to see these types of lineups except when dealing with the pathologically bad ones.
One other thing to note with the last lineup is how poorly the leadoff hitter (usually Ray Durham) did. His on-base plus slugging percentage was down 98 points over how he did with the normal lineup.
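The 98-point figure follows directly from the two tables. A minimal check, using the leadoff hitter's OBP and SLG from the normal lineup (.360/.486) and from the worst lineup, where he bats second (.331/.417); the helper name is my own.

```python
def ops_points(obp, slg):
    """OPS expressed in 'points' (thousandths), as the article quotes it."""
    return round((obp + slg) * 1000)


normal_ops = ops_points(0.360, 0.486)  # slot-1 hitter in the normal lineup
worst_ops = ops_points(0.331, 0.417)   # same hitter batting second in the worst lineup
print(normal_ops, worst_ops, normal_ops - worst_ops)
```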
So why did we see the results we did? Let's start with Bonds - why was his slugging percentage so much higher in the best lineup than it was in the normal one? This is primarily caused by how he did in two situations:
OUT FST  AB  H 2B 3B HR BB IB SO HP SH SF  OBP  SLG  Norm  Best
0   --- 143 41 10  1 14 34  2 19  3  0  0 .433 .664 1.026  .793
1   ---  64 30  6  0  7 23  6  5  0  0  0 .609 .891  .488 1.192
Where:
OUT FST - the situation (outs, men on)
AB ... SLG - the statistical line for that situation
Norm - the number of times per game the situation occurred in the "normal" run
Best - the number of times per game the situation occurred in the "best" run
These situations (bases empty with zero and one out) are very similar. Despite this, Bonds did much better when there was one out, due primarily (I believe) to small sample sizes. Moving Bonds up in the lineup to the second spot increased the occurrences of the one-out situation, while decreasing the chances of seeing the no-out one. So one of the reasons the Giants scored more runs with this lineup is that it was designed to take advantage of Bonds' ability to hit with no one on and one out, while minimizing the liability of his poorer showing when leading off an inning. This is fine, except that I don't for a moment believe these are really abilities. Instead, I think this lineup took advantage of an illusion created by a relatively small sample size.
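To see how a shift in situation mix alone can move a batter's line, the sketch below blends Bonds' slugging percentage in the two bases-empty situations, weighted by the per-game frequencies from the Norm and Best columns. It is a rough illustration and not the article's model: it covers only these two situations and weights SLG by situation frequency rather than by at-bats. The function name is my own.

```python
def blended_slg(situations):
    """Frequency-weighted SLG over (times_per_game, slg) pairs."""
    total_frequency = sum(freq for freq, _ in situations)
    return sum(freq * slg for freq, slg in situations) / total_frequency


# (times per game, SLG) for bases empty with 0 outs and with 1 out.
normal = blended_slg([(1.026, 0.664), (0.488, 0.891)])
best = blended_slg([(0.793, 0.664), (1.192, 0.891)])

print(f"normal lineup: {normal:.3f}")
print(f"best lineup:   {best:.3f}")
```

The underlying slugging figures are identical in both calls; the blend rises under the best lineup only because the one-out situation carries more weight.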
Why did the second-place hitter improve? This was primarily due to the shift in frequency of the following two situations:
OUT FST  AB  H 2B 3B HR BB IB SO HP SH SF  OBP  SLG  Norm  Best
1   --- 193 45  9  0  4 22  0 29  1  0  0 .315 .342 1.195  .592
1   F--  38 11  1  0  2  3  0 13  0  0  0 .341 .474  .233  .671
It turns out that dropping him down a spot allowed him to exploit an above-average performance with a man on first and one out. Again, this was achieved in a small sample (41 plate appearances).
Finally, why did the leadoff hitter do so poorly in the worst-case lineup? Here are the situations causing the drop-off:
OUT FST  AB  H 2B 3B HR BB IB SO HP SH SF  OBP  SLG  Norm Worst
0   --- 297 86 16  4 16 27  0 40  4  0  0 .357 .532 1.848  .724
1   ---  91 17  3  0  2 13  0 15  0  0  0 .288 .286  .551 1.324
His excellent performance leading off an inning was minimized once he was moved to the second spot. The move increased the frequency of the one-out, bases-empty situation, a poor one for the 2004 Giants' leadoff hitter. Again, I'm not sure why Durham and company hit so well with no one out and yet so poorly with one out, but it's probably due more to luck than to talent.
Unfortunately, these small-scale subjects are usually what people care about. They don't want to know about a composite Giants team from 1999 to 2004; they want to know about a single season. People aren't concerned about generic teams; they want specific answers. How much did it cost the 1961 Yankees having Bobby Richardson leading off?1 Or having Horace Clarke in the same spot nine years later?
One way around this problem might be to combine simulation with this method. Take the team you're interested in and simulate 10000 or so games, collecting situational statistics as you go. Then feed this large sample of games into the Markov model and see which of the 362880 lineups works best for that team. Of course, this is probably easier said than done.
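The exhaustive part of that idea is the easy half: nine batting slots give 9! = 362880 orderings, which is small enough to score one by one. The sketch below shows only that enumeration and ranking step. The scoring function is a deliberately crude placeholder of my own (slot OBP from the "NL versus righties" table, weighted slightly toward the top of the order); it is not the article's Markov model, and the simulation step is not attempted here.

```python
from heapq import nlargest
from itertools import permutations

# OBP by batting slot, taken from the "NL versus righties" table.
OBP = {1: .346, 2: .333, 3: .398, 4: .349, 5: .358,
       6: .321, 7: .309, 8: .312, 9: .206}


def toy_evaluate(order):
    """Placeholder score (hypothetical): earlier positions get slightly more weight.

    A real study would replace this with the expected-runs model.
    """
    return sum(OBP[slot] * (1.0 - 0.01 * position)
               for position, slot in enumerate(order))


def rank_lineups(slots, evaluate, top=10):
    """Score every ordering of `slots` and return the `top` best as (score, order)."""
    scored = ((evaluate(order), order) for order in permutations(slots))
    return nlargest(top, scored)


total_lineups = sum(1 for _ in permutations(range(1, 10)))
best_lineups = rank_lineups(range(1, 10), toy_evaluate, top=3)

print(total_lineups)
for score, order in best_lineups:
    print(f"{score:.4f}", *order)
```

With this toy score the top lineup is simply the slots sorted by OBP, which is exactly why a placeholder is not a substitute for the real model; the point is that any evaluator with the same signature can be dropped into `rank_lineups`.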
If anything, my approach shows that batting orders matter even less than people have believed. You would think that with such complicated forces at work here, some truly bizarre lineups might have been more efficient than the obvious ones used throughout the years, but if they exist, the methods described in this article didn't find them. That doesn't mean that specific teams haven't used illogical lineups in the past, only that one of those teams wasn't the San Francisco Giants over the past six years, and that it probably didn't cost these hypothetical teams a lot of runs anyway.
I did notice that there are a lot of lineups as productive as the traditional ones that would look very odd to players, fans and the sporting press. A lot of the lineups near the top of many of these lists feature pitchers (in the NL) hitting other than last, as well as other weird orderings. There are lots of instances of very different lineups producing almost identical results. But if the normal lineups do almost as well as these creative ones, is there any percentage in straying from conventional wisdom? I don't think so. And I guess that's the real conclusion of this article: since all but the most pathologically weird lineups produce just about the same number of runs, I might be inclined to select the lineup that makes the most intuitive sense to the players and fans. Simply put, it's not worth all the fuss you'd cause trying to be clever with lineups.
After I wrote this, I got to wondering what this method would do with the 1961 Yankees. Of course, these results shouldn't be taken too seriously, since the same sample size problems we discussed with the 2004 Giants also apply here, but here are the best (followed by the normal) lineups for that team:
RUNS   ----- LINEUP ----
5.096  3 6 7 5 4 8 9 1 2
5.062  7 5 4 3 6 8 9 1 2
5.059  6 7 2 3 4 8 5 9 1
5.052  7 3 1 2 4 8 5 9 6
5.047  2 4 8 5 6 1 7 3 9
5.047  3 5 1 2 4 8 9 6 7
5.045  7 5 1 2 3 4 8 9 6
5.041  3 6 8 5 4 9 1 7 2
5.034  1 7 2 3 4 8 5 6 9
5.032  7 4 8 5 6 1 2 3 9
4.575  1 2 3 4 5 6 7 8 9
And the worst:
3.871  2 1 6 9 4 7 8 3 5
3.889  2 1 5 9 7 6 8 3 4
3.913  8 7 6 1 9 5 3 4 2
3.914  9 7 4 1 5 2 6 8 3
3.918  8 7 1 9 5 3 6 4 2
3.921  2 1 9 5 7 6 8 3 4
3.921  2 1 5 6 9 7 8 3 4
3.929  2 1 5 9 4 7 8 3 6
3.934  2 1 9 6 4 7 8 3 5
3.934  2 1 5 9 7 4 6 8 3
Once again, we see a much larger spread with a single team sample. Here are the yearly stats for the normal lineup:
PL  AB   H 2B 3B HR  BB  IB  SO HBP SH SF  OBP  SLG
1  709 179 21  4  5  39   1  46   2 10  4 .292 .315  Richardson
2  700 178 36  5  6  27   0  83   2 12  3 .283 .346  Kubek
3  619 178 17  4 60  92   0  66   6  0  6 .382 .619  Maris
4  568 178 18  6 57 132   9 116   0  1  5 .440 .667  Mantle
5  633 182 21  4 34  50   7  64   4  0  5 .341 .494  Berra
6  614 173 23  3 29  40   4  94   8  1  5 .331 .471  Skowron/Howard
7  571 157 25  5 24  64  11  95   6  1  4 .352 .462  Howard/Skowron
8  566 132 21  7 15  48   8  86   4  2  7 .294 .375  Boyer
9  531  90  9  2  6  37   2 127   3 32  5 .226 .228  Pitchers
And the "best" lineup:
PL  AB   H 2B 3B HR  BB  IB  SO HBP SH SF  OBP  SLG
3  656 192 17  6 65 103   0  66  10  0  4 .395 .634
6  686 207 24  4 38  56   6 106  10  3  2 .362 .515
7  644 172 24  5 27  78  11 116   8  3  4 .351 .446
5  652 183 20  3 36  56  11  67   3  0  6 .338 .486
4  549 168 17  5 58 136   6 108   0  0 12 .436 .672
8  613 152 24  6 22  53  10  88   4  5  9 .308 .414
9  575 103 10  2  7  41   2 140   4 33  8 .236 .240
1  589 150 18  3  4  32   1  36   1 10  5 .292 .316
2  584 150 33  4  6  22   0  68   2  7  2 .285 .358
I thought it was interesting that this lineup had Maris batting leadoff and Mantle hitting fifth. Such a lineup could have given Roger the extra at-bats needed to break Ruth's record in 154 games. Also notice that apart from his first at-bat, Maris would be preceded by Richardson and Kubek. Having the pitcher bat seventh is an odd touch, but typical of some of the quirky features of many of these lineups. I'm not convinced that this lineup was 80 runs better than the one they used over the course of a season, but it wouldn't surprise me if it was somewhat better. Considering that Ralph Houk typically batted his two worst-hitting regulars first and second all season, it shouldn't be too hard to improve upon what he did.