Two main questions



M. Lawrence Clevenson

Jennifer Wright

M. Lawrence Clevenson, Ph.D., is professor of mathematics at California State University, Northridge. Jennifer Wright received her M.S. in Applied Mathematics at CSUN in August, 2007. Ms. Wright collected all of the data for this analysis. This paper was part of her Master’s thesis, under the direction of Dr. Clevenson.

Go For It!

It is Sunday afternoon in the late fall, and a football game has just started between the Green Bay Packers and the St. Louis Rams on the frozen tundra of Lambeau Field. St. Louis received the opening kick-off and moved the ball forward for some plays, but their third-down attempt to reach a first down failed, leaving them with fourth down and two yards to go on their own 42 yard line. Without much thought (by the St. Louis Rams, the television announcers, or anyone, as far as we can tell), the punting team comes onto the field and St. Louis punts. The decision to punt on fourth down is so common that hardly anyone thinks there is a question. Should St. Louis have punted, or should they have gone for it? One way to decide would be to quantify expected points after both decisions. But what does one need to know to do that? As we show, there are several parts to this decision, and we build statistical models to address these components.

For readers unfamiliar with American football, the two opposing teams try to advance the ball toward their opponents’ (defensive) goal line. The team with the ball (the offense) gets four chances to advance the ball toward the other team’s goal. These chances are called “downs”. An important rule is that if the offense advances the ball 10 yards or more before its four chances are over, it gets a “new first down” and four more chances to advance the ball. Often, if a team has reached fourth down (its last chance) and feels that it cannot gain a new first down, it will punt the ball (kick it) as far from the goal it is defending as possible. Other times, if the team is close enough to the goal line it is attacking, it will try for a field goal, which means kicking the football between the posts at the end of the field. We do not address that possibility in this paper; we assume the offense has the ball at a point on the field where a field goal attempt is not a realistic option.

If the offense punts, goes for a field goal, or tries for the first down and fails on their fourth down, then the ball changes hands, i.e., the other team is the team on offense with the opportunity to advance the ball.

Fourth Down Decisions

We must examine three scenarios to make the decision to punt or “go for it”. Punting gives the other team the ball and a corresponding expectation of scoring from their new position on the field after the punt. Going for a first down and failing also gives the other team the ball and an even greater expectation of scoring, as they will be much closer to the goal line, the worst result of these three. Going for a first down and succeeding gives the offense a positive expectation of scoring, clearly the best of these three scenarios for the offense. Time may be a factor if near the end of the half or game. We assume the team on defense will have adequate time to try to score after the fourth down play. In these cases, a statistician would argue that the decision should be made by computing expected points (both positive and negative) from the decisions. Therefore, we need statistical models for these expectations.

Expected Points from a First Down

What is the expected number of points with a new first down from a given yard line? Let EP(x) represent a team’s expected points when that team has a first down a given number of yards, x, from the opposing goal. We seek a model to estimate EP(x) for each team. There are 32 teams in the National Football League with various strengths in offensive and defensive play. Teams from Green Bay, San Francisco, St. Louis, Chicago and Indianapolis in the 2005 season were chosen for the study. This selection reduces the required data collection effort while maintaining a wide variety of strengths and weaknesses among the teams used to build statistical models. A qualitative summary of the strengths and weaknesses of these teams in 2005 appears in Table 1.

Table 1 - Summary of offensive and defensive team strength, midseason 2005.

|Teams |Offense |Defense |
|Chicago |Weak |Strong |
|Indianapolis |Strong |Strong |
|San Francisco |Weak |Weak |
|St. Louis |Strong |Weak |
|Green Bay |Medium |Medium |

All of the plays from all the games played by these five teams in the 2005-2006 season were collected (NFL Football Data; see Further Reading). When a team had a first down (first down with 10 yards to go, or first and goal to go), a point in that team’s data set was created. Each point has a bivariate measurement. The explanatory variable (x) is yards from the goal line, and the response variable (y) is the number of points made before giving up the ball to the defensive team. The response variable is therefore a member of {0, 3, 6, 7, 8}. Zero points means no score. Three points are awarded for a field goal. A touchdown counts six points. After a touchdown, a kick through the uprights adds one point (7), and a single play yielding a score like a touchdown adds two points (8). Negative scores (a safety, or a fumble or interception returned for a touchdown) are rare and did not appear in these datasets. First downs after a penalty, e.g., first and fifteen or first and five, were disregarded, since these all originally began with the standard first and ten.

Four models were examined for predicting points from a first down. Models 1 and 2 used least squares to estimate coefficients. That is, coefficients were estimated to minimize the average squared difference between the observed value of y and the expected value. Models 3 and 4 used logistic regression. Again the coefficients were estimated to minimize the average squared difference between the observed value of y and the expected value, but the computation algorithm is more involved. The models are as follows:

1. Quadratic regression: EP(x) = b0 + b1 • x + b2 • x².

2. Cubic regression: EP(x) = b0 + b1 • x + b2 • x² + b3 • x³.

3. Linear logistic regression with the response being one of three events—no score (0), a field goal (3), or a touchdown (7): EP(x) = 0 • P(0) + 3 • P(3) + 7 • P(7), where [pic]


P(7) = 1 – P(3) – P(0). For this model, we replaced actual points of 6, 7, or 8 with 7. There were actually very few such replacements, since there were few touchdowns that did not result in seven points.

4. Quadratic logistic regression: This is similar to Model 3, but there is a quadratic term in the exponential functions.
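The equation image for Model 3 did not survive in this copy. One standard form that is consistent with the description (exponential functions of x, with P(7) obtained by subtraction) is the baseline-category logit with the touchdown as the baseline. This is an assumption for illustration, not a reconstruction confirmed by the text, and the symbols α, β, and γ are hypothetical names:

```latex
\begin{aligned}
P(0) &= \frac{e^{\alpha_0 + \beta_0 x}}{1 + e^{\alpha_0 + \beta_0 x} + e^{\alpha_3 + \beta_3 x}}, \qquad
P(3) = \frac{e^{\alpha_3 + \beta_3 x}}{1 + e^{\alpha_0 + \beta_0 x} + e^{\alpha_3 + \beta_3 x}}, \\
P(7) &= 1 - P(3) - P(0), \qquad
EP(x) = 0 \cdot P(0) + 3 \cdot P(3) + 7 \cdot P(7).
\end{aligned}
```

Under the same assumption, Model 4 would replace each exponent $\alpha_k + \beta_k x$ with $\alpha_k + \beta_k x + \gamma_k x^2$.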

With the logistic regression models, expected points were computed using the value 7 for the event of a touchdown, which was the actual result in nearly all cases. There was little difference between EP(x) for the models 1, 2, and 4. Figure 1 shows all of the first down data for Chicago; the points have been jittered to display repetitions of cases. Three polynomial models for EP(x) and the almost identical fits of the quadratic and cubic models can be seen in Figure 2. Table 2 exhibits the best fitting polynomial and logistic equations, along with their R² values. The interesting general consistency but slight variation in the expected fits is displayed in Table 3. We calculated the average points at intervals of five yards to see more clearly how the average points scored varies with the yards from the goal. These are displayed in Figure 3 with the chosen quadratic model.

Because of its greater simplicity, we chose a quadratic regression model to estimate EP(x) for each team. Of course, different teams had different coefficients resulting from the least squares estimates when EP(x) was fit to their data. The other teams had similar results.
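As a rough illustration of the least squares step, the sketch below fits EP(x) = b0 + b1 • x + b2 • x² by solving the normal equations in plain Python. The function names `fit_polynomial` and `expected_points` are illustrative and do not come from the paper, and the sample data are synthetic placeholders, not NFL data.

```python
def fit_polynomial(xs, ys, degree=2):
    """Least-squares polynomial fit via the normal equations.

    Returns [b0, b1, ..., b_degree] for
    EP(x) = b0 + b1*x + ... + b_degree*x**degree.
    """
    n = degree + 1
    # Normal equations A b = c with A[i][j] = sum(x^(i+j)), c[i] = sum(y * x^i).
    a = [[sum(x ** (i + j) for x in xs) for j in range(n)] for i in range(n)]
    c = [sum(y * x ** i for x, y in zip(xs, ys)) for i in range(n)]

    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        c[col], c[pivot] = c[pivot], c[col]
        for row in range(col + 1, n):
            factor = a[row][col] / a[col][col]
            for k in range(col, n):
                a[row][k] -= factor * a[col][k]
            c[row] -= factor * c[col]

    # Back substitution.
    b = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(a[i][k] * b[k] for k in range(i + 1, n))
        b[i] = (c[i] - tail) / a[i][i]
    return b


def expected_points(coeffs, x):
    """Evaluate the fitted polynomial EP(x) at x yards from the goal."""
    return sum(bk * x ** k for k, bk in enumerate(coeffs))


# Synthetic placeholder data (NOT from the paper): an exact quadratic in x.
demo_xs = list(range(1, 100))
demo_ys = [5 - 0.06 * x + 0.0002 * x * x for x in demo_xs]
demo_coeffs = fit_polynomial(demo_xs, demo_ys, degree=2)
print(demo_coeffs)
```

With real play-by-play data, `xs` would hold yards from the goal for each first down and `ys` the points scored on that possession; passing `degree=3` gives the cubic alternative.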


Figure 1. Graph of data points for Chicago's points vs. first down position (yards from the goal) in the 2005 regular season. Points are jittered for viewing repetitions.


Figure 2. Graph of data points and models for Chicago's points vs. first down position (yards from the goal) in the 2005 regular season.



Figure 3. Graph of mean points at yard intervals noted on the x-axis and the fitted quadratic model for EP(x), for Chicago’s first down data for the 2005 regular season.

Table 3 – Mean of actual data points, for Chicago, at the indicated intervals vs. the model’s expected points for the same interval midpoint (2005 regular season).

|Chicago |Count (n) |

Expected Net Yards for a Punt

When St. Louis punts to Green Bay from their own 42 yard line, the position from which Green Bay starts their next series of downs will vary with the effectiveness of the punt. The data for all the punts made by our five teams in 2005 were examined. Figure 4 is a typical example, for one play, extracted from the website.

|4-19-CHI 3 |(12:34) B.Maynard punts 40 yards to CHI 43, Center-P.Mannelly. A.Randle El to CHI 42 for 1 yard (H.Hillenmeyer). PENALTY on PIT-S.Morey, Offensive Holding, 10 yards, enforced at CHI 42. |

Figure 4. Example of a play-by-play situation from the Dec. 11, 2005 game, Chicago vs. Pittsburgh. This is in the 3rd quarter of the game as extracted from the website.

This is how to read the summary in Figure 4: on this 4th down, Chicago punted 40 yards, and Pittsburgh returned the punt 1 yard, bringing the net punt gain to 39 yards. A penalty (holding on the return team) then cost Pittsburgh 10 yards, so the net punt distance becomes 49 yards. Similarly, for each punt, the net punt distance was calculated as the distance from the line of scrimmage where the punt started to the position of the ball at the end of the play.
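The arithmetic in this example can be written as a small helper. This is a minimal sketch under simplifying assumptions: the names `net_punt_yards` and `average_net_punt` are hypothetical, and penalties against the punting team are not modeled.

```python
def net_punt_yards(gross_yards, return_yards=0, return_team_penalty_yards=0):
    """Net change in field position on a punt, in yards.

    gross_yards: distance the punt travelled from the line of scrimmage.
    return_yards: yards gained by the returner (reduces the net).
    return_team_penalty_yards: penalty yards marched off against the
        return team (increases the net).
    """
    return gross_yards - return_yards + return_team_penalty_yards


def average_net_punt(punts):
    """Mean net yards over a list of (gross, return, penalty) tuples."""
    nets = [net_punt_yards(*punt) for punt in punts]
    return sum(nets) / len(nets)


# The Figure 4 play: 40-yard punt, 1-yard return, 10-yard holding penalty.
figure_4_net = net_punt_yards(gross_yards=40, return_yards=1,
                              return_team_penalty_yards=10)
print(figure_4_net)  # 49, matching the net punt distance in the text
```

Averaging these values over all of a team's punts gives the single team-level figure that the analysis uses in place of the full punt distribution.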

The example above was an effective punt, with a net punt distance of 49 yards. Most punts are less effective, and we chose to simplify this part of the analysis by assuming all punts net the average punting effectiveness for that team. Since the quadratic model for EP(x) is nearly linear, there will be little change in the expected points after a punt with this assumption.

Expected Yards When Getting a First Down

If St. Louis goes for it, and obtains a first down, what do they gain, on average? They gain the opportunity to score points on this possession. They will have a first-and-ten, some number of yards from the goal line. How many yards from the goal? That depends on how much yardage they gained, beyond the necessary two yards (recall it was fourth and 2 at their own 42 in our example). Data for all of St. Louis’ successful attempts at a first down from third down positions showed that, on average, they gained approximately 8 more yards than the first down marker. We use this value to estimate St. Louis’ position after a successful first down. The analysis will change little if we use the detailed distribution since EP(x) is nearly a linear function. In addition, the quadratic function EP(x) is convex, and so Jensen’s inequality says that this analysis understates, slightly, the value of a successful first-down attempt. A study looking at variability as well as expectations would need to address this issue more carefully and could be possible future work. Unsuccessful attempts usually result in not much change from the current position and were not analyzed separately. That is, it is assumed that an unsuccessful attempt delivers the football to the opposing team at the line of scrimmage, where the 4th down play started.
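The Jensen's inequality step can be stated compactly. In this sketch, X (a symbol introduced here for illustration) denotes the yards from the goal after a successful conversion, and μ = E[X] is the average position used in the analysis:

```latex
EP \text{ convex} \;\Longrightarrow\; \mathbb{E}\bigl[EP(X)\bigr] \;\ge\; EP\bigl(\mathbb{E}[X]\bigr) = EP(\mu).
```

Evaluating EP at the average position therefore gives a value no larger than the true average of EP over the distribution of positions, and the gap is small when EP(x) is nearly linear over the range of X.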

Probability of a Fourth Down Conversion

If the offense does not punt on 4th down, what is their probability of successfully achieving a new first down? How do we answer this question? Teams rarely try for a new first down on fourth down, and thus not much data exist on fourth down conversion attempts. Of course, teams always try for a new first down on third down. We decided to use success rates on third down and fourth down conversions together to model the probability of a successful conversion on fourth down attempts. While defensive teams might try even harder to prevent fourth down conversions (by risking longer gains in an all-out attempt to stop the conversion), they already usually align their defense to prevent third-down conversions. We believe the fourth down conversion rates would be quite close to the third down conversion rates.

The cases are again bivariate, with the explanatory variable being yards to go for a new first down, and the response variable being success or failure. Since the response variable is binary, some logistic regression models were compared. The linear logistic model gave approximately the same estimated probabilities as the quadratic logistic model, and so was chosen for its greater simplicity. Figure 5 exhibits the quadratic logistic probability graph for Indianapolis, shown together with the actual relative frequency of success (for Indianapolis).
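For reference, a linear logistic model of this kind has the standard form below. Here d denotes yards to go, and a and b are placeholder names for the fitted intercept and slope; the paper's fitted values are not given in this section.

```latex
P(\text{success} \mid d) = \frac{e^{a + b d}}{1 + e^{a + b d}}
```

The quadratic logistic alternative adds a term $c\,d^2$ to the exponent.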


Figure 5. Graph of relative frequency of successful first down conversions and probability model.

Comparing the Expected Points

Recall that we are questioning St. Louis’ decision to punt on “Fourth and Two” from their own 42 yard line. For St. Louis, the average net punt is 34 yards. If the punt nets this average, Green Bay will start 76 (42 + 34) yards from the goal, with expected points EP(76) = 1.320. Punting therefore puts St. Louis down, on average, 1.320 points.

If they go for it, they need 2 yards, and, when successful, they average 8 additional yards, and so, on average, a successful attempt gains 10 yards. This would leave them 100 – 42 – 10 = 48 yards from the goal. Their expected points from this position are 2.635.

However, the previous analysis asked what Green Bay’s scoring potential was when they received the ball after a punt. For proper comparison, we need to compare that with the net average points when Green Bay next receives the ball, regardless of what St. Louis does, and how many points they score. Remember, we are considering scenarios in which the team on defense will have enough time to try to score as will the offense. So the gain from a successful fourth down conversion has to be decreased by Green Bay’s scoring potential on their next possession. Of course, we do not know where they will start that next possession. Assuming St. Louis does score a touchdown or field goal, the average position would be approximately the 25 yard line. The exact position is not so important, because EP(x) changes little when x is large. Green Bay’s EP(75) is 1.338. Thus St. Louis will achieve a gain, by successfully making a first down, of 2.635 – 1.338 = 1.297.

The linear logistic model for St. Louis shows that their estimated probability of a successful fourth down conversion, at 2 yards to go, is 0.576. So St. Louis has an expected loss, by going for it, as follows:

P(Failure) • (Expected Points for Green Bay 42 yards from the goal)

– P(Success) • (Expected Points Gained by Successful Conversion)

= (1 – 0.576) • (2.548) – 0.576 • (1.297) = 0.333 (expected points behind, the next time Green Bay has the ball). Recall that they expect to be down 1.320 points by punting. St. Louis therefore gains an average of almost a point (1.320 – 0.333 = 0.987) by the decision to go for it in this situation.
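The comparison can be checked with a few lines of arithmetic. The numbers below are the ones quoted in the text for the St. Louis example; the constant and function names are illustrative only.

```python
# Values quoted in the article for St. Louis, fourth and 2 at their own 42.
P_SUCCESS = 0.576                # estimated conversion probability
EP_GB_AFTER_PUNT = 1.320         # Green Bay starting 76 yards from the goal
EP_GB_AFTER_FAILED_TRY = 2.548   # Green Bay starting 42 yards from the goal
EP_STL_AFTER_CONVERSION = 2.635  # St. Louis with a first down 48 yards out
EP_GB_NEXT_POSSESSION = 1.338    # Green Bay EP(75) on its following possession


def expected_deficit_going_for_it(p_success, ep_opp_after_failure,
                                  net_gain_after_success):
    """Expected points behind the next time the opponent has the ball."""
    return ((1 - p_success) * ep_opp_after_failure
            - p_success * net_gain_after_success)


net_gain = EP_STL_AFTER_CONVERSION - EP_GB_NEXT_POSSESSION
deficit_go = expected_deficit_going_for_it(P_SUCCESS, EP_GB_AFTER_FAILED_TRY,
                                           net_gain)
deficit_punt = EP_GB_AFTER_PUNT
advantage_of_going = deficit_punt - deficit_go

print(round(net_gain, 3), round(deficit_go, 3), round(advantage_of_going, 3))
# prints: 1.297 0.333 0.987
```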

Notice that this analysis shows that the correct choice is to “go for it,” and that the expected loss from punting is not even close to the expected loss from attempting to convert a first down. Yet, with the prevailing understanding of NFL games, if St. Louis went for it and failed, they would be strongly criticized by every NFL expert for “gambling” or “not playing the percentages” or being “wild risk takers”. Pundits (punt-its) might even say, “They should have gone with the percentages.” But our analysis shows that to “go for it” is the percentage play, and many experts probably have never looked at any percentages. If they go for a first down, they have a reasonable chance (58%) of keeping the ball and thus scoring some points with this possession. And their expectation goes from -1.320 to -0.333, or from clearly negative to almost even. The data show that the decision to “go for it” is the “percentage play.”

Comparison of Teams

We computed the values of expected points, EP(x), using each team’s quadratic regression model, for the five teams for which we collected data. To obtain an idea of how optimal fourth down choices vary from team to team, we chose to average our EP(x) values for the five teams studied when computing the “other” team’s scoring potential. The following tables use each specific team’s own EP(x) when that team is on offense.

The tables on the following pages show similar calculations to the one done above for St. Louis, with 4th and 2 at their own 42 yard line, for all cases. The columns indicate yards necessary for a first down (1 to 20 yards), and the rows specify the yards from the goal line (30 to 99). Teams should not punt when they are within 30 yards from the goal line. The relevant decisions then are “go for it” or “try a field goal”. The values in the cells of the tables are the differences between the expected points for punting and expected points from attempting to get a first down. An X in a cell means that the situation is not possible. Grey-shaded areas indicate situations in which the team should punt (positive expected difference, or, in other words, the expected loss for attempting a first down is larger than the expected loss for punting). Table 4 gives results for Chicago. Table 5 gives results for St. Louis. Results for the other three teams used in the analysis are included in the supplementary online material.
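A table cell of this kind could be computed along the following lines. This is a sketch under stated assumptions, not the authors' code: the function names, the callable arguments (`ep_offense`, `ep_opponent`, `p_convert`), and the touchback cap at 80 yards are all introduced here for illustration, and impossible (X) cells are not modeled. Only the St. Louis check at the end uses numbers from the text.

```python
def punt_minus_go(yards_from_goal, yards_to_go, ep_offense, ep_opponent,
                  p_convert, avg_net_punt, avg_extra_yards,
                  next_possession_start=75, touchback_start=80):
    """Cell value: positive favors punting, negative favors going for it."""
    # Opponent's distance from its target goal if it took over at this spot.
    opp_spot = 100 - yards_from_goal

    # Punt: opponent starts avg_net_punt yards farther back
    # (capped at a touchback; the cap is an assumption of this sketch).
    deficit_punt = ep_opponent(min(opp_spot + avg_net_punt, touchback_start))

    # Successful conversion: offense gains the needed yards plus the average
    # extra yards, offset by the opponent's expectation on its next possession.
    new_spot = max(yards_from_goal - yards_to_go - avg_extra_yards, 1)
    gain = ep_offense(new_spot) - ep_opponent(next_possession_start)

    # Failed conversion: opponent takes over at the line of scrimmage.
    p = p_convert(yards_to_go)
    deficit_go = (1 - p) * ep_opponent(opp_spot) - p * gain
    return deficit_go - deficit_punt


def decision_table(rows, cols, **model):
    """Map (yards_from_goal, yards_to_go) to the punt-minus-go value."""
    return {(r, c): punt_minus_go(r, c, **model) for r in rows for c in cols}


# Check against the St. Louis example, using only values quoted in the text.
stl_model = dict(
    ep_offense={48: 2.635}.__getitem__,
    ep_opponent={42: 2.548, 75: 1.338, 76: 1.320}.__getitem__,
    p_convert=lambda yards_to_go: 0.576,
    avg_net_punt=34,
    avg_extra_yards=8,
)
stl_cell = punt_minus_go(58, 2, **stl_model)
print(round(stl_cell, 3))  # -0.987: going for it is better by almost a point
```

In a full table, `ep_offense`, `ep_opponent`, and `p_convert` would be the fitted quadratic and logistic models rather than lookups, and `decision_table(range(30, 100), range(1, 21), ...)` would cover the rows and columns described above.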

Interestingly, our results show that even a poor offensive team like Chicago should go for a first down more often than they actually do. For example, on fourth and one at midfield (50 yards from the end zone), this analysis shows that Chicago should go for a first down. Intuitively, this may be more obvious than NFL coaches seem to realize. Chicago has an estimated probability of success of 0.4747, about fifty-fifty. So essentially, they are taking a fifty-fifty shot at having nearly the same situation as their opponent, which, on average, should favor neither team. That is in contrast to punting, after which the opponent has the ball, albeit deep in their own territory. This is an advantage to the opponent, as they have the potential to score, a positive expectation. So going for it is not disadvantageous, and punting produces a disadvantage. Offensively strong and defensively weak teams like St. Louis should go for a first down even more often. These teams have high expected points when they have the ball, and high expected points for the opponent when the opponent has the ball. They also have higher probabilities of successfully converting a fourth down attempt for a new first down. They should try to keep the ball.

Table 4 – Expected difference in points for going for it versus punting on fourth down for Chicago. Yards from the end zone and yards to go for the first down are on axes. Points are more for going for it in the grey area. Impossible situations are marked with X.


Table 5 – Expected difference in points for going for it versus punting on fourth down for St. Louis. Yards from the end zone and yards to go for the first down are on axes. Points are more for going for it in the grey area. Impossible situations are marked with X.



Our analysis shows that there are many situations in the middle of the field, and not late in the game, where the correct decision on fourth down is to go for a first down. This analysis assumed that it was early in the game, so that the correct decision would be determined by expected points. Coaches may be more comfortable with punting when the expectation analysis shows that they gain only a small amount by choosing to try to keep the ball. After all, if the team fails on an attempted fourth down, then the coach likely will receive some criticism. So, for factors beyond those considered in our analysis, they may want to widen the grey areas to punt. However, even allowing a margin of something like 0.25 points or 0.5 points, the coach could make a better decision in many circumstances by considering the option of “going for it.”

Our analysis only applies when the game is not close to finishing. Near the end of the game, models for expected points should be replaced by considering the models for probabilities of particular results—no score, a field goal, or a touchdown—probabilities that we have modeled in our analysis.

At the beginning of the season, teams would be without the data we used to analyze fourth down decisions to “go for it.” To use our analyses, the decision makers might find whichever of our five teams is most similar to their own with regard to offensive and defensive strength. As their season progresses, they could then use the data from the current season.

Similar analyses to compare kicking a field goal with going for a first down were done by Ms. Wright in her Master’s thesis at California State University, Northridge. Again, she found that the decision to go for a first down or touchdown, rather than kick a field goal, should be made more often than it is.

Further Reading

Agresti, A. (2002), Categorical Data Analysis (2nd Edition), John Wiley & Sons, Inc.

Bartshe, P. (2005), “An NFL Cookbook: Quantitative Recipes for Winning,” STATS, 6, 12-13.

Myers, R. (2000), Classical and Modern Regression with Applications (2nd Edition), Duxbury Press.

Sackrowitz, H. (2000), “Refining the Point(s)-After-Touchdown Decision,” Chance, 13, 29-34.

Stern, H. (1998), “Football Strategy: Go For It!” Chance, 11, 20-24.

Theismann, J. and Tarcy, B. (2001), The Complete Idiot’s Guide to Football (2nd Edition), Alpha Books.

NFL Football Data, , Nov. 2005 – Jan 2006





