An inherent goal of any person or team playing in a sport is to win. In professional sports, not only does winning have intrinsic reward for the competitors and fans, but there may be great economic incentive as well. For example, Major League Baseball (MLB) teams presently generate tens or hundreds of millions of dollars annually in revenue from ticket sales, concessions, licensing, and television contracts, and have payrolls averaging about $100 million per year. These revenues, and ultimately the economic success of MLB teams, often depend to a significant extent on the teams' frequencies of winning and losing baseball games.
For many decades, the statistics of baseball have been deemed mathematically interesting in and of themselves and relevant to the enjoyment and understanding of the game. As such, baseball statistics have been studied increasingly intensely over the years, with new insights and strategies often being gleaned from new forms of statistical analysis. Once primarily an idle pleasure for fans of the game, baseball statistics have become even more relevant in the modern game and business of baseball, as these advanced insights permit teams to compete more favorably on and off the field of play.
To a larger degree than many other sports, baseball lends itself to generating valuable statistical information about individual players, particularly statistics pertaining to offensive performance. Unlike football, basketball, hockey, or soccer, for example, baseball has relatively little interaction between offensive teammates in the act of scoring runs. While offensive teammates in other sports may pass the ball or puck to one another to create or improve a scoring opportunity, a batter stands relatively isolated from teammates while in the batter's box, facing the opposing defense, most notably the pitcher. Also, while defense in baseball is more of a team-oriented endeavor than offense, the widely accepted view that the pitcher and his pitching ability dominate the defensive contribution gives some individualism to the pitcher's efforts as well.
Traditional statistics in baseball have included both accumulative statistics, such as a batter's home runs (HR), triples (3B), doubles (2B), hits (H), runs batted in (RBI), and runs scored (R), and frequency measures, such as, most famously, batting average (AVG). Both types are common among pitching statistics as well, with innings pitched (IP), batters faced (BFP), strikeouts (K), walks issued (BB), hits allowed (H), earned runs allowed (ER), wins (W), losses (L), and saves (S) being some of the better-known accumulative pitching statistics, and earned run average (ERA), the average earned runs allowed per nine innings pitched, being the most famous frequency measure for pitchers. Many other statistics have been developed over time, and some have become more recognized, understood, and accepted by even the more casual fans of the sport. Traditional accumulative defensive statistics (not including pitching) have included putouts (PO), assists (A), errors (E), and total chances (TC), the sum of the first three. Using these statistics, the frequency measure of fielding percentage (FP), the share of total chances handled without an error, may be calculated.
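The frequency measures named above follow directly from the accumulative counts. The following is a minimal Python sketch using the conventional definitions. At-bats (AB) are not listed in the text and are assumed here as an input, and the function names are illustrative only.

```python
def batting_average(hits: int, at_bats: int) -> float:
    """AVG = H / AB. At-bats (AB) are an assumed input not named in the text."""
    return hits / at_bats if at_bats else 0.0


def earned_run_average(earned_runs: int, innings_pitched: float) -> float:
    """ERA = 9 * ER / IP: earned runs allowed per nine innings pitched.

    innings_pitched is a true decimal value (6 and 1/3 innings is 6.333...),
    not the box-score shorthand "6.1".
    """
    if innings_pitched <= 0:
        raise ValueError("ERA is undefined without at least a partial inning pitched")
    return 9.0 * earned_runs / innings_pitched


def fielding_percentage(putouts: int, assists: int, errors: int) -> float:
    """FP = (PO + A) / TC, where TC = PO + A + E."""
    total_chances = putouts + assists + errors
    return (putouts + assists) / total_chances if total_chances else 0.0
```

For instance, a hypothetical pitcher who allows 50 earned runs in 150 innings has an ERA of 3.00, and a hypothetical fielder with 200 putouts, 300 assists, and 10 errors has a fielding percentage of about .980.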
More recently developed statistics have attempted to address some of the vagaries and inequities of the traditional statistics when used for comparing players to one another statistically for the sake of determining which player is better and by how much. With regard to offense, frequency statistics like on-base percentage (OBP), slugging percentage (SLG), and their sum, on-base plus slugging (OPS), have been recognized by many as superior to more traditional frequency statistics like AVG because the newer ones do not ignore the number of walks a batter has earned or the type (number of bases) of hits a batter has achieved.
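To make concrete how these measures credit walks and extra-base hits, the following Python sketch uses the commonly published definitions of OBP and SLG. The hit-by-pitch (HBP) and sacrifice-fly (SF) inputs belong to the standard OBP formula but are not mentioned in the text, so they default to zero here; the function names are illustrative assumptions.

```python
def on_base_percentage(hits: int, walks: int, at_bats: int,
                       hit_by_pitch: int = 0, sacrifice_flies: int = 0) -> float:
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    opportunities = at_bats + walks + hit_by_pitch + sacrifice_flies
    return (hits + walks + hit_by_pitch) / opportunities if opportunities else 0.0


def slugging_percentage(hits: int, doubles: int, triples: int,
                        home_runs: int, at_bats: int) -> float:
    """SLG = total bases / AB, weighting each hit by the bases it earned."""
    singles = hits - doubles - triples - home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0


def on_base_plus_slugging(obp: float, slg: float) -> float:
    """OPS is simply the sum of OBP and SLG."""
    return obp + slg


# Hypothetical batting line, for illustration only.
obp = on_base_percentage(hits=150, walks=50, at_bats=500)
slg = slugging_percentage(hits=150, doubles=30, triples=5, home_runs=20, at_bats=500)
print(f"OBP {obp:.3f}  SLG {slg:.3f}  OPS {on_base_plus_slugging(obp, slg):.3f}")
```

This hypothetical batter has an AVG of .300, yet his 50 walks lift his OBP to about .364 and his extra-base hits give him a SLG of .500, information that AVG alone does not carry.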
Statisticians have found certain ways to normalize these statistics, such as by accounting for certain inequities present in the environments under which different players were playing. For example, baseball has the unusual trait of having a field of play that is somewhat loosely defined by rules. While the bases and pitching rubber are positioned at fixed distances and angles relative to one another, fields may vary considerably in terms of the distance to the outfield fence, the height of the fence, the amount of playable foul territory, the quality of the hitting background (the background a batter sees for visual contrast as the pitched baseball approaches him), and other environmental conditions such as air temperature, humidity, wind, lighting, etc. In an attempt to provide more comparable statistics for players, some have created “ballpark factors” meant to indicate how easy or difficult it is to score runs or hit home runs in certain ballparks, using the ballpark factors to adjust other statistics into a more fairly comparable state.
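One simple way such a factor could be formed is sketched below in Python. This is an illustrative assumption, not the method of any particular statistician: it compares the combined runs per game (both teams) in a club's home games with those in its road games, then scales a player's statistic by the factor weighted by the share of games he played at home. All names and numbers are hypothetical.

```python
def park_factor(runs_in_home_games: int, home_games: int,
                runs_in_road_games: int, road_games: int) -> float:
    """Ratio of runs per game (both teams combined) at home versus on the road.

    A value above 1.0 suggests a park that inflates scoring; below 1.0, one
    that suppresses it.
    """
    home_rate = runs_in_home_games / home_games
    road_rate = runs_in_road_games / road_games
    return home_rate / road_rate


def park_adjusted(stat: float, factor: float, home_share: float = 0.5) -> float:
    """Scale a raw statistic to a neutral-park equivalent.

    home_share is the fraction of the player's games played in the park the
    factor describes; the remaining games are treated as neutral (factor 1.0).
    """
    effective_factor = home_share * factor + (1.0 - home_share)
    return stat / effective_factor


# Hypothetical season: 810 total runs in 81 home games, 648 in 81 road games.
factor = park_factor(810, 81, 648, 81)
print(factor, park_adjusted(90, factor))
```

Under these assumptions the park inflates scoring by 25 percent, so a player who scored 90 runs while playing half his games there would be credited with the equivalent of 80 runs in a neutral setting. A single multiplicative factor of this kind leaves the other environmental variables listed above unaddressed.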
Defensive statistics have notably lagged offensive statistics in terms of their ability to reflect a player's contribution. While known offensive measures still have room for improvement, the traditional “best” defensive statistic of fielding percentage, a number that reflects the frequency with which a particular player commits an error relative to the total number of “chances” he faces, is now widely recognized as having particularly significant shortcomings. Among these is the fact that identifying a play result as an error is a very subjective human decision made by a different official scorer in each ballpark. Also among these shortcomings is the fact that players who have greater range and get to more balls (i.e., have higher TC numbers) often have more errors because their chances are, on average, more difficult than those of players with less range who handle only balls hit more directly to them. The additional range, however, may more than offset a lower fielding percentage.
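The range effect can be illustrated with a small worked comparison in Python. The two fielding lines below are invented for illustration and do not describe any real player; the class and attribute names are likewise assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FieldingLine:
    putouts: int
    assists: int
    errors: int

    @property
    def total_chances(self) -> int:
        # TC is the sum of putouts, assists, and errors.
        return self.putouts + self.assists + self.errors

    @property
    def plays_made(self) -> int:
        # Chances converted successfully.
        return self.putouts + self.assists

    @property
    def fielding_percentage(self) -> float:
        return self.plays_made / self.total_chances


# Hypothetical fielders at the same position over the same number of innings.
limited_range = FieldingLine(putouts=150, assists=242, errors=8)
greater_range = FieldingLine(putouts=170, assists=276, errors=14)

for name, line in (("limited range", limited_range), ("greater range", greater_range)):
    print(f"{name}: TC={line.total_chances} "
          f"plays={line.plays_made} FP={line.fielding_percentage:.3f}")
```

In this hypothetical, the fielder with greater range has the lower fielding percentage (about .970 against .980) and six more errors, yet he converts 54 more chances into successful plays, which fielding percentage alone does not reveal.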
Many advanced statistics have been created to normalize one or two statistics relative to one or two variables in order to achieve “fairer” statistics for comparison purposes, especially on offense in baseball. Many times, however, other factors that have not been normalized continue to plague the “normalized” data and still render it dubiously fit or completely unfit for its desired purposes. On defense, some have tried other techniques, with disputed degrees of success. There have been a number of variants of the “Zone Rating”, for example, which attempt to measure the percentage of balls hit within a particular topographical zone on the field, defined as the responsibility of a particular fielder, that the fielder successfully fields. The problems tend to come in defining the zones and normalizing for teammates, ballpark, and other variable conditions. If one were trying to rate a centerfielder using this method, for example, one could define a particular area on the field (often done with polar coordinates, i.e., angles and distances from the back corner of home plate), but a centerfielder may often start to the left or right of the center of that area based on a hitter's or pitcher's handedness or skill set. Similarly, he may start shallower or deeper than would be ideal to cover the pre-defined area. Typically, for example, fielders adjust their starting points from batter to batter as the batter-pitcher matchups and the base-out situations change. Often, a fielder even moves within a single batter's plate appearance, as the ball-strike count may affect various strategies and probabilities. A player may even start a play completely outside the zone he is being statistically measured to cover, or outside all of the multiple zones he may be assigned to cover. When certain “extreme pull” hitters bat, for example, the fielders may radically shift position on the field based on those hitters' tendencies.
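A fixed-zone rating of the kind described can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not any published Zone Rating variant: the zone is a polar wedge measured from the back corner of home plate, the angle convention (0 degrees toward straightaway center field, negative toward left field) is assumed, and all names and bounds are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass(frozen=True)
class BattedBall:
    angle_deg: float      # assumed convention: 0 = straightaway center field
    distance_ft: float    # distance from the back corner of home plate
    fielded: bool         # True if the rated fielder converted the ball into an out


@dataclass(frozen=True)
class Zone:
    min_angle_deg: float
    max_angle_deg: float
    min_distance_ft: float
    max_distance_ft: float

    def contains(self, ball: BattedBall) -> bool:
        return (self.min_angle_deg <= ball.angle_deg <= self.max_angle_deg
                and self.min_distance_ft <= ball.distance_ft <= self.max_distance_ft)


def zone_rating(balls: Iterable[BattedBall], zone: Zone) -> Optional[float]:
    """Fraction of balls hit into the zone that the fielder successfully fielded.

    Returns None when no ball landed in the zone, since the rating is undefined.
    """
    in_zone = [ball for ball in balls if zone.contains(ball)]
    if not in_zone:
        return None
    return sum(ball.fielded for ball in in_zone) / len(in_zone)


# Hypothetical centerfield zone and batted balls.
center_field = Zone(-10.0, 10.0, 280.0, 400.0)
sample = [
    BattedBall(0.0, 350.0, True),
    BattedBall(5.0, 320.0, False),
    BattedBall(-8.0, 380.0, True),
    BattedBall(30.0, 350.0, True),    # outside the zone: ignored even though fielded
    BattedBall(0.0, 150.0, False),    # too shallow: outside the zone
]
print(zone_rating(sample, center_field))
```

The sketch exhibits the limitation discussed above: the zone is static, so a ball the fielder caught after starting well outside it earns no credit, and nothing accounts for where he was positioned when the pitch was thrown.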
There is no widely accepted valuable measure of team defense in baseball, and many rely on very crude data, such as the total number of errors committed by all the players on the team, the total fielding percentage of the team, or the total number of unearned runs allowed by the team. Thus, there is a significant need for better evaluative statistics relating to individual and team defense in baseball and cooperative team activities in sports generally. In particular, there is a need to achieve normalization with regard to a wider array of variables that may dissimilarly affect different players and different teams.
Other sports also have these issues, and basketball in particular has a great need for enhanced statistical analysis to evaluate players, especially given the nature of play in the sport and the economics behind professional basketball. Many accumulative and frequency statistics have been developed and used for individual basketball players, and some team statistics have been developed as well, though many are just the sum of the statistics of the teams' players. Points scored (P), offensive rebounds (OR), defensive rebounds (DR), total rebounds (R), assists (A), steals (S), blocked shots (B), free throws made (FTM), free throws attempted (FTA), fouls committed (F), minutes played (M), and games played (G) are examples of accumulative statistics used in basketball, while field goal percentage (FG %), free throw percentage (FT %), points per shot (PPS), minutes per game (M/G), points per game (P/G), rebounds per game (R/G), assists per game (A/G), steals per game (S/G), and blocked shots per game (B/G) are examples of frequency statistics used in basketball.
Basketball, however, is a very team-oriented game, and when a particular player changes teams, it is common to see big changes in the statistical results or contribution from such player. Different players have different skill sets that may or may not work well together as the players pass the ball to one another on offense looking for a shot that maximizes the team's scoring expectation on a particular possession of the basketball. Some teams may have many players proficient at shooting, but not enough who are proficient at making passes that get the shooters “open” for shots. Other teams may excel in getting open shots but lack the shooting expertise to convert these opportunities with sufficient frequency to compete effectively. Thus, it is very difficult to compare the overall value or contribution of individual players because their roles may vary not only on different teams but also on the same team as different teammates are substituted for one another within a game. Also, a particular player may look more productive on one team than another, such as a good shooter who has good passers on one team but not on another. Thus, the conventional accumulative and frequency measures do not do particularly well at effectively comparing the relative abilities of individual players. The general manager of a professional sports team, however, must attempt these types of comparisons to try to build a winning team, especially given the constraints of payroll or league salary caps. Thus, there is great need in very team-oriented sports, such as basketball, to develop improved measures of individual player contributions to facilitate building a better team.