In conversations about baseball statistics, the word “regression” is used quite often, but there are essentially two different meanings associated with the word, and it’s important to separate them because they mean different things.

Colloquially, the word “regress” is often used to mean movement backwards. The dictionary definition of this is something like “returning to a former or less developed state.” You will absolutely hear people use this word to describe baseball players. If a good player gets worse, they can be said to have regressed.

However, this is not usually what we mean when we are talking about baseball statistics, so it’s important to be precise with your terminology. We are typically talking about the statistical concept known as “regression to/toward the mean.” Regression toward the mean (RTM for clarity in this article) is the concept that any given sample of data from a larger population (think April stats) may not be perfectly in line with the underlying average (think true talent/career stats), but that going forward you would expect the next sample to be closer to the underlying average than the first sample. Observations tend to cluster around the average value, even if the previous value is unusual.

Imagine you have a player with a career OBP of .350. Let’s assume the league’s run environment has stayed the same and the player is around 28, so there is no particular reason to expect his talent level to change or for his OBP to spike due to a clear external factor. He is, as best as we can tell, a true talent .350 OBP hitter.

But now let’s imagine we observe his next 100 PA in which he posts a .300 OBP. What should we think about his next 500-600 PA based on the information we have? In other words, do those 100 PA at .300 OBP alter the way we think about the player and by how much?

Any sample of PA contains potentially useful information. Maybe he’s hurt, maybe he’s aging poorly, maybe the league learned to exploit a weakness. But when we are asked to assess this player, the previous five seasons carry a lot of weight. We don’t just forget about them because our player had a bad April. So to forecast his future performance, we need to consider RTM. It’s more likely that he will perform close to his career average (or some weighted version of it) than to the sample of plate appearances immediately preceding the question. Had those 100 PA come at a .400 OBP instead, the exact same properties would apply. To put it another way, any one small sample is less informative than a much larger sample even if the larger sample is slightly older. So when a player gets off to a hot or cold start, we want to factor in RTM.

Keep in mind there is no “correct” way to account for RTM in baseball. It’s a conceptual framework, and like most conceptual frameworks there are exceptions. Players’ underlying true talent does change from time to time based on a variety of factors. If a pitcher learns a new pitch, their history is still useful, but it’s much less useful than it is for a pitcher who is using their same arsenal.

The idea behind using RTM in baseball is that we can’t directly measure true talent; we simply infer it from observing outcomes on the field. Baseball has a lot of randomness that makes individual observations fluctuate around the player’s true talent. Picture a line drive being caught by a leaping defender and a weak grounder finding a hole. Because we can’t measure true talent directly, we can’t say for sure when it changes and when we are simply observing a set of data points that are different from that talent level for unrelated reasons. In other words, because of the randomness (factors unrelated to the talent of the player we care about) involved in generating baseball outcomes, it takes a long time for the statistics we create to tell us exactly how good a player is. This means that any one section of data might not be a clear reflection of the underlying average. So going forward, we expect the data to look more like the overall numbers rather than a single, recent sample. We must regress any new data toward the mean.

As I noted, this is not a formulaic rule. But RTM accounts for the fact that you can observe outcomes that are not in line with a player’s true talent simply due to randomness, and that going forward true talent is a better predictor. We can only observe outcomes, but we care about talent.
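To make the .350 career / .300 over 100 PA example concrete, here is a minimal Python sketch of one possible way to weigh the two samples. It is an illustration under stated assumptions, not a formula the article endorses: the article is explicit that there is no “correct” way to account for RTM. The function names `blended_obp` and `simulate_obp` are hypothetical, the 3,000 career PA figure is an assumption (roughly five seasons at 600 PA each), and a simple PA-weighted average stands in for “some weighted version” of the career line. The simulation treats each PA as an independent coin flip at the true talent level, which is a simplification.

```python
import random


def blended_obp(career_obp, career_pa, recent_obp, recent_pa):
    """PA-weighted average of a career line and a recent sample."""
    total_pa = career_pa + recent_pa
    return (career_obp * career_pa + recent_obp * recent_pa) / total_pa


def simulate_obp(true_obp, pa, rng):
    """Observed OBP over `pa` plate appearances for a fixed true talent."""
    times_on_base = sum(rng.random() < true_obp for _ in range(pa))
    return times_on_base / pa


# Assumption: about 3,000 career PA (five seasons at ~600 PA each).
estimate = blended_obp(0.350, 3000, 0.300, 100)
print(f"Blended estimate: {estimate:.3f}")

# How much can 100 PA wander for a hitter whose true talent never changes?
rng = random.Random(0)  # fixed seed so the run is repeatable
samples = [simulate_obp(0.350, 100, rng) for _ in range(1000)]
print(f"Lowest 100-PA OBP:  {min(samples):.3f}")
print(f"Highest 100-PA OBP: {max(samples):.3f}")
```

Under these assumptions the blended estimate comes out at about .348, much closer to the career .350 than to the recent .300, because 100 PA carry little weight next to 3,000. The simulated spread shows the other half of the argument: a hitter with an unchanged .350 true talent will still post 100-PA stretches well above and below that mark through randomness alone.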