This video provides a beginner-friendly introduction to MLB's Statcast system and the Baseball Savant website. It explains how Statcast uses high-speed cameras and data to analyze player performance, focusing primarily on hitting and pitching. The video demonstrates how to use Baseball Savant to explore individual player statistics, expected metrics, pitch movement, and zone performance, illustrating these concepts with examples of Major League players.
Major League Baseball, and baseball as a whole, is driven by numbers. Since the early 2000s and the rise of Moneyball, baseball has paved the way for analytics in sports. Not only is baseball at its core a simple battle between pitchers and hitters, it's a game where minute differences can make one player worth $100 million and someone else only $10 million. Also, the fact that every pitcher's and hitter's goal is the same every time makes each player relatively easy to compare to one another. Sure, every player is different, but ultimately every pitcher is rated by how well they get outs and every hitter by how well they get hits. For the sake of this video not being many hours long, I'm not going to delve into the depths of fielding and base running analytics, but just realize that hitting and pitching are the two most important parts of the game, and every play that happens in baseball starts with those two players. With that being said, this is how to Statcast. [Music]
Statcast is the technology and data that is collected by at least 13 high-speed cameras set up in every Major League ballpark. Statcast goes hand in hand with Baseball Savant, which is the beautiful website run by the MLB that turns the raw Statcast data into graphs and charts that us fans and analysts can understand. When I talk about the Statcast or Savant website, it all means the same thing. When you get to the Savant homepage, you'll be greeted with some articles and league news along with the day's slate of games.
The first way to use Statcast is by looking at individual players. And like everything else, there are layers to how deep you can go with this. For example, let's look at one of the best first basemen in the league, Bryce Harper. Once a first-pick outfielder for the Washington Nationals, Harper is now killing it at first base in Philadelphia.
We can look at his dominant 2021 season, where he somehow wasn't considered a top-eight outfielder in the National League and didn't make the All-Star team. Statcast shows standard stats like All-Star appearances, batting average, and other common statistics, and color-codes them, with darker red meaning a higher percentile in the league. By hovering over it, we can see that Harper was third in the NL in batting average in 2021, hitting .309. However, the real magic is up top, where Statcast uses actual data to provide expected metrics. For example, xBA, or expected batting average, uses Statcast's batted-ball history as a basis for comparison. Every batted ball since 2015 has been tracked in Statcast, and now every new ball that is hit can be compared to ones in the past in terms of exit velocity, launch angle, and the sprint speed of the batter. Along with actual strikeout totals, Statcast can use that data to come up with xBA based on the quality of contact instead of the actual outcomes. Hitters, and likewise pitchers, are able to influence exit velocity and launch angle, but have no control over what happens to a batted ball once it's put into play. So these expected metrics can help account for disparities in defensive shifts, weather, and ballparks.
For all of these advanced metrics, we can see them change year by year, along with a line graph for the expected weighted on-base average and a spray chart of actual hits. If we click on one of these hits, let's say this home run, we can watch a replay of the clip from the home and away broadcasts and, on the right, a bunch of info about the pitch. "Number two in the first, number eight in the fourth. And Harper hits one out to deep center. Jacob Young chasing, and this game is tied. You just can't make it up. Bryce Harper, only his second career at-bat against Lucas Sims, and he takes him way out to dead center."
"...and Bryce having a nice..." This home run went 415 ft and would have been a home run in 25 out of the 30 MLB ballparks. If I click on that, I can see the five stadiums with deep center fields where Harper's hit would have stayed in the yard.
Going back to his main page, we can look at split stats and game logs for Harper, like how last year he only ever hit third in the Phillies lineup and batted almost 50 points higher at night than during the day, along with his stats for literally every game of his career. This is what makes Statcast so valuable: the immense amount of data that is available. And with every new data point that is acquired from new games, Savant's expected stats just get more and more accurate. They're also constantly adding new features, like the bat speed metric that was just added in 2023.
We can also look at some of these player apps, like shifts, which is less relevant now that the MLB added stricter shift rules, but we can still see that center fielders play Harper a little towards left and the whole infield is shifted to the right side. Clicking through Harper's spray charts shows that he is good at hitting to all fields, but taking a peek at his zone swing profile shows that he's a lefty hitter who swings at high inside pitches more than anything else, and his batting average correlates with that. We can also click on his swing/take profile and see that Harper chases pitches outside of the zone at a rate 12% higher than league average. In turn, pitchers throw him fewer pitches in the strike zone than league average. Last year, Harper's chase percentage was in the 19th percentile, one of the worst facets of his game. But obviously, he still walks a lot and has no trouble hitting for a high average. No player is perfect, and Statcast is the perfect tool for analyzing players' weaknesses.
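If you want to see roughly how an expected stat like xBA could work under the hood, here's a minimal sketch in Python. To be clear, this is not MLB's actual model. It assumes a simple "find similar past batted balls" approach using only exit velocity and launch angle (sprint speed is left out), it treats every batted ball as an at-bat, and the names `BattedBall`, `hit_probability`, and `expected_batting_average`, the tolerance values, and all of the data are made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class BattedBall:
    exit_velocity: float   # mph off the bat
    launch_angle: float    # degrees
    is_hit: bool = False   # only meaningful for historical batted balls


def hit_probability(ball, history, ev_tol=2.0, la_tol=3.0):
    """Share of similar historical batted balls that went for hits.

    "Similar" here means within ev_tol mph and la_tol degrees. Both
    tolerances are assumptions chosen for this sketch.
    """
    similar = [
        h for h in history
        if abs(h.exit_velocity - ball.exit_velocity) <= ev_tol
        and abs(h.launch_angle - ball.launch_angle) <= la_tol
    ]
    if not similar:
        return 0.0  # no comparable history; a real model would smooth instead
    return sum(h.is_hit for h in similar) / len(similar)


def expected_batting_average(batted_balls, strikeouts, history):
    """Expected hits divided by at-bats (batted balls plus strikeouts).

    Simplification: every batted ball counts as an at-bat, so sacrifice
    flies and bunts are ignored.
    """
    at_bats = len(batted_balls) + strikeouts
    if at_bats == 0:
        return 0.0
    expected_hits = sum(hit_probability(b, history) for b in batted_balls)
    return expected_hits / at_bats


# Made-up history: the more of it there is, the steadier the estimates get.
history = [
    BattedBall(105, 25, True), BattedBall(104, 27, True),
    BattedBall(106, 24, True), BattedBall(105, 26, False),  # hard fly balls
    BattedBall(85, 50, False), BattedBall(84, 48, False),   # pop-ups
    BattedBall(95, 10, True), BattedBall(96, 12, False),    # line drives
]

# A made-up hitter: three balls in play and one strikeout.
season = [BattedBall(105, 25), BattedBall(85, 49), BattedBall(95, 11)]
xba = expected_batting_average(season, strikeouts=1, history=history)
print(f"xBA: {xba:.3f}")
```

Notice that the hitter's real outcomes never show up in the calculation. Only the quality of contact and the strikeouts do, which is the whole point of an expected stat.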
If you want to dive into very specific zones, you can go into the Illustrator, where you can add, let's say, the last five years of data, change the zone to batting average, and set the minimum pitch speed to 96 mph. There are still a couple of zones where Harper happens to do well, but overall his average on high-velocity pitches is only .217. If we bump the minimum velocity down to 86, 10 mph slower, you can see his zones heat up and his batting average jump up to .304. We can even click on specific pitchers and see that against Sandy Alcantara in the last five years, Harper has only hit .261 with one home run. We can also check out stuff like his pitch heat map, which shows pitchers smartly pitching him down and away. We can also look at his radial chart displaying launch angle and see which angles give Harper the most home runs, singles, and outs. The upper gray area is fly outs and the lower gray area is ground outs. That'll be it for hitters. Obviously, you can go as far as you want into this, but that's a pretty good basic overview of batters in Statcast. [Music]
With pitchers, believe it or not, it gets even more complicated. There are so many different formulas for being a successful pitcher, which makes them more difficult to compare to each other. How can you compare a pitcher like Kyle Hendricks, who throws an average of 86 mph, to a pitcher like Justin Martinez, who averages over 100? Hendricks was top five in wins three times in his career, and his different approach may have actually helped him with longevity, as he's in his 12th season of pitching in the major leagues. There are pitchers that only use two pitches, like Kenley Jansen, and pitchers that use many pitches, like Yu Darvish, who uses at least seven different types. Just as an example to show you the basics, let's take a look at 2024 Corbin Burnes. He was one of the best starting pitchers in baseball last year, sporting a 2.92 ERA and a 94th-percentile total run value, preventing 22 runs more than league average.
On the right, instead of the batter's spray chart, we get a pitcher's movement profile. This is the perfect summary of how effective each pitch is for a certain pitcher. For instance, Burnes's over-the-top 44-degree arm angle helps his famous cutter get 12.5 inches of rise on average, which is over 4 inches more than the average right-handed pitcher. Burnes uses his cut fastball 45% of the time, with the rest of his arsenal being made up of a curveball-slider combo. Also, 20% of the time he throws a pitch with opposite movement, displayed by the green and orange here, which are changeups and sinkers. All of Burnes's pitches have significantly more movement than league-average pitchers', which, combined with his 97 mph average fastball velo and his pinpoint control, makes for one of the best pitchers in the major leagues.
Just like the hitter spray charts, we can click on any one of these pitches and view a clip, like this filthy cutter to strike out Giancarlo Stanton. "Swing and a miss. Stanton down on strikes. No runs, no hits, no errors, and nobody left." If we go deeper into this game, we can see that Stanton actually homered off of Burnes earlier in the game, when Burnes tried to throw his least common pitch, a sweeper, twice in a row. He only throws those 2% of the time. He tried to trick Stanton, and he wasn't fooled. "High fly ball, deep left. Hower back, on the track, at the wall. See ya! Home run, Stanton. One-nothing Yanks."
If you want to see pitchers' arsenals a little more realistically, we can see their heat maps and spin types below. This helps you see where pitchers like to throw in the zone and how accurate they are. Of course, Burnes is elite, but we can look at someone like this year's Roki Sasaki, who is struggling with his control. We can also look at pitch tracking by season. For example, Burnes so far this season is throwing 61% cutters, more than in any other season of his career.
We can see he also came into the league throwing a four-seamer and then replaced it with his cutter in 2019. And obviously, the rest is history.
The last basic step of using Statcast is the leaderboards tab, which can help you compare many players' stats at once, even expected stats. We can see the highest wOBAs so far this year and then the highest expected wOBAs. The differential shows you the players that are overperforming the most based on their expected stats. And if we click it again, it shows the players that are underperforming the most and should start seeing better results. This is where players can seem "lucky" and "unlucky." But some players can consistently overperform their expectations. For example, the fourth-luckiest player last year was Geraldo Perdomo, a light-hitting shortstop for the Arizona Diamondbacks. However, clicking on his profile shows that he never swings at bad pitches, always hits the ball off the sweet spot, and is simply very annoying for pitchers to face. Even though he doesn't hit the ball very hard, Perdomo is elite at finding other ways to get on base, which is why he outperforms his expected stats.
There are a whole bunch of leaderboards that show various stats, but the key takeaway from this is that no single statistic can describe a whole player or group of players. To truly master Statcast and understand how good or bad a player is, you have to look at many different stats and types of data to come to a conclusion. There are also many different types of visuals that help display other specific advanced stats. In the visuals tab, there are dozens of different data points that range from batting stances to animated travel schedules. Statcast is meant to be fun and help baseball fans get more out of the game that they love. Also, if you're playing fantasy baseball or making bets on games, using Statcast can help you gain an edge over your opponents or sportsbook.
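The leaderboard differential from a minute ago is simple enough to sketch yourself. Here's a rough Python version, assuming the differential is just actual wOBA minus expected wOBA. The `HitterLine` class, the `rank_by_differential` function, and the three players with their numbers are all hypothetical, not real Savant data or a real Savant API.

```python
from dataclasses import dataclass


@dataclass
class HitterLine:
    name: str
    woba: float    # actual weighted on-base average
    xwoba: float   # expected weighted on-base average

    @property
    def differential(self) -> float:
        # Positive: results are ahead of contact quality ("lucky").
        # Negative: results are behind contact quality ("unlucky").
        return self.woba - self.xwoba


def rank_by_differential(hitters, overperformers_first=True):
    """Sort hitters by wOBA minus xwOBA, like clicking the column header."""
    return sorted(hitters, key=lambda h: h.differential,
                  reverse=overperformers_first)


# Hypothetical stat lines, made up for this sketch.
hitters = [
    HitterLine("Player A", 0.350, 0.310),
    HitterLine("Player B", 0.300, 0.345),
    HitterLine("Player C", 0.330, 0.328),
]

for h in rank_by_differential(hitters):
    print(f"{h.name}: {h.differential:+.3f}")
```

Passing `overperformers_first=False` flips the order, which is the "click it again" view of the players who should start seeing better results. Just remember the Perdomo example: a big positive differential doesn't automatically mean luck.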
I hope you enjoyed this introduction to the basics of Statcast and Baseball Savant. I'm Josh Davis, the Analytical Lion. You can check out some of my other content on the channel about Penn State football and Penn State basketball, which is what I specialize in. Feel free to leave a like and subscribe. And please check out Statcast. Go play around with it. The best way to learn is by experiencing it and clicking around and then seeing how that correlates to actual baseball games. That'll be all. I'll see you around.