The SAS Batting Lab. Hitting a data literacy home run with analytics, IoT, AI, and incredible participants!
Sports Industry Data and Analytics – 2023 Article Series Intro
Over the past couple of years, I have enjoyed discussing sports industry data and analytics trends, hot topics, and various stories through my LinkedIn series.
As I’ve mentioned in the past, analytics is not meant to rip and replace every facet of good old-fashioned decision-making, but to use data and analytics to help empower us to make better decisions, faster, and based on the facts.
In this article we’ll be looking at a project that touches upon these principles and does a great job helping participants understand the value of data and the impact that understanding it can have on their lives.
This first installment of my 2023 series looks at an amazing, award-winning project that SAS worked on this past year. Without giving too much away, I’ll leave you with a hint: a quote from one of my favorite movies, whose story was in part responsible for accelerating the use of data and analytics throughout sports. “How can you not be romantic about baseball?”
Helping kids believe in data through the Power of Sports and 50,000 data points
With the Major League Baseball season and youth baseball and softball leagues in full stride, we will touch upon the exciting SAS Batting Lab project. As we know when analyzing data, it’s important to look backward before we move ahead, and I’ll take this approach with this year’s article series. This was an incredible project, and I’m grateful to have been given a chance to play a small part in its success. It was managed by an amazing team of engineers, marketers, communications professionals, and many others in the SAS ecosystem, including partners like North Carolina State University athletics, McCann New York, Volvox Labs, and Garner Sports League, who made major contributions to this effort. The result was a triple play: the kids improved their data literacy, their understanding of analytics, and, of course, their batting swings!
Data Literacy – Driving Home the Power of Data
One of the most important things SAS does each year is help drive positive change across the education landscape. We play a part in unlocking and demystifying data to help institutions of higher education, K-12 schools, state departments of education, state and local governments, and various education agencies understand their data, so that learners of all ages can succeed and administrators, educators, and everyone involved in teaching are empowered.
A statement made in an NBC story about the Batting Lab summed things up nicely: “We don’t need all kids to grow up to be data scientists, we need them to be data believers.” I love that statement. What I took away from it is that not all of us are blessed with engineering and statistical brain cells (including me), but, most importantly, that if kids can buy into the importance of data, they will be empowered. Empowered to make better decisions, faster, and based on the facts.
Check out the SAS Batting Lab Data Playbook showcasing the same data and methodology used in the Batting Lab that provides kids with an approachable way to understand data by applying it to a sport they already know and love.
The Technology:
This state-of-the-art batting lab was set up as a next-generation batting cage, designed with help from our partners at the McCann agency to captivate and engage kids and people of all ages. In addition to the cage exterior and check-in kiosk, SAS used the following hardware as part of the data collection process:
Additional and crucial components of SAS technology used in the Batting Lab included the following:
The Model Powering the Analysis:
As mentioned above, I am not an engineer or a programmer and, like many in the industry, have been on a lifelong mission to “learn enough to be dangerous” in understanding and describing data, analytics, and technology. Fortunately, I have the pleasure of working with some of the brightest data, analytics, and technology experts here at SAS. One of those colleagues is Ji Shen, a Senior Research Statistician Developer in our Research and Development Division. This section is inspired by, and provides a glimpse of, the modeling process described in a great article Ji Shen authored.
And for those eager to dive into the detail (statistics, modeling, scoring), please read his blog post about “The model powering the analysis”.
Malcolm Gladwell, in his book Outliers: The Story of Success, brings to light the concept of the “10,000-Hour Rule,” under which it takes the average person 10,000 hours to master something (a sport, a profession, a skill, etc.). Collegiate and professional baseball coaches can analyze a swing and come up with answers and prescriptive instruction to help improve it because of their years of experience with the sport. In this case, however, we needed to create a machine that would serve as a coach and provide instant feedback to the kids. To achieve this, we had to train the machine and help it understand what a good swing looks like.
The Batting Lab uses Hidden Markov Models (HMMs), which have been used in the past for things like speech recognition, facial expression recognition, and gene prediction. Along with an optimization process, the HMMs analyze each frame of the swing in phases, guided by a set of parameters that determine how the data is generated in each phase.
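To make the idea of "phases" more concrete, here is a minimal sketch of how an HMM can assign each frame of a swing to a phase using the standard Viterbi algorithm. It is an illustration only, not the Batting Lab's actual model: the three phases, the single "bat speed" feature, the Gaussian emission assumption, and every number below are hypothetical.

```python
import math

NEG_INF = float("-inf")

def safe_log(p):
    """Natural log that maps a zero probability to -infinity."""
    return math.log(p) if p > 0 else NEG_INF

def gaussian_logpdf(x, mean, std):
    """Log density of a one-dimensional normal distribution."""
    return -0.5 * math.log(2 * math.pi * std ** 2) - (x - mean) ** 2 / (2 * std ** 2)

def viterbi(observations, log_start, log_trans, means, stds):
    """Return the most likely phase index for every frame."""
    n_states = len(means)
    # best[t][s]: log-probability of the best path that ends in phase s at frame t
    best = [[log_start[s] + gaussian_logpdf(observations[0], means[s], stds[s])
             for s in range(n_states)]]
    back = []  # back[t][s]: best previous phase for phase s at frame t + 1
    for x in observations[1:]:
        prev = best[-1]
        row, ptr = [], []
        for s in range(n_states):
            p = max(range(n_states), key=lambda q: prev[q] + log_trans[q][s])
            row.append(prev[p] + log_trans[p][s] + gaussian_logpdf(x, means[s], stds[s]))
            ptr.append(p)
        best.append(row)
        back.append(ptr)
    # Walk backward from the best final phase to recover the full path.
    state = max(range(n_states), key=lambda s: best[-1][s])
    path = [state]
    for ptr in reversed(back):
        state = ptr[state]
        path.append(state)
    return path[::-1]

# Hypothetical phases and parameters (assumptions for illustration only).
PHASES = ["load", "swing", "follow-through"]
means = [1.0, 8.0, 3.0]   # typical bat speed in each phase, arbitrary units
stds = [1.0, 1.5, 1.0]
log_start = [safe_log(p) for p in (1.0, 0.0, 0.0)]   # a swing always starts in "load"
log_trans = [[safe_log(p) for p in row] for row in (
    (0.8, 0.2, 0.0),   # load -> load or swing
    (0.0, 0.8, 0.2),   # swing -> swing or follow-through
    (0.0, 0.0, 1.0),   # follow-through is the final phase
)]

bat_speed = [0.8, 1.2, 1.1, 6.5, 8.2, 9.0, 7.5, 3.2, 2.9, 3.1]   # made-up frames
path = viterbi(bat_speed, log_start, log_trans, means, stds)
print([PHASES[s] for s in path])
```

In this toy example the transition table only allows a swing to move forward through its phases, which is one common way to encode the fact that a swing has a natural order.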
Model training, and ultimately model selection, in this project depends heavily on deciding how many phases the models contain and on the initial parameters. The goal is to search across models trained with different combinations of the number of phases and the random seed. The Bayesian Information Criterion (BIC) is used to select the model because it penalizes complexity, so the simplest model that still fits the data well is preferred. We then saved our winning model in our Microsoft Azure Cloud environment.
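For readers who like to see the arithmetic, the selection step can be sketched as follows. BIC is computed as k·ln(n) − 2·ln(L), where k is the number of free parameters, n is the number of observations, and L is the fitted likelihood; the candidate with the lowest BIC wins. The candidate list, log-likelihood values, frame count, and parameter-counting rule below are made-up placeholders for illustration, not results from the Batting Lab.

```python
import math

def count_parameters(n_phases, n_features=1):
    """Free parameters of an HMM with diagonal Gaussian emissions (an assumption)."""
    start = n_phases - 1                    # initial phase probabilities
    transitions = n_phases * (n_phases - 1) # each row of the transition table sums to 1
    emissions = 2 * n_phases * n_features   # one mean and one variance per feature
    return start + transitions + emissions

def bic(log_likelihood, n_params, n_obs):
    """Bayesian Information Criterion: lower is better."""
    return n_params * math.log(n_obs) - 2.0 * log_likelihood

# Hypothetical training runs: one entry per (number of phases, random seed).
candidates = [
    {"n_phases": 2, "seed": 0, "log_likelihood": -1450.0},
    {"n_phases": 3, "seed": 0, "log_likelihood": -1380.0},
    {"n_phases": 3, "seed": 1, "log_likelihood": -1395.0},
    {"n_phases": 4, "seed": 0, "log_likelihood": -1375.0},
    {"n_phases": 5, "seed": 0, "log_likelihood": -1372.0},
]
N_FRAMES = 1000  # number of observed frames (placeholder)

for c in candidates:
    c["bic"] = bic(c["log_likelihood"], count_parameters(c["n_phases"]), N_FRAMES)

best = min(candidates, key=lambda c: c["bic"])
print(best["n_phases"], best["seed"], round(best["bic"], 2))
```

Notice that the four- and five-phase runs fit the placeholder data slightly better, yet the three-phase run is selected because the extra parameters are not worth the small gain in likelihood.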
In general terms, analytical scoring refers to the use of a predictive analytical model (in this case, the champion model noted above) to weight variables from the data set, resulting in a specific scoring metric. The system used in the Batting Lab relied on sensors and cameras (noted earlier in the article) to capture the batter’s movements.
The result was real-time scoring of each swing movement, allowing our AI to provide instant feedback and help the kids make corrections as they continued through their Batting Lab sessions.
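As a rough illustration of what "weighting variables into a score" can look like, here is a minimal sketch. It is not the Batting Lab's scoring method: the variable names, reference values, tolerances, and weights are all hypothetical, and a simple weighted average stands in for the model-based scoring described above.

```python
def swing_score(measured, reference, tolerance, weights):
    """Combine per-variable closeness to a reference swing into a 0-100 score."""
    total_weight = sum(weights.values())
    weighted_sum = 0.0
    for name, weight in weights.items():
        # Deviation from the reference, expressed as a fraction of the tolerance.
        deviation = abs(measured[name] - reference[name]) / tolerance[name]
        closeness = max(0.0, 1.0 - deviation)  # 1.0 is a perfect match, 0.0 is far off
        weighted_sum += weight * closeness
    return 100.0 * weighted_sum / total_weight

# Hypothetical variables and values (assumptions for illustration only).
reference = {"hip_rotation_deg": 45.0, "bat_speed_mph": 50.0}
tolerance = {"hip_rotation_deg": 10.0, "bat_speed_mph": 20.0}
weights = {"hip_rotation_deg": 1.0, "bat_speed_mph": 1.0}

measured = {"hip_rotation_deg": 40.0, "bat_speed_mph": 50.0}
score = swing_score(measured, reference, tolerance, weights)
print(score)
```

A per-variable breakdown like the `closeness` value is also what makes feedback actionable: a low value on one variable points to the part of the swing to correct next.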
Summary:
This massive effort was exciting to me for several reasons. Of course, its attachment to sports and America’s pastime was a driving factor. But after spending my first couple of years at SAS in what we call our “Education Practice,” where I had the chance to work with former educators and others passionate about seeing learners succeed, I know that this is what the project was created for, and it is something we can all be very passionate about.
And lastly, it was a privilege to see SAS colleagues across multiple business units come together with our partners and create something magnificent that will have a lasting and positive impact on the children who participated, their families, those who played a part in this magical experience, and everyone at SAS who helped deliver the message.
If you’d like to take a deeper dive into the SAS Batting Lab, please visit the SAS Batting Lab website and take a look at Ji Shen’s “The model powering the analysis”, and read Taking a swing at data literacy: an inside look at The SAS Batting Lab.
For more information on how SAS has impacted the sports industry, please visit https://www.sas.com/en_us/industry/sports.geo.html