Picking Up Top Performers from T20 World Cup
Rasagnatha Garrepalli
Data Analyst | SQL | Python | Azure | GCP | Tableau | Power BI
Co-Author: Ravi Teja G
Introduction
Cricket analytics is a fascinating field that aligns with my interest in human performance optimization and sports science, and with my interest and proficiency in data science and machine learning. In this project we blend sports science, data science, and machine learning to study the game. The project is not merely about numbers; it is an exploration of insights that can inform how teams approach player selection, game strategy, and overall performance optimization. Join us as we delve into cricket analytics, where each data point can reveal opportunities for strategic improvement and talent identification, and where plays and decisions on the field are backed by data-driven analysis.
Analysis of Cricket T20 World Cup Data
Extract:
The initial and challenging step in our Cricket Analytics project was the extraction of data from the ESPN site, accomplished through the use of a tool called Bright Collector. Web scraping, in general terms, refers to the automated process of extracting information from websites. It involves retrieving data directly from the HTML markup of web pages, allowing us to gather structured and unstructured data for analysis.
Bright Collector is a web scraping tool that enables us to navigate through the complexities of the ESPN site and systematically extract relevant cricket data. The challenge lies in the diverse formats and structures of web pages, as each site may have its own unique layout and coding. Bright Collector acts as a specialized agent, navigating through the web pages, identifying specific elements, and extracting the required data points.
The data collected from Bright Collector is structured in JSON format, providing a standardized and easily interpretable representation of cricket-related information, such as player statistics, match results, and team performance, essential for seamless integration into our Cricket Analytics project.
Transform:
After obtaining the data in JSON format from Bright Collector, we used Jupyter Notebooks and the Pandas library to transform and analyze the information. Pandas, a powerful data manipulation tool, facilitated tasks such as cleaning, filtering, and organizing the data. With Pandas, we renamed columns, handled missing values, and applied statistical computations, which improved the dataset's structure and coherence. Working in Jupyter Notebooks allowed for an interactive and iterative approach, ensuring the data was well prepared for the subsequent stages of our Cricket Analytics project.
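The transform steps described above can be sketched roughly as follows. This is a minimal illustration, not the notebook used in the project: the sample records, the field names (batsmanName, 4s, 6s) and the choice to treat missing runs as 0 are all assumptions made for the example.

```python
import json
import pandas as pd

# Hypothetical sample shaped like a scraped batting scorecard; the real
# field names in the collector's output may differ.
raw_json = """
[
  {"match": "Match A", "batsmanName": "Player 1", "runs": "45", "balls": "30", "4s": "4", "6s": "2"},
  {"match": "Match A", "batsmanName": "Player 2", "runs": "-",  "balls": "0",  "4s": "0", "6s": "0"},
  {"match": "Match B", "batsmanName": "Player 1", "runs": "12", "balls": "10", "4s": "1", "6s": "0"}
]
"""

records = json.loads(raw_json)
df = pd.DataFrame(records)

# Rename columns to consistent snake_case names.
df = df.rename(columns={"batsmanName": "batsman_name", "4s": "fours", "6s": "sixes"})

# Convert numeric columns; non-numeric placeholders such as "-" become NaN.
numeric_cols = ["runs", "balls", "fours", "sixes"]
df[numeric_cols] = df[numeric_cols].apply(pd.to_numeric, errors="coerce")

# Handle missing values: missing runs are treated as 0 here (an assumption).
df["runs"] = df["runs"].fillna(0).astype(int)

# A simple statistical computation: total runs and balls per batsman.
totals = df.groupby("batsman_name", as_index=False)[["runs", "balls"]].sum()
print(totals)
```

Coercing with errors="coerce" before filling keeps the cleaning rule explicit, so a different policy (for example dropping such rows) can be swapped in at one place.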
Load:
After transforming the data in Jupyter using Pandas, we loaded it into Microsoft SQL Server (MSSQL) using the pyodbc library in Python. This process involved establishing a connection to the MSSQL database and executing SQL commands to store the refined dataset. Subsequently, in Power BI, we leveraged the MSSQL connection to create insightful visualizations. Power BI's seamless integration with MSSQL allowed us to build interactive dashboards and reports, providing a user-friendly interface to explore and gain actionable insights from the cricket analytics data.
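The load step can be sketched as a parameterized bulk insert. To keep the example runnable without a SQL Server instance, it uses Python's built-in sqlite3 as a stand-in; pyodbc follows the same DB-API pattern (cursor, "?" placeholders, executemany, commit). The table name batting_summary, its columns, and the connection string shown in the comment are hypothetical.

```python
import sqlite3

def insert_rows(conn, table, columns, rows):
    """Insert rows through a DB-API connection using '?' placeholders."""
    placeholders = ", ".join("?" for _ in columns)
    sql = f"INSERT INTO {table} ({', '.join(columns)}) VALUES ({placeholders})"
    cur = conn.cursor()
    cur.executemany(sql, rows)
    conn.commit()
    return sql

# Stand-in connection. With SQL Server this would instead be something like:
#   import pyodbc
#   conn = pyodbc.connect(
#       "DRIVER={ODBC Driver 17 for SQL Server};SERVER=<server>;"
#       "DATABASE=<database>;Trusted_Connection=yes;"
#   )
# where the driver, server and database values depend on your environment.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE batting_summary (batsman_name TEXT, runs INTEGER, balls INTEGER)"
)

rows = [("Player 1", 57, 40), ("Player 2", 0, 0)]
insert_rows(conn, "batting_summary", ["batsman_name", "runs", "balls"], rows)

count = conn.execute("SELECT COUNT(*) FROM batting_summary").fetchone()[0]
print(count)
```

Passing values as parameters rather than formatting them into the SQL string avoids quoting problems with player names and lets the driver handle type conversion.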
Workflow Diagram:
Problem Statement
The challenge presented is the selection of an optimal cricket team based on comprehensive Cricket World Cup data. The task involves curating a squad of 11 players with a specific composition of 4-5 batsmen, 2-3 all-rounders, and 3 fast bowlers. To do this, we apply data visualization techniques to Cricket World Cup statistics, considering variables such as batting averages, all-round performance indices, and bowling strike rates. The goal is to highlight key player attributes so that we can make informed decisions in the team selection process. The task combines data visualization with cricket analytics to produce a clear, informative representation that aids in forming a strong team for the World Cup.
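The composition rule above can be expressed as a small check. The sketch below is only an illustration: the role labels and placeholder player names are assumptions, and it validates counts per role, not player quality.

```python
from collections import Counter

# Allowed (minimum, maximum) players per role, taken from the problem statement.
ROLE_LIMITS = {"batsman": (4, 5), "all_rounder": (2, 3), "fast_bowler": (3, 3)}
SQUAD_SIZE = 11

def composition_problems(squad):
    """Return a list of rule violations for a {player: role} mapping."""
    problems = []
    if len(squad) != SQUAD_SIZE:
        problems.append(f"squad has {len(squad)} players, expected {SQUAD_SIZE}")
    counts = Counter(squad.values())
    for role, (low, high) in ROLE_LIMITS.items():
        n = counts.get(role, 0)
        if not low <= n <= high:
            problems.append(f"{role}: {n} selected, allowed {low}-{high}")
    for role in sorted(set(counts) - set(ROLE_LIMITS)):
        problems.append(f"unknown role: {role}")
    return problems

# Placeholder squad: 5 batsmen, 3 all-rounders, 3 fast bowlers.
valid_squad = {f"Batsman {i}": "batsman" for i in range(1, 6)}
valid_squad.update({f"All-rounder {i}": "all_rounder" for i in range(1, 4)})
valid_squad.update({f"Fast bowler {i}": "fast_bowler" for i in range(1, 4)})

# Same squad with one all-rounder removed, leaving only 10 players.
short_squad = dict(valid_squad)
del short_squad["All-rounder 3"]

print(composition_problems(valid_squad))
print(composition_problems(short_squad))
```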
Analysis
Analyzing cricket data using Power BI and MSSQL Server offers a robust and integrated solution. Power BI, with its user-friendly interface, facilitates dynamic data visualization, allowing users to create interactive dashboards and reports. By connecting Power BI to MSSQL Server, the analytical process benefits from the database's relational structure, ensuring efficient data storage and retrieval. This synergy empowers users to explore and derive insights from cricket analytics seamlessly. The combination of these tools provides a comprehensive platform for in-depth exploration of cricket statistics, enabling us to make data-driven decisions and gain valuable insights into player performance, team dynamics, and strategic considerations.
Data-Modeling:
The first step is connecting MSSQL Server to Power BI, which streamlines data analysis. Users can establish a direct connection, import tables, and join them within Power BI. This integration allows for up-to-date data access, comprehensive visualizations, and insightful analysis, making it a powerful tool for informed decision-making.
Key Measures:
In Power BI, key measures, created using Data Analysis Expressions (DAX), enhance analytical insights. For cricket metrics such as Strike Rate, Total Balls Faced, and Bowling Average, DAX formulas are applied to the data model. For instance, Strike Rate can be calculated with a DAX formula that divides runs by balls faced and multiplies by 100, while Total Balls Faced can be a simple sum. Bowling Average, the ratio of runs conceded to wickets taken, is likewise defined as a DAX measure. Because measures are evaluated in the current filter context, they recalculate as report filters change, which supports informed decision-making about player performance.
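For readers who want to check the arithmetic outside Power BI, the same metrics can be written as plain Python functions. This is a sketch of the standard cricket definitions, not the DAX used in the report; the function names and sample numbers are illustrative, and returning None when the denominator is zero is a choice made for this example.

```python
def strike_rate(runs, balls_faced):
    """Batting strike rate: runs scored per 100 balls faced."""
    return runs * 100 / balls_faced if balls_faced else None

def total_balls_faced(balls_per_innings):
    """Total balls faced: a simple sum over innings."""
    return sum(balls_per_innings)

def bowling_average(runs_conceded, wickets):
    """Bowling average: runs conceded per wicket taken."""
    return runs_conceded / wickets if wickets else None

# Illustrative numbers only.
balls = total_balls_faced([30, 10, 25])
print(balls)
print(strike_rate(45, 30))
print(bowling_average(120, 6))
print(bowling_average(40, 0))
```

Guarding against a zero denominator matters in practice: a batsman who has not faced a ball, or a bowler without a wicket, has no defined strike rate or average.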
Data Visualization
The initial phase of the project involved identifying the top-performing openers based on key metrics such as strike rate, batting position, boundary percentage, and total runs. This comprehensive analysis aimed to highlight openers who not only scored runs consistently but also demonstrated an ability to accelerate the innings through a high strike rate and an efficient boundary-hitting capability. By evaluating these performance indicators, the project sought to identify the most impactful and effective openers in the context of the specified criteria, providing valuable insights for strategic decision-making in team composition and match scenarios.
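The kind of filtering described above can be sketched in Pandas. Everything specific here is assumed for illustration: the sample players and numbers, the threshold values, and the definition of boundary percentage as the share of runs scored in fours and sixes. The project's actual criteria were applied as filters in Power BI.

```python
import pandas as pd

# Hypothetical aggregated batting data, one row per player.
batting = pd.DataFrame({
    "player": ["Player A", "Player B", "Player C", "Player D"],
    "batting_position": [1, 2, 4, 1],
    "runs": [250, 180, 300, 60],
    "balls_faced": [170, 130, 200, 55],
    "boundary_runs": [150, 100, 170, 40],
})

# Derived metrics.
batting["strike_rate"] = batting["runs"] / batting["balls_faced"] * 100
batting["boundary_pct"] = batting["boundary_runs"] / batting["runs"] * 100

# Hypothetical thresholds for an opener shortlist.
criteria = (
    (batting["batting_position"] <= 2)
    & (batting["runs"] >= 150)
    & (batting["strike_rate"] >= 130)
    & (batting["boundary_pct"] >= 50)
)

openers = batting[criteria].sort_values("runs", ascending=False)
print(openers[["player", "runs", "strike_rate", "boundary_pct"]])
```

Combining the conditions into one boolean mask mirrors stacking several slicers on a dashboard: a player must satisfy all of them to appear in the shortlist.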
Openers Criteria:
Using the above parameters as filters, we created the following dashboard in Power BI, which shows the statistics of the openers.
Based on the provided batting statistics, Virat Kohli stands out as the most prolific batsman among the listed players. While other players like Jos Buttler and Alex Hales exhibit strong performances, Virat Kohli's outstanding batting average and well-rounded metrics position him as the standout and arguably the best batsman in this dataset. His ability to score consistently, coupled with an impressive strike rate, underscores his proficiency and impact in the realm of cricket batting.
Following the evaluation of top-performing openers, the subsequent phase focused on identifying middle-order batsmen based on a multifaceted analysis. Criteria included strike rate, batting position, boundary percentage, total runs, and a crucial factor: average balls faced. This comprehensive approach aimed to pinpoint batsmen who not only contributed runs effectively but also showcased resilience by facing a significant number of deliveries. The emphasis on average balls faced underscored the ability to anchor the innings and build partnerships in the middle order.
Middle-Order/Anchors Criteria:
Using the above parameters as filters, we created the following dashboard in Power BI, which shows the statistics of the middle-order batsmen.
Suryakumar Yadav stands out with a high batting average, significant runs, and an impressive strike rate. Glenn Phillips combines a solid average with a high number of balls faced, indicating a balance between aggression and stability. Daryl Mitchell and Marcus Stoinis contribute effectively, each showcasing strengths in different aspects of middle-order batting.
Finisher/Lower Order Criteria:
Curtis Campher and Glenn Maxwell emerge as standout performers with impressive strike rates of 164 and 162, respectively. A high strike rate indicates their ability to score runs quickly, making them dynamic and impactful players in the middle order or as finishers. Sikandar Raza leads in the total runs category, showing consistent contributions with the bat. He also hits boundaries efficiently, needing fewer balls faced per six and per four.
In summary, Curtis Campher and Glenn Maxwell excel in strike rate, indicating their quick-scoring prowess, while Sikandar Raza's proficiency in accumulating total runs and efficient boundary-hitting showcases his well-rounded batting skills in different match scenarios.
All Rounders Criteria:
Rashid Khan stands out as an effective all-rounder with both wicket-taking and impactful batting contributions. Shadab Khan and Sikandar Raza demonstrate a good balance between bowling and batting, contributing significantly in both aspects. Mitchell Santner's efficient bowling and lower batting average make him a valuable asset in different match situations. David Wiese's bowling economy and batting strike rate reflect his dual proficiency in both disciplines.
Fast Bowlers Criteria:
Anrich Nortje stands out in bowling efficiency, conceding fewer runs and boundaries while maintaining a high dot ball percentage of 55. This indicates his ability to build pressure on batsmen by restricting scoring opportunities. On the other hand, Tim Southee and Sam Curran, although conceding more runs, display a commendable performance with relatively high dot ball percentages of 50 and 49, respectively. Their knack for inducing dot balls showcases their capacity to control the game and create challenging situations for the opposition, contributing significantly to their effectiveness in T20 bowling scenarios.
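The bowling metrics discussed above follow simple formulas, sketched here in Python. The function names and sample numbers are illustrative and are not taken from the project's data; they only show how a dot ball percentage such as 55 arises.

```python
def dot_ball_percentage(dot_balls, balls_bowled):
    """Share of deliveries from which no run was scored, as a percentage."""
    return dot_balls * 100 / balls_bowled if balls_bowled else None

def economy_rate(runs_conceded, balls_bowled):
    """Runs conceded per six-ball over."""
    return runs_conceded * 6 / balls_bowled if balls_bowled else None

# Illustrative numbers: 66 dot balls and 150 runs conceded in 120 deliveries.
print(dot_ball_percentage(66, 120))
print(economy_rate(150, 120))
```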
Conclusion
Our ideal T20 World Cup team, drawn from these visualizations, would feature dynamic openers in Virat Kohli, Jos Buttler, and Alex Hales, ensuring a strong start. Suryakumar Yadav and Daryl Mitchell hold pivotal positions in the middle order, combining stability and aggression. Glenn Maxwell's finishing prowess adds a powerful dimension to our lineup. The all-round strength is bolstered by the duo of Rashid Khan and Shadab Khan, offering both bowling expertise and batting contributions. Anrich Nortje, Sam Curran, and Tim Southee emerge as our preferred fast bowlers, completing a formidable and well-balanced team for the fast-paced T20 format.
Thank you for taking the time to review my project! Please feel free to make suggestions or recommendations, and connect with me here on LinkedIn. I am always looking to learn more and improve my Data Analyst skills. I am currently looking for opportunities as a Data Engineer, Data Scientist or Data Analyst. If you know of any opportunities in your network, please reach out!