Beneath the Surface: A Python-Based Mining Data Analysis
Have you ever wondered what happens beneath the surface of a mining operation? Just like a detective piecing together clues, I found myself diving into a world of data to uncover the mysteries of a flotation plant. As a data analyst, I was tasked with a project that revolved around June 1, 2017—a day marked for unusual activity. The stakes were high, and I was curious to see what secrets the data would reveal.
Why THIS Project?
The inspiration for this project came from the critical need to understand the operations of the flotation plant on that particular day. The mining company flagged June 1 due to unexpected fluctuations, and they wanted to know whether operational inefficiencies, equipment issues, or other factors might have contributed to this moment. My background in data analysis and passion for solving complex issues made this project not just a job, but a chance to contribute to the mining industry’s efficiency.
What Readers Will Gain
In this article, I’ll share insights from my analysis, discussing the findings and the journey I took to reach them. You’ll learn about the data I used, the analysis process, and the surprising results I encountered.
Key Takeaways
Dataset Details
For my analysis, I utilized a dataset sourced from Kaggle, which contained a staggering 737,454 rows and 24 variables spanning from March to September 2017. This dataset was perfect for the project as it provided minute-by-minute and hourly data, giving me the granularity needed to uncover the nuances of flotation plant operations.
Library Installation
I used Deepnote, a browser-based Integrated Development Environment (IDE), to explore and visualize the data by writing Python code for analysis and visualization. Before starting, I installed and imported three libraries: Pandas for data manipulation, and Seaborn and Matplotlib for creating visualizations.
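That setup amounts to only a few lines. A minimal sketch (the pip line is needed only on first run in a Deepnote or Jupyter cell):

```python
# One-time setup in a Deepnote/Jupyter cell (uncomment on first run):
# !pip install pandas seaborn matplotlib

import pandas as pd               # data loading and manipulation
import seaborn as sns             # statistical visualizations
import matplotlib.pyplot as plt   # underlying plotting library
```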
Next, I connected to the dataset in Python using Pandas' DataFrame structure and reviewed a preview of the data.
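Loading the CSV into a DataFrame looks roughly like this. The file name is my assumption for the Kaggle download, and the inline sample stands in for the real 737,454-row file so the snippet is self-contained:

```python
import io
import pandas as pd

# Stand-in for the Kaggle CSV; with the real file you would instead call
# pd.read_csv("MiningProcess_Flotation_Plant_Database.csv")  # assumed file name
sample = io.StringIO(
    "date,% Iron Concentrate,% Silica Concentrate\n"
    '2017-06-01 00:00:00,"66,91","1,31"\n'
    '2017-06-01 01:00:00,"67,02","1,25"\n'
)
df = pd.read_csv(sample)
print(df.head())   # preview the first rows
print(df.shape)    # the real dataset is (737454, 24)
```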
Analysis Process
The analysis began with data cleaning, which was crucial for ensuring accuracy. I noticed that some decimal values were incorrectly formatted with commas instead of decimal points, so I replaced the commas with points.
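A minimal sketch of that comma fix, using a toy frame that reproduces the formatting problem (column names follow the Kaggle dataset):

```python
import pandas as pd

# Toy frame reproducing the comma-decimal formatting of the raw export
df = pd.DataFrame({
    "% Iron Concentrate": ["66,91", "67,02"],
    "% Silica Concentrate": ["1,31", "1,25"],
})

# Swap the comma for a decimal point in each affected column
for col in ["% Iron Concentrate", "% Silica Concentrate"]:
    df[col] = df[col].str.replace(",", ".", regex=False)

print(df)
```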
Additionally, I had to redefine the data types of certain columns to ensure they were in the correct format for analysis.
The date column was imported as a string. To fix that, I redefined the column as a datetime using pandas' to_datetime() function.
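Both type conversions can be sketched like this; the values are illustrative:

```python
import pandas as pd

df = pd.DataFrame({
    "date": ["2017-06-01 00:00:00", "2017-06-01 01:00:00"],
    "% Iron Concentrate": ["66.91", "67.02"],   # still strings after the comma fix
})

# Cast the cleaned numeric strings to floats
df["% Iron Concentrate"] = df["% Iron Concentrate"].astype(float)

# Convert the string date column to proper datetimes
df["date"] = pd.to_datetime(df["date"])

print(df.dtypes)
```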
Once the data was clean, I created summary statistics that provided key insights into average, median, minimum, and maximum values. I focused particularly on the first week of June, filtering the data to home in on the variables of interest: date, % Iron Concentrate, % Silica Concentrate, ore pulp pH, and flotation column levels.
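pandas produces those summary statistics in a single call; a sketch with toy values standing in for the cleaned dataset:

```python
import pandas as pd

# Toy values standing in for the cleaned dataset
df = pd.DataFrame({
    "% Iron Concentrate": [66.91, 67.02, 65.80],
    "% Silica Concentrate": [1.31, 1.25, 2.10],
})

# count, mean, std, min, quartiles (50% is the median), and max per column
stats = df.describe()
print(stats)
```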
Were there any anomalies on June 1, 2017?
Management indicated that something unusual occurred on June 1, 2017, and requested an investigation. To begin, I filtered the data for the first week of June and created a new data frame, df_june.
I then created a new DataFrame, df_june_important, by selecting values from the original df_june. This allowed df_june_important to focus on the five key columns from the first week of June.
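The two filtering steps might look like this; the toy frame mimics the dataset's schema, with one extra column included to show the narrowing:

```python
import pandas as pd

# Toy frame mimicking the cleaned dataset (column names follow the Kaggle file)
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-05-31 23:00:00", "2017-06-01 00:00:00",
                            "2017-06-03 12:00:00", "2017-06-08 00:00:00"]),
    "% Iron Concentrate": [66.9, 67.0, 65.8, 66.2],
    "% Silica Concentrate": [1.3, 1.2, 2.1, 1.5],
    "Ore Pulp pH": [9.9, 10.0, 10.1, 9.8],
    "Flotation Column 05 Level": [500.0, 480.0, 470.0, 490.0],
    "Starch Flow": [3000.0, 3100.0, 2900.0, 3050.0],  # dropped in the next step
})

# Keep only the first week of June 2017
mask = (df["date"] >= "2017-06-01") & (df["date"] < "2017-06-08")
df_june = df.loc[mask]

# Narrow df_june down to the five key columns
key_cols = ["date", "% Iron Concentrate", "% Silica Concentrate",
            "Ore Pulp pH", "Flotation Column 05 Level"]
df_june_important = df_june[key_cols]
print(df_june_important)
```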
Using Python’s Seaborn library, I visualized the data to explore potential relationships between the variables.
It was surprising to see that the relationship between the key variables was weak, indicating that other factors might be influencing the concentrations.
To validate this, I generated a correlation matrix, which confirmed the picture: low correlation values across the board, again pointing to weak relationships between the variables.
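A sketch of the correlation matrix, rendered as a Seaborn heatmap over the same kind of stand-in data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; not needed inside Deepnote
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

# Stand-in for df_june_important
df_june_important = pd.DataFrame({
    "% Iron Concentrate": [66.9, 67.0, 65.8, 66.2, 66.5],
    "% Silica Concentrate": [1.3, 1.2, 2.1, 1.5, 1.4],
    "Ore Pulp pH": [9.9, 10.0, 10.1, 9.8, 10.0],
    "Flotation Column 05 Level": [500.0, 480.0, 470.0, 490.0, 485.0],
})

# Pearson correlation between every pair of numeric columns
corr = df_june_important.corr()
print(corr.round(2))

# A heatmap makes weak (near-zero) relationships easy to spot
sns.heatmap(corr, annot=True, vmin=-1, vmax=1, cmap="coolwarm")
plt.tight_layout()
plt.savefig("correlation_matrix.png")
```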
Fluctuations in % Iron Concentrate, % Silica Concentrate, and Flotation Column 05 Level
Management wanted to understand how % concentration changes throughout the day, as previous insights had raised more questions. I used Seaborn to create a line plot to visualize these daily fluctuations in concentration levels.
The line plot for % Iron Concentrate showed fluctuations, particularly around 11 a.m. This was fascinating to observe, as it prompted questions about what operational changes were occurring at that time.
The line plot for % Silica Concentrate showed multiple fluctuations, particularly around 5 a.m., 11 a.m., and 6 p.m.
Similarly, a pronounced drop in the Flotation Column 05 Level around 3 p.m. raised eyebrows.
Ore Pulp pH Level Histogram
In examining the ore pulp pH levels, I created histograms that showed a high frequency of values between 9.9 and 10.1, which fell within acceptable limits. This consistency was a relief to see, as it indicated a stable process.
Main Takeaways
This project reinforced several key points in my data analysis journey.
Conclusion and Personal Reflections
Reflecting on this project, I learned the importance of thorough data cleaning and the value of visualizing data to uncover trends. While I faced challenges, such as the initial formatting issues, I found solutions through patience and experimentation. This project has shaped my perspective on data analysis, highlighting the complexities of operational data in the mining industry.
I’m excited about the future and how I can apply these insights to improve processes in various industries.
Call To Action
I would love to hear your thoughts on this analysis! Connect with me on LinkedIn, and if you're looking to hire a data analyst or have questions about my project, let’s chat. Leave a comment below with your insights or queries!