
Unearthing Insights: Python Data Analysis in the Mining Industry

Simran Pathak

Senior Operations Associate at Athelas+Commure | Healthcare

Published: April 11, 2023

Introduction:

This project involves analyzing real-world data from a flotation plant at Metals R' Us, a mining company that specializes in extracting iron from impurities. The main focus of the analysis is the "% Iron Concentrate" variable, which indicates the purity of the extracted iron. By using Python to analyze the data, this project aims to uncover hidden insights that can improve the extraction process, reduce costs, and increase efficiency. The project showcases the power of data analysis in optimizing traditional processes and highlights the importance of skilled data analysts in unlocking valuable insights from complex data sets.

The dataset for this project can be found here.

These are the business questions I set out to answer.

Was there an unnatural occurrence in the sample collection process on 6.1.2017?
Is there any correlation/relationship between the variables we found?
Is there a change in the Amina Flow throughout the day? Are there other variables that we can present on?
What is the difference between Iron Ore before flotation versus after?
Is there a month that had greater concentration in Iron? What about Silica?
Does Starch Flow affect the Concentration of Iron?

Key Insights:

According to the data studied, nothing unusual happened on June 1, 2017.
There is no association between the variables we selected, according to the charts produced with the seaborn library.
The values do change from hour to hour within the day, according to the line charts for Amina Flow, Iron Concentrate, Silica Concentrate, Ore Pulp pH, and Flotation Column 05 Level.
The best results came from the ore treated between May 13 and June 15.
The average iron content after purification is roughly 65. Silica, however, ranges from 2-3.
A starch flow of 4,500 or more reveals more impurities, producing iron of greater purity.

The Analysis:

1. Was there an unnatural occurrence in the sample collection process on 6.1.2017?

Let's start by finding out how long this data set spans by returning its earliest and latest dates. Next, let's filter the rows with a boolean mask and create a new dataframe, df_june:

df_june = df[(df['date'] > "2017-05-31 23:59:59") & (df['date'] < "2017-06-02")].reset_index(drop=True)
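The string comparisons in the mask above only behave chronologically if the date column holds real timestamps rather than text. As a minimal sketch, assuming the column is named date as in the line above and using a tiny made-up frame in place of the real data, the same filter could be written like this:

```python
import pandas as pd

# Tiny made-up frame standing in for the flotation data (values are illustrative only).
df = pd.DataFrame({
    "date": ["2017-05-31 23:00:00", "2017-06-01 00:00:00",
             "2017-06-01 13:00:00", "2017-06-02 00:00:00"],
    "% Iron Concentrate": [65.1, 64.8, 65.3, 66.0],
})

# Parse the text timestamps so comparisons are chronological, not alphabetical.
df["date"] = pd.to_datetime(df["date"])

# Keep only rows that fall on June 1, 2017 (start inclusive, end exclusive).
start = pd.Timestamp("2017-06-01")
end = pd.Timestamp("2017-06-02")
df_june = df[(df["date"] >= start) & (df["date"] < end)].reset_index(drop=True)

print(df_june)
```

Using an inclusive start and an exclusive end avoids having to spell out 23:59:59 on the previous day.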

That narrows down the rows, but we still have all the columns.

We'll create a variable holding a list of the important columns we want to focus on, and call it important_cols. Once this is done, we create a new dataframe called df_june_important by selecting only the important_cols columns from the older dataframe df_june.
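The column selection described above can be sketched as follows. The exact contents of important_cols are an assumption here, based on the variables named in the Key Insights, and the values are made up for illustration:

```python
import pandas as pd

# Made-up stand-in for df_june; column names are assumed, values are illustrative.
df_june = pd.DataFrame({
    "date": pd.to_datetime(["2017-06-01 00:00:00", "2017-06-01 01:00:00"]),
    "% Iron Concentrate": [64.8, 65.3],
    "% Silica Concentrate": [2.4, 2.1],
    "Ore Pulp pH": [9.8, 9.9],
    "Flotation Column 05 Level": [450.0, 455.0],
    "Amina Flow": [550.0, 560.0],
    "Starch Flow": [3000.0, 3100.0],  # present in df_june but not selected below
})

# Assumed list of the columns we want to focus on.
important_cols = [
    "date",
    "% Iron Concentrate",
    "% Silica Concentrate",
    "Ore Pulp pH",
    "Flotation Column 05 Level",
    "Amina Flow",
]

# Indexing with a list of names returns a dataframe with just those columns.
df_june_important = df_june[important_cols]

print(df_june_important.columns.tolist())
```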

2. Is there any correlation/relationship between the variables we found?

To answer this question, we import Seaborn as sns and request a pair plot, passing the dataframe as the argument.

After looking at these plots, there does not appear to be any correlation or association between these variables. This can be confirmed with a correlation matrix, in which all the correlation values are low.
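The correlation-matrix check mentioned above can be sketched with pandas alone. The data below is randomly generated and unrelated by construction, so it only illustrates the mechanics, not the real plant readings; the column names are assumptions:

```python
import numpy as np
import pandas as pd

# Randomly generated, independent columns (illustrative only, not plant data).
rng = np.random.default_rng(0)
df_june_important = pd.DataFrame({
    "% Iron Concentrate": rng.normal(65.0, 1.0, 500),
    "% Silica Concentrate": rng.normal(2.5, 0.5, 500),
    "Ore Pulp pH": rng.normal(9.8, 0.3, 500),
})

# Pearson correlation between every pair of numeric columns.
corr = df_june_important.select_dtypes(include="number").corr()

# Blank out the diagonal (each column correlates perfectly with itself)
# and find the strongest remaining relationship.
off_diagonal = corr.mask(np.eye(len(corr), dtype=bool))
strongest = off_diagonal.abs().max().max()

print(corr.round(2))
print("largest absolute off-diagonal correlation:", round(float(strongest), 3))
```

Values close to 0 indicate little linear relationship, while values near 1 or -1 indicate a strong one. The visual counterpart is seaborn's pairplot function called on the same dataframe.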

3. Is there a change in the Amina Flow throughout the day? Are there other variables that we can present on?

We can see from the line chart that the Amina Flow changes considerably throughout the day. Since the line chart was so helpful, we want to see how other variables change over the same timeframe as the Amina Flow.

Based on the information obtained with Python, we can see that Iron Concentrate, Silica Concentrate, Ore Pulp pH, and the Flotation Column 05 Level all change throughout the day.
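One way to put numbers behind the line charts is to average each variable by hour of the day. This is a minimal sketch with made-up readings; the column name Amina Flow follows the article, and the 20-minute spacing is only an assumption for the example:

```python
import pandas as pd

# Made-up readings: three per hour over three hours on June 1, 2017.
df_june = pd.DataFrame({
    "date": pd.date_range("2017-06-01 00:00", periods=9, freq="20min"),
    "Amina Flow": [550.0, 560.0, 570.0, 500.0, 510.0, 520.0, 600.0, 590.0, 610.0],
})

# Average the flow within each hour of the day.
hourly = df_june.groupby(df_june["date"].dt.hour)["Amina Flow"].mean()
hourly.index.name = "hour"

print(hourly)
```

The same grouping works for any of the other columns, which makes it easy to compare how each one moves over the day.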

4. What is the difference between Iron Ore before flotation versus after?

The iron is first measured as it is fed into the processing equipment. Because the iron is tested only hourly, the data repeats the same value multiple times.

Although this is the information I need, it lacks sufficient context. To better illustrate where the iron stood at the beginning and at the end, I created a bar graph using the date and the difference.

The bar chart shows the "Difference", that is, the gap between the initial ore (% Iron Ore) and the final ore (% Iron Concentration). Between May 13 and June 15 there was barely any difference, which makes me think that this particular ore was of higher quality. It would be wise to go back and look at where this ore was found.
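The "Difference" behind the bar chart can be sketched as a simple column subtraction followed by a daily average. The column names % Iron Feed and % Iron Concentrate are assumptions about how the dataset labels the before and after measurements, and the numbers are made up:

```python
import pandas as pd

# Made-up before/after iron readings for two days (illustrative only).
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2017-05-13 01:00", "2017-05-13 02:00",
        "2017-05-14 01:00", "2017-05-14 02:00",
    ]),
    "% Iron Feed": [60.0, 62.0, 55.0, 57.0],         # assumed name: iron before flotation
    "% Iron Concentrate": [65.0, 67.0, 64.0, 66.0],  # assumed name: iron after flotation
})

# Gain in iron content from feed to concentrate, row by row.
df["Difference"] = df["% Iron Concentrate"] - df["% Iron Feed"]

# One value per calendar day, suitable for a bar chart over dates.
daily_diff = df.groupby(df["date"].dt.date)["Difference"].mean()

print(daily_diff)
```

Averaging per day also collapses the repeated hourly values into a single bar for each date.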

5. Is there a month that had greater concentration in Iron? What about Silica?

First I check exactly which time frame I have available. Looking up the earliest and latest dates shows that the data runs from April 10, 2017 to September 9, 2017.
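The original command is not preserved here. A minimal sketch of one way to find the span, assuming a parsed date column and using made-up rows, would be:

```python
import pandas as pd

# Made-up rows covering the date range reported in the article.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2017-04-10 00:00:00", "2017-06-01 12:00:00", "2017-09-09 23:00:00",
    ]),
})

# Earliest and latest timestamps in the data set.
earliest = df["date"].min()
latest = df["date"].max()

print(earliest, latest)
```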

I now have the time period over which to investigate the iron and silica concentrations. I filter the data further and concentrate on the three columns that matter here: the date, the percentage of iron concentrate, and the percentage of silica concentrate.

I use this data to have Seaborn produce visualizations. The first graph shows the "% Iron Concentration" values over the time period the dataset covers. I can see that the iron is, on average, about 65% at the end of the purification process.

The second graph shows the "% Silica Concentration" at the end of the iron's purification process. During this time, the silica stays between 2% and 3%.
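To compare months directly, the two concentrations can be averaged by calendar month. This is a sketch under assumed column names, with made-up values that do not reflect the real results:

```python
import pandas as pd

# Made-up readings across two months (illustrative only).
df = pd.DataFrame({
    "date": pd.to_datetime(["2017-05-01", "2017-05-15", "2017-06-01", "2017-06-15"]),
    "% Iron Concentrate": [64.0, 66.0, 65.0, 67.0],
    "% Silica Concentrate": [3.0, 2.0, 2.5, 1.5],
})

# Mean iron and silica concentrate for each calendar month.
monthly = df.groupby(df["date"].dt.month)[
    ["% Iron Concentrate", "% Silica Concentrate"]
].mean()
monthly.index.name = "month"

# Month number with the highest average iron concentrate.
best_iron_month = monthly["% Iron Concentrate"].idxmax()

print(monthly)
print("month with highest average iron:", best_iron_month)
```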

6. Does Starch Flow affect the Concentration of Iron?

Flotation is a process that enables the separation of valuable minerals from waste rock. The minerals float to the surface, carried up by gas bubbles. Using depressants is another method of mineral separation, and starch is a common depressant. Starch is added after the iron has been magnetically separated to help keep the iron clean. Since silica is regarded as an impurity in this dataset, it is the variable we must examine.

In the graph below I concentrated on the starch flow, since it might have had an impact on the final percentage of silica concentrate. It seems that when the starch flow is 4,500 or greater, the percentage of silica concentrate is also higher, indicating that more contaminants are being found.
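The 4,500 threshold comparison can be sketched by splitting the rows into two groups and averaging the silica in each. The column names are assumptions and the values are made up, so the output only shows the mechanics of the comparison:

```python
import pandas as pd

# Made-up starch flow and silica readings (illustrative only).
df = pd.DataFrame({
    "Starch Flow": [2000.0, 3500.0, 4500.0, 5200.0],
    "% Silica Concentrate": [1.8, 2.0, 2.6, 3.0],
})

# Label each row by whether the starch flow reaches the 4,500 threshold.
starch_group = (df["Starch Flow"] >= 4500).map({True: ">= 4500", False: "< 4500"})

# Average silica concentrate in each group.
silica_by_group = df.groupby(starch_group)["% Silica Concentrate"].mean()

print(silica_by_group)
```

A gap between the two averages on the real data would support the pattern seen in the graph, although it would not by itself show that starch flow is the cause.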

Conclusion:

I enjoyed working on this project because the dataset was unfamiliar and I had to do some research on the topic. I am trying to expand my data skills, so if you have any suggestions or feedback, feel free to reach out to me. I am looking for a career in data, so if you have any opportunities, I'd be open to exploring them.

Feel free to connect with me on LinkedIn and be on the lookout for future data projects from me!
