Processing Plant Data with Python
Jon Ekroth
Data Analyst · Excel · Tableau · SQL · Data Visualization · Problem Solving · Troubleshooting
In this project I play the role of a data analyst recently "hired" by a manufacturing/engineering/science company. More specifically, I've been hired by a mining company called Metals R' Us and given data from their froth flotation processing plant. The main goal of this analysis is to investigate a possible issue that occurred on June 1, 2017; the plant manager wants to know whether there is a problem that needs to be addressed. First, let's get an idea of what the froth flotation process encompasses.
The Froth Flotation Process (Wiki Explanation)
The froth flotation process is widely used in mineral processing. It separates the desired material, or concentrate, from unwanted material by bubbling air or nitrogen through large water-filled tanks, floating the concentrate to the surface, as seen in Diagram 1. The pulp is the mixture of water and ore that is brought in to be processed. The process is important because it makes it possible to extract desired metals from large amounts of lower-grade material. It is also used in wastewater treatment plants, where water is separated from solids or oils.
The Data
The data used for this analysis is real: it was taken from Kaggle, where it is used to predict quality in the froth flotation process. The dataset covers March 2017 through September 2017. Sampling rates are uneven across columns, as some readings are taken every 20 seconds and others every hour. There are 24 columns and 737,453 rows in this dataset.
Three libraries need to be loaded into Python: Pandas, Seaborn, and Matplotlib. Pandas is used for data upload and manipulation, while Seaborn and Matplotlib are used for data visualization.
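For reference, the imports look like this:

```python
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
```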
To get an idea of what the dataset contains, df.head() and df.shape are used to preview the first five rows of data and the number of rows and columns, respectively.
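A minimal sketch of that first look (the filename is taken from the read_csv call shown later in the article):

```python
df = pd.read_csv('MiningProcess_Flotation_Plant_Database.csv')

print(df.head())   # preview the first five rows
print(df.shape)    # (rows, columns); the article reports (737453, 24)
```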
The dates in this dataset are stored as text, so they had to be converted to a datetime column before they could be aggregated. The Python code used for this task is df['date'] = pd.to_datetime(df['date']). The data dictionary shown below describes the columns used in the dataset.
Another issue with the dataset is that commas were used as the decimal separator in the numeric data. To make the numbers parse consistently, I told Pandas to treat those commas as decimal points when reading the file. The code I used to address this is df = pd.read_csv('MiningProcess_Flotation_Plant_Database.csv', decimal=",").
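Putting the loading and cleanup steps together, based on the snippets above:

```python
# Read the CSV, telling Pandas that ',' is the decimal separator
df = pd.read_csv('MiningProcess_Flotation_Plant_Database.csv', decimal=',')

# Convert the text dates into a proper datetime column for aggregation
df['date'] = pd.to_datetime(df['date'])
```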
Getting the Results
To get a handle on some statistics of the data, I used df.describe() to find the mean, max, min, and other information about the different columns.
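For example:

```python
# Summary statistics (count, mean, std, min, quartiles, max) per numeric column
df.describe()
```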
I'm going to filter my data to the month of June by creating a new dataframe called df_june; this is done to speed up any searches that need to be made. I only want to concentrate on a few columns, so I will create a new variable called important_cols. I then created a new dataframe, df_june_important, and set it equal to the older dataframe (df_june) restricted to the columns in important_cols, as sketched below.
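A sketch of that filtering step. The exact column names ('% Iron Concentrate', '% Silica Concentrate', 'Ore Pulp pH', 'Flotation Column 05 Level') and the date-range comparison are assumptions based on the columns named later in the article:

```python
# Keep only June 2017 rows
df_june = df[(df['date'] >= '2017-06-01') & (df['date'] < '2017-07-01')]

# The handful of columns this analysis concentrates on
important_cols = ['date', '% Iron Concentrate', '% Silica Concentrate',
                  'Ore Pulp pH', 'Flotation Column 05 Level']

df_june_important = df_june[important_cols]
```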
The result of the code is shown below for June 1st.
Next, I called on the Seaborn library to simultaneously compare % Iron Concentrate, % Silica Concentrate, Ore Pulp pH, and Flotation Column 05 Level. There were some suspicions that Column 05 might be having issues.
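A pairplot is one way Seaborn can compare several columns simultaneously; this is a sketch, since the article does not show which Seaborn function was used:

```python
# Pairwise scatter plots (with distributions on the diagonal)
# for the four columns under suspicion
sns.pairplot(df_june_important[['% Iron Concentrate', '% Silica Concentrate',
                                'Ore Pulp pH', 'Flotation Column 05 Level']])
plt.show()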
There doesn't appear to be any correlation between these columns. Even so, this is still valuable information to keep on hand for future reference. To be sure I am reading the above data correctly, I can run the .corr() method on the df_june_important dataframe; its output makes it easy to see that the correlation values are very low.
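For example:

```python
# Correlation matrix of the four measures (drop the date column first)
df_june_important.drop(columns='date').corr()
```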
It can also be useful to view the same information in a line chart. Seaborn will be used again to create this graph. Separate plots had to be used because the units of measure are too different to share a single axis.
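One way to draw those line charts, as a sketch: one subplot per measure so each series keeps its own y-axis scale, using the same assumed column names as above.

```python
cols = ['% Iron Concentrate', '% Silica Concentrate',
        'Ore Pulp pH', 'Flotation Column 05 Level']

# One subplot per column, sharing the time axis
fig, axes = plt.subplots(len(cols), 1, figsize=(10, 12), sharex=True)
for ax, col in zip(axes, cols):
    sns.lineplot(data=df_june_important, x='date', y=col, ax=ax)
    ax.set_title(col)
plt.tight_layout()
plt.show()
```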
Conclusion
There does not appear to be anything troubling happening on June 1, as all readings are running within normal ranges. The one outlier that needs to be researched further is the large drop in Flotation Column 05 Level, shown above, which fell to a reading of 167.36. Not all data analysis is going to produce obvious results, but even when it doesn't, the data can be useful for future comparisons. Python and its libraries can produce charts fairly easily to show the information you are trying to analyze and can help keep businesses running smoothly.
Thank you for taking the time to read my analysis. Feel free to reach out if you have any questions or would like to talk analytics.