Wine Quality Dataset

Robert Paul

Data Science Enthusiast with a Passion for Machine Learning

Published: January 20, 2022

(Courtesy: M Yasser H of Kaggle (https://www.kaggle.com/yasserh))

Description:

This dataset covers red variants of the Portuguese "Vinho Verde" wine. It describes the amounts of various chemicals present in each wine and their effect on its quality. The dataset can be treated as a classification or a regression task. The classes are ordered and not balanced (e.g. there are many more normal wines than excellent or poor ones). Your task is to predict the quality of wine using the given data.

A simple yet challenging project, to anticipate the quality of wine.

The complexity arises from the fact that the dataset has few samples & is highly imbalanced.

Can you overcome these obstacles & build a good predictive model to classify them?

This data frame contains the following columns:

Input variables (based on physicochemical tests):

1 - fixed acidity

2 - volatile acidity

3 - citric acid

4 - residual sugar

5 - chlorides

6 - free sulfur dioxide

7 - total sulfur dioxide

8 - density

9 - pH

10 - sulphates

11 - alcohol

Output variable (based on sensory data):

12 - quality (score between 0 and 10)
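The class imbalance mentioned in the description can be checked by counting how often each quality score appears. The following is a minimal sketch using only the Python standard library. The inline sample is made up for illustration and is not taken from the real data, and the `quality` column header is an assumption about how the CSV file is laid out.

```python
import csv
import io
from collections import Counter

# Tiny made-up sample in an assumed layout; NOT real rows from the dataset.
SAMPLE_CSV = """alcohol,quality
9.4,5
9.8,5
10.5,6
11.2,6
10.0,5
12.8,7
"""


def quality_counts(csv_file):
    """Count rows per quality score from an open CSV file object."""
    reader = csv.DictReader(csv_file)
    return Counter(int(row["quality"]) for row in reader)


counts = quality_counts(io.StringIO(SAMPLE_CSV))
total = sum(counts.values())

# Print each score with its share of the rows, lowest score first.
for score in sorted(counts):
    print(f"quality {score}: {counts[score]} rows ({counts[score] / total:.0%})")
```

To run this on the actual file, pass an opened file handle instead of the `io.StringIO` wrapper, for example `quality_counts(open(path, newline=""))` with `path` pointing at the downloaded CSV.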

Acknowledgements:

This dataset is also available from Kaggle & the UCI Machine Learning Repository: https://archive.ics.uci.edu/ml/datasets/wine+quality.

Objective:

Understand the Dataset & cleanup (if required).
Build classification models to predict the wine quality.
Also fine-tune the hyperparameters & compare the evaluation metrics of various classification algorithms.

Observations:

To understand this dataset, I first took a quick look at the first few rows of the table. I then produced a jittered scatter plot as well as a correlation circle plot for the dataset.

Once that was taken care of, I experimented with several models to see which would give me the best fit. Those models are listed below:
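Because quality is an integer score, many points share the same value on that axis, which is why jitter helps in a scatter plot. Below is a rough sketch of the idea with NumPy. The values are made up for illustration, the `jitter` helper is hypothetical, and a plain Pearson correlation is used here as a simpler stand-in for the correlation circle plot.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up illustrative values, NOT rows from the real dataset.
alcohol = np.array([9.4, 9.8, 10.5, 11.2, 12.0, 12.8])
quality = np.array([5, 5, 6, 6, 7, 7])


def jitter(values, width=0.2):
    """Add small uniform noise so identical integer scores do not overlap."""
    return values + rng.uniform(-width, width, size=len(values))


# Jittered copy of the scores, suitable for the y-axis of a scatter plot.
quality_jittered = jitter(quality)

# Pearson correlation between one input variable and the quality score.
corr = np.corrcoef(alcohol, quality)[0, 1]
print(f"correlation(alcohol, quality) = {corr:.3f}")
```

The jittered values are only for display. Any correlation or model fitting should use the original, unjittered scores.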

k-Nearest Neighbors (KNN)
Partial Least Squares Regression (PLS)
Gaussian Fit Linear Regression (GAUSSIAN)
Original Linear Model (LM)
Generalized Linear Model (GLM)
Elasticnet (ENET)
Cubist
Random Forest (RF)

Conclusion:

Overall, based on the table above, the RF model has the lowest MAE & RMSE, as well as the highest R-Squared. For further detail on this model, please visit my website at www.github.com/pc1991/Wine. I am looking forward to seeing you there. Thank you very much for reading. Take care.
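For readers unfamiliar with the three metrics, here is a small sketch of how they are commonly defined, written in plain Python. The function names and the example numbers are made up for illustration and are not results from this analysis.

```python
import math


def mae(y_true, y_pred):
    """Mean absolute error: average size of the prediction errors."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error: penalizes large errors more than MAE."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Made-up scores for illustration only.
actual = [3, 5, 7]
predicted = [2, 5, 9]

print(mae(actual, predicted), rmse(actual, predicted), r_squared(actual, predicted))
```

Lower MAE and RMSE mean smaller errors, while a higher R-Squared means more of the variation in quality is explained, which is why a model that wins on all three is a reasonable pick.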
