Feature Importance and Feature Selection - Framework


Feature Importance and Feature Selection

• Feature selection is useful in its own right, but it mostly acts as a filter, muting out features that add nothing beyond the ones you already have.

OR

• Feature selection is the process of selecting a subset of relevant features for use in model construction.

------------------------------

• Correlation-based Feature Selection (CFS): CFS selects features based on their correlation with the target variable and their correlation with each other. It measures the subset's ability to predict the target variable while also avoiding the inclusion of redundant features.
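scikit-learn has no built-in CFS, so here is a minimal sketch of the underlying idea, assuming the built-in breast-cancer dataset and an illustrative 0.9 redundancy cut-off rather than the exact CFS merit score:

from sklearn.datasets import load_breast_cancer

data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

# Relevance: absolute Pearson correlation of each feature with the target.
relevance = X.corrwith(y).abs().sort_values(ascending=False)

# Redundancy: skip a feature if it is highly correlated with one already kept.
# The 0.9 cut-off is an illustrative choice, not part of the CFS definition.
selected = []
for feature in relevance.index:
    if all(abs(X[feature].corr(X[kept])) <= 0.9 for kept in selected):
        selected.append(feature)

print(selected[:10])  # the strongest non-redundant features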

• Recursive Feature Elimination (RFE): RFE is a popular feature selection method that recursively eliminates features based on their importance to the model's performance. It uses a model to evaluate feature importance and removes the least important feature iteratively until the desired number of features is reached.
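A minimal RFE sketch with scikit-learn, assuming a logistic-regression ranker and an illustrative target of 10 features:

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# The estimator supplies the importance scores (here, coefficient magnitudes).
rfe = RFE(estimator=LogisticRegression(max_iter=5000), n_features_to_select=10)
rfe.fit(X, y)

print(rfe.support_)   # boolean mask of the kept features
print(rfe.ranking_)   # 1 = selected; higher numbers were eliminated earlier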

• Tree-based Feature Importance: Tree-based feature importance is a widely used method for measuring feature importance in decision tree-based algorithms. It scores each feature by how much splitting on it reduces impurity (for example, Gini impurity or entropy), averaged across the trees.
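A minimal sketch using a random forest's impurity-based importances, assuming the built-in breast-cancer dataset:

import pandas as pd
from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier

data = load_breast_cancer(as_frame=True)
X, y = data.data, data.target

# Impurity-based importances are averaged over the trees and sum to 1.
forest = RandomForestClassifier(n_estimators=200, random_state=0).fit(X, y)
importances = pd.Series(forest.feature_importances_, index=X.columns)
print(importances.sort_values(ascending=False).head(10))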

• Mutual Information-based Feature Selection: Mutual information-based feature selection evaluates the mutual information between each feature and the target variable. It selects features with high mutual information, which implies that they contain information that is relevant to predicting the target variable.
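A minimal sketch using scikit-learn's SelectKBest with mutual_info_classif; k=10 and the dataset are illustrative choices:

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectKBest, mutual_info_classif

X, y = load_breast_cancer(return_X_y=True)

# Score every feature against the target and keep the k highest-scoring ones.
selector = SelectKBest(score_func=mutual_info_classif, k=10)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)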

• Lasso Regularization: Lasso regularization is a linear model regularization technique that shrinks the coefficients of less important features to zero. Features whose coefficients are driven to exactly zero are effectively dropped, leaving a sparse model with only the most important features.
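A minimal sketch combining Lasso with scikit-learn's SelectFromModel; the alpha value is an illustrative choice you would normally tune, e.g. by cross-validation:

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectFromModel
from sklearn.linear_model import Lasso
from sklearn.preprocessing import StandardScaler

X, y = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # Lasso penalises all coefficients equally, so scale first

# alpha controls sparsity: larger values zero out more coefficients.
selector = SelectFromModel(Lasso(alpha=0.01)).fit(X_scaled, y)
print(selector.get_support().sum(), "of", X.shape[1], "features kept")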

• Principal Component Analysis (PCA): PCA is a dimensionality reduction technique that transforms high-dimensional data into a lower-dimensional space while preserving most of the variance in the original data. It reduces the feature set by identifying the principal components that explain most of the variability in the data. Strictly speaking, PCA constructs new features (linear combinations of the originals) rather than selecting a subset of them.
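A minimal PCA sketch that keeps enough components to explain 95% of the variance; the threshold and dataset are illustrative:

from sklearn.datasets import load_breast_cancer
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler

X, _ = load_breast_cancer(return_X_y=True)
X_scaled = StandardScaler().fit_transform(X)  # PCA is scale-sensitive

# n_components as a float keeps the smallest number of components
# whose cumulative explained variance reaches that fraction.
pca = PCA(n_components=0.95)
X_reduced = pca.fit_transform(X_scaled)

print(X.shape, "->", X_reduced.shape)
print(pca.explained_variance_ratio_[:3])  # variance share of the top components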



I will share more later; follow for more details.


You can also follow this framework:

• Feature Importance Analysis: Analyse the importance of each feature in your dataset using one or more of the following methods (a short univariate example follows this list):

• Correlation Analysis

• Univariate Feature Selection

• Model-Based Feature Selection

• Recursive Feature Elimination
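
As a concrete starting point, here is a minimal univariate-selection sketch using the ANOVA F-test; the percentile and dataset are illustrative choices:

from sklearn.datasets import load_breast_cancer
from sklearn.feature_selection import SelectPercentile, f_classif

X, y = load_breast_cancer(return_X_y=True)

# Score each feature independently against the target and keep the top half.
selector = SelectPercentile(score_func=f_classif, percentile=50)
X_reduced = selector.fit_transform(X, y)

print(X.shape, "->", X_reduced.shape)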

• Feature Selection: Once you have identified the most important features, you can use one or more of the following techniques to select the final set of features (see the sketch after this list):

• Filter Methods

• Wrapper Methods

• Embedded Methods
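
A minimal sketch showing one representative of each family on the same data; all estimators and hyperparameters are illustrative choices, not prescriptions:

from sklearn.datasets import load_breast_cancer
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import RFE, SelectFromModel, SelectKBest, f_classif
from sklearn.linear_model import LogisticRegression

X, y = load_breast_cancer(return_X_y=True)

# Filter: score features independently of any downstream model.
filter_sel = SelectKBest(f_classif, k=10).fit(X, y)
# Wrapper: search feature subsets using a model's performance signal.
wrapper_sel = RFE(LogisticRegression(max_iter=5000), n_features_to_select=10).fit(X, y)
# Embedded: selection happens inside model training itself.
embedded_sel = SelectFromModel(RandomForestClassifier(random_state=0)).fit(X, y)

for name, sel in [("filter", filter_sel), ("wrapper", wrapper_sel), ("embedded", embedded_sel)]:
    print(name, sel.get_support().sum(), "features kept")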
