A Complete Guide to ICE Plots

ICE (Individual Conditional Expectation) plots are used to visualize the effect of a single feature on the predicted outcome of a machine learning model, holding all other features constant. Unlike aggregate summaries, they show how changes in that feature affect the model’s predictions for each individual instance.

Detailed Mathematics Behind ICE (Individual Conditional Expectation) Plots

Model Setup and Context

Suppose a model f̂ has been trained on a dataset of n observations. Split each observation's feature vector into the feature of interest, x_j, and the remaining (complement) features, x_C, so that observation i is written as (x_j^(i), x_C^(i)) and its prediction is f̂(x_j^(i), x_C^(i)).

Formulation

The ICE curve for observation i is the function obtained by varying the feature of interest while holding that observation's other feature values fixed:

f̂_ICE^(i)(x_j) = f̂(x_j, x_C^(i))

One curve is computed per observation, so an ICE plot for a dataset of n observations contains n curves (or a random sample of them).

Step-by-Step Construction of ICE Plot

  1. Choose the feature of interest x_j and build a grid of values x_j^1, …, x_j^k spanning its observed range (e.g., k evenly spaced points or empirical quantiles).
  2. For each observation i, create k copies of its feature vector, replacing the value of x_j with each grid value in turn and leaving x_C^(i) unchanged.
  3. Pass the modified feature vectors through the model to obtain k predictions per observation.
  4. Plot the k predictions against the grid values, one line per observation.
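
Below is a minimal from-scratch sketch of these steps. It assumes a fitted scikit-learn-style regressor named model and a pandas DataFrame X of features (the Implementation section below fits such a model on the California Housing data); compute_ice is a hypothetical helper name, not a library function.

import numpy as np
import matplotlib.pyplot as plt

def compute_ice(model, X, feature, num_grid_points=50):
    """Return the grid and an (n_instances x n_grid) matrix of ICE curves."""
    # Step 1: build a grid over the observed range of the feature
    grid = np.linspace(X[feature].min(), X[feature].max(), num_grid_points)
    curves = np.empty((len(X), num_grid_points))
    for k, value in enumerate(grid):
        # Step 2: copy the data and overwrite the feature of interest
        X_mod = X.copy()
        X_mod[feature] = value
        # Step 3: predict with all other features held at their observed values
        curves[:, k] = model.predict(X_mod)
    return grid, curves

# Step 4: plot one line per observation
grid, ice_curves = compute_ice(model, X, 'MedInc')
plt.plot(grid, ice_curves.T, color='steelblue', linewidth=0.5, alpha=0.4)
plt.xlabel('MedInc')
plt.ylabel('Prediction')
plt.show()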

Interpretation of Curves

Each individual ICE curve shows how the model responds to changes in x_j for one specific observation. Comparing ICE curves can reveal patterns such as:

  • Monotonic Trends: If most ICE curves show a consistent increasing or decreasing trend, the model may have a global relationship with x_j that is monotonic.
  • Heterogeneous Responses: If ICE curves vary widely, the model may be capturing interactions or non-linearities that depend on other features in the dataset.
  • Flat Curves: If the ICE curve for an observation is flat, it means that the model prediction for that observation is not sensitive to changes in x_j.

Link to Partial Dependence

The Partial Dependence Plot (PDP) can be considered the average of all ICE curves: at each grid value of x_j, the PDP is the mean of the n individual predictions,

PDP(x_j) = (1/n) Σ_{i=1}^{n} f̂(x_j, x_C^(i))
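
In code, using the grid and ice_curves arrays from the hypothetical compute_ice sketch above, the PDP falls out as a column-wise mean of the ICE matrix:

import matplotlib.pyplot as plt

pdp = ice_curves.mean(axis=0)  # average over observations at each grid value
plt.plot(grid, ice_curves.T, color='steelblue', linewidth=0.5, alpha=0.3)
plt.plot(grid, pdp, color='black', linewidth=2, label='PDP (mean of ICE curves)')
plt.xlabel('MedInc')
plt.ylabel('Prediction')
plt.legend()
plt.show()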

PDP vs. ICE Plot

  • ICE plots show the prediction trajectories for each observation individually, helping to reveal heterogeneous effects (e.g., interactions or non-linearities in the model).
  • PDPs show the average trend, giving a smoothed picture of how the feature x_j affects predictions across the dataset as a whole.

Imagine a model that predicts car prices based on features like the car’s horsepower (HP), age, and brand.

  • ICE Plot: For the feature “horsepower,” the ICE plot would display a curve for each individual car in the dataset, showing how the predicted price changes as horsepower varies from a low to a high value. Each curve represents a different car, and these curves might show different patterns (e.g., some cars’ prices might increase sharply with horsepower, while others might have a more gradual increase or even no change).
  • PDP: The PDP for “horsepower” would be the average of all those individual curves, resulting in a single, smooth curve. This curve would show the overall relationship between horsepower and car price, on average across all cars, without revealing how individual predictions behave.

Imagine an ICE plot where 10 individual curves are shown. Some of these curves might increase sharply, others might stay flat, and a few might decrease. The PDP is the curve you get if you take the average of all these individual curves. The PDP might show a general increase, but it would smooth out the variation seen in the individual ICE curves.

Implementation

To implement Individual Conditional Expectation (ICE) plots in Python, you can use libraries like scikit-learn, pandas, and matplotlib, along with the pycebox package. Here's an example of how we can generate ICE plots.

Install required libraries

pip install scikit-learn matplotlib pandas pycebox        

Code

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor
from pycebox.ice import ice, ice_plot

# Step 1: Load California Housing dataset
data = fetch_california_housing()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

# Step 2: Train-test split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Step 3: Fit a model (e.g., DecisionTreeRegressor)
model = DecisionTreeRegressor(random_state=42)
model.fit(X_train, y_train)

# Step 4: Define a prediction function (pycebox calls it on the grid-modified data)
def predict_fn(X):
    return model.predict(X)

# Step 5: Generate ICE data for a specific feature (e.g., 'MedInc' - Median Income)
feature = 'MedInc'
ice_data = ice(X_test, feature, predict_fn, num_grid_points=50)

# Step 6: Randomly sample 100 of the ICE curves to plot (one curve per column)
sampled_ice_data = ice_data.sample(n=100, axis=1, random_state=42)

# Step 7: Plot ICE curves
fig, ax = plt.subplots(figsize=(10, 6))
ice_plot(sampled_ice_data, frac_to_plot=1.0, ax=ax) 
plt.title(f"ICE Plot for feature '{feature}'")
plt.xlabel(feature)
plt.ylabel('Predicted House Value')

# Adjust line width for all lines in the plot
for line in ax.get_lines():
    line.set_linewidth(0.5)  # Set line width to 0.5 for finer lines

plt.show()        
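
As an alternative that avoids the extra pycebox dependency, scikit-learn's built-in PartialDependenceDisplay (in sklearn.inspection since version 0.24) can draw ICE curves directly. The sketch below reuses the model and X_test fitted above; kind='individual' draws one curve per instance, while kind='both' would overlay the averaged PDP.

import matplotlib.pyplot as plt
from sklearn.inspection import PartialDependenceDisplay

fig, ax = plt.subplots(figsize=(10, 6))
PartialDependenceDisplay.from_estimator(
    model,
    X_test,
    features=['MedInc'],
    kind='individual',   # one ICE curve per instance; 'both' adds the PDP
    subsample=100,       # randomly sample 100 curves, as above
    random_state=42,
    ax=ax,
)
plt.title("ICE Plot for feature 'MedInc' (scikit-learn)")
plt.show()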

Interpretation of the Plot

Each line in the ICE plot represents the prediction for one instance as the selected feature value is varied (e.g., from low to high values).

Understanding the Axes:

  • The x-axis typically represents the values of the feature being analyzed (e.g., ‘MedInc’).
  • The y-axis shows the predicted outcome (e.g., predicted house value).

Slope Interpretation:

  • Upward Slope: Indicates a positive effect; as the feature increases, the prediction increases.
  • Downward Slope: Indicates a negative effect; as the feature increases, the prediction decreases.
  • Flat Lines: Suggest minimal or no effect of the feature on the predictions for that instance.

Variability Across Instances:

  • Divergence: If lines spread apart significantly, it indicates that different instances respond differently to changes in the feature. This suggests heterogeneity in how the feature affects predictions.
  • Convergence: If lines are close together, it suggests a more uniform effect of the feature across instances.
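
One rough way to back these visual judgments with numbers is sketched below, reusing the grid and ice_curves arrays from the hypothetical compute_ice helper earlier: the average finite-difference slope of each curve gives the direction of the effect per instance, and the standard deviation across curves at each grid value quantifies divergence.

import numpy as np

# Per-curve slopes: sign indicates the direction of the effect for each instance
slopes = np.gradient(ice_curves, grid, axis=1)  # shape: (n_instances, n_grid)
avg_slope = slopes.mean(axis=1)
print(f"Curves trending upward:   {(avg_slope > 0).sum()}")
print(f"Curves trending downward: {(avg_slope < 0).sum()}")

# Spread across instances at each grid value: large values indicate divergence
spread = ice_curves.std(axis=0)
print(f"Maximum spread across curves: {spread.max():.3f}")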

Identifying Non-Linear Effects:

  • If lines start to curve or change direction, it indicates non-linear relationships or potential interactions with other features. This means the effect of the feature is not constant across its range.

In conclusion, ICE plots significantly enhance the interpretability of complex models, bridging the gap between model predictions and real-world implications. They allow practitioners to understand and explain how specific features impact outcomes, leading to more reliable and transparent machine learning applications. Whether used for model evaluation, feature analysis, or communicating results, ICE plots are an essential tool in any machine learning toolkit.
