Innovative approach to AI project delivery with Generative AI
Introduction
Traditional AI is very effective at addressing specific use cases. A team of data scientists and domain specialists identifies the target variable and the features together, then trains narrow ML models that answer a specific problem with high precision. For example, a company that manages a set of assets may need machine learning algorithms to support its maintenance operations teams with predictive maintenance. An energy and utilities company may need machine learning algorithms to detect anomalies in the consumption data it collects (e.g., water, electricity, or gas consumption). A telco may need to predict network traffic in each country for the next period. This approach usually requires setting up a project pipeline and staffing a team that includes a project manager, a data architect, data scientists, and business domain experts, with an elapsed time of months if you need high accuracy and an industrialized pipeline.
Generative AI, on the other hand, can recognize patterns in data and surface insights and useful highlights from it, without the need to train a traditional AI model.
What about the explainability?
The topic of explainability needs to be discussed here. It has been a kind of "background noise" ever since traditional AI was first introduced into business domains. The customer's first question has always been: "How can I explain my prediction to a business expert?" Traditional rule-based systems are difficult to use in complex scenarios, but they have the advantage of being simple to explain: their output is the effect of a set of rules combined with "and" and "or" conditions. Traditional AI algorithms, on the other hand, can address more complex scenarios but are harder to explain than rule-based systems. Today, a number of frameworks exist that support the insights of traditional AI models with an explanation. Then generative AI arrived, and it can extract insights from data with a simple prompt, but:
That said, the status is:
How can generative AI help in this kind of scenario?
Approach
Generative AI has the potential to revolutionize the way data scientists work by automating various steps in the traditional AI model development process.
Impact of generative AI on the various steps of traditional AI model implementation:
That said, a platform that helps business experts and data scientists develop AI models to improve business decision-making processes can be a solution.
Demo
With this simple demonstration, I'll present a scenario where a platform that guides data scientists and business experts in developing an AI algorithm can be really impactful. The whole demonstration is based on IBM watsonx.ai, the part of the IBM watsonx platform that brings together new generative AI capabilities, powered by foundation models, and traditional machine learning into a powerful studio spanning the AI lifecycle.
Consider a scenario in which you are an energy and utilities company that needs to predict water consumption for its customers. I'll demonstrate, with a simplified example, how generative AI can support the analysis and implementation steps. In particular, generative AI will identify which variables in your dataset to use to train your model, and will train a regression model to predict water consumption.
Synthetic data generation
First of all, I'll generate a sample dataset using generative AI capabilities, with the prompt below:
ID_COUNTER | TIME | CONSUMPTION | WEATHER | AVERAGE_LAST_WEEK | NUMBER_OF_FAMILY_MEMBER
------------------------------------------------------------------------------------------------------
1 | 2021-10-01 08:00:00 | 150 | Sunny | 130 | 4
2 | 2021-10-01 09:00:00 | 120 | Cloudy | 115 | 3
3 | 2021-10-01 10:00:00 | 200 | Rainy | 190 | 5
Now that we have our dataset, we can go ahead and try out the analysis and implementation steps I described in the "Approach" section.
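To experiment with the steps that follow on your own machine, the three sample rows can be loaded into a pandas DataFrame. The snippet below is a minimal sketch: the pipe-separated layout and the helper name load_sample are my own assumptions for illustration, not something produced by watsonx.ai, and three rows are far too few to train a meaningful model.

```python
import io

import pandas as pd

# The three sample rows from the table above, rewritten as pipe-separated text.
SAMPLE = """ID_COUNTER|TIME|CONSUMPTION|WEATHER|AVERAGE_LAST_WEEK|NUMBER_OF_FAMILY_MEMBER
1|2021-10-01 08:00:00|150|Sunny|130|4
2|2021-10-01 09:00:00|120|Cloudy|115|3
3|2021-10-01 10:00:00|200|Rainy|190|5
"""


def load_sample(text: str = SAMPLE) -> pd.DataFrame:
    """Parse the sample text into a DataFrame and convert TIME to datetimes."""
    return pd.read_csv(io.StringIO(text), sep="|", parse_dates=["TIME"])


df = load_sample()
print(df.dtypes)
```

Calling df.to_csv("your_dataset.csv", index=False) afterwards would write a file with the placeholder name used in the generated code later in this article.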
Analysis and implementation
With your dataset schema in hand, you can now ask the model to identify the best set of features for training a model with maximum accuracy. For example, consider the following prompt:
Act as a data scientist. Write me a code to identify best variable to use to predict water consumption.
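If you want the model to see the schema together with the request, the prompt can be assembled from the DataFrame itself. This is only a sketch of one possible way to do it: build_feature_prompt is a hypothetical helper, the "Dataset schema" layout is my own choice, and sending the resulting text to a model is deliberately left out.

```python
import pandas as pd


def build_feature_prompt(df: pd.DataFrame, goal: str = "water consumption") -> str:
    """Combine the article's prompt with the column names and dtypes of df."""
    schema_lines = [
        f"- {name}: {dtype}" for name, dtype in df.dtypes.astype(str).items()
    ]
    return (
        "Act as a data scientist. "
        f"Write me a code to identify best variable to use to predict {goal}.\n"
        "Dataset schema:\n" + "\n".join(schema_lines)
    )


# One illustrative row with the columns of the sample dataset.
example = pd.DataFrame(
    {
        "CONSUMPTION": [150],
        "WEATHER": ["Sunny"],
        "AVERAGE_LAST_WEEK": [130],
        "NUMBER_OF_FAMILY_MEMBER": [4],
    }
)
prompt = build_feature_prompt(example)
print(prompt)
```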
The following is an example of the generated code, using the scikit-learn Python library:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.feature_selection import RFE
from sklearn.metrics import mean_squared_error
# Load your dataset (replace 'your_dataset.csv' with the name of your CSV file)
data = pd.read_csv('your_dataset.csv')
# List of potential predictor variables
variables = ['WEATHER', 'AVERAGE_LAST_WEEK', 'NUMBER_OF_FAMILY_MEMBER']
# Convert categorical variable (WEATHER) to numerical values
data['WEATHER'] = data['WEATHER'].map({'Sunny': 1, 'Cloudy': 2, 'Rainy': 3})
# Separate the target variable (CONSUMPTION) from the dataset
X = data[variables]
y = data['CONSUMPTION']
# Split the dataset into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
# Train a linear regression model and use RFE to identify the best set of variables
model = LinearRegression()
rfe = RFE(estimator=model, n_features_to_select=1)
rfe.fit(X_train, y_train)
# Get the ranking of variables
ranking = rfe.ranking_
# Print the ranking of variables
print("Variable Ranking (1=Best):")
for var, rank in zip(variables, ranking):
    print(f"{var}: {rank}")
# Train the model using the best set of variables
best_variables = [var for var, rank in zip(variables, ranking) if rank == 1]
model.fit(X_train[best_variables], y_train)
# Evaluate the model using mean squared error (MSE)
y_pred = model.predict(X_test[best_variables])
mse = mean_squared_error(y_test, y_pred)
print(f"\nBest set of variables: {best_variables}")
print(f"MSE for the best set of variables: {mse}")
As you can see, this code snippet loads your dataset, preprocesses the data (converting the categorical WEATHER variable to numerical values), and then uses recursive feature elimination (RFE) with a multiple linear regression model to identify the best set of variables for predicting water consumption (here limited to a single variable, because n_features_to_select is set to 1). It then calculates the mean squared error (MSE) for the best set of variables and prints the results.
A data scientist can then evaluate the model results and share the outcome with business teams to support a decision.
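When reviewing generated code like this, two details are worth checking. Mapping WEATHER to 1, 2, 3 imposes an order on the categories, and RFE with a linear model ranks features by coefficient size, which depends on the scale of each column. The sketch below shows one possible variant, with one-hot encoding and standardized features. It is not the output of the model: the dataset is synthetic, and the relationship between the columns is invented purely for illustration.

```python
import numpy as np
import pandas as pd
from sklearn.feature_selection import RFE
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler


def make_synthetic(n_rows: int = 200, seed: int = 0) -> pd.DataFrame:
    """Build a made-up dataset that reuses the article's column names."""
    rng = np.random.default_rng(seed)
    avg_last_week = rng.uniform(80, 220, size=n_rows)
    family = rng.integers(1, 7, size=n_rows)
    # Invented relationship: consumption follows last week's average,
    # plus a per-person term and random noise. WEATHER has no effect here.
    consumption = 0.9 * avg_last_week + 5.0 * family + rng.normal(0, 5, size=n_rows)
    return pd.DataFrame(
        {
            "WEATHER": rng.choice(["Sunny", "Cloudy", "Rainy"], size=n_rows),
            "AVERAGE_LAST_WEEK": avg_last_week,
            "NUMBER_OF_FAMILY_MEMBER": family,
            "CONSUMPTION": consumption,
        }
    )


def rank_features(data: pd.DataFrame) -> pd.Series:
    """Rank predictors with RFE after one-hot encoding and standardization."""
    predictors = data[["WEATHER", "AVERAGE_LAST_WEEK", "NUMBER_OF_FAMILY_MEMBER"]]
    # One-hot encode WEATHER; drop_first avoids a redundant dummy column.
    X = pd.get_dummies(predictors, columns=["WEATHER"], drop_first=True, dtype=float)
    # Standardize so that coefficient magnitudes are comparable across columns.
    X_scaled = StandardScaler().fit_transform(X)
    rfe = RFE(estimator=LinearRegression(), n_features_to_select=1)
    rfe.fit(X_scaled, data["CONSUMPTION"])
    return pd.Series(rfe.ranking_, index=X.columns).sort_values()


ranking = rank_features(make_synthetic())
print(ranking)
```

On real data the ranking will of course differ; the point is only that encoding and scaling choices can change which variable RFE reports as best.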
Conclusion
In this article, I described an innovative approach to delivering AI projects, in which generative AI technology can boost the productivity and efficiency of business teams and data scientists in answering complex business questions. In particular, generative AI can support all the steps of an artificial intelligence project, from analysis to design, model implementation, and documentation.
Note that this is just an example to illustrate the power of the approach; it makes sense to think about an end-to-end solution that supports these activities at enterprise scale.
#generativeai #datascience #datasciencesuperpower #watsonx