How to Build a Streamlit App for Favorita Grocery Sales Forecasting Using Regression Model

Are you interested in predicting future grocery sales for a retail corporation? In this article, we'll walk you through how to build a Streamlit app around a regression model trained on the Favorita Grocery Sales dataset. You can also find more data science and machine learning projects on my GitHub.

Data Description

The Favorita Grocery Sales dataset consists of transactional records from a retail corporation in Ecuador, covering a period of five years. The data contains information about store locations, item descriptions, on-shelf dates, promotions, and unit sales. The dataset was released for a Kaggle forecasting competition, whose goal is to predict the unit sales for a set of test items and stores.

Model Training

To train our regression model, we used a combination of feature engineering and XGBoost regression. We started by cleaning the data, removing duplicates and missing values, and then engineered new features such as day of the week, month, and year. We also used one-hot encoding to convert categorical variables into binary features.
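
The cleaning and feature engineering steps described above can be sketched roughly as follows. This is a minimal illustration, not the exact pipeline used for the project: the column names (`store_nbr`, `family`, `unit_sales`) and the sample rows are assumptions made up for the example.

```python
import pandas as pd

# Hypothetical sample rows; column names and values are assumptions
# chosen for illustration, not taken from the real dataset files.
raw = pd.DataFrame({
    "date": ["2017-08-14", "2017-08-14", "2017-08-15", "2017-08-16"],
    "store_nbr": [1, 1, 2, 2],
    "family": ["DAIRY", "DAIRY", "BEVERAGES", None],
    "unit_sales": [12.0, 12.0, 30.0, 7.0],
})


def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    # Remove exact duplicate rows and rows with missing values.
    out = df.drop_duplicates().dropna().copy()
    # Parse the date column and derive calendar features from it.
    out["date"] = pd.to_datetime(out["date"])
    out["year"] = out["date"].dt.year
    out["month"] = out["date"].dt.month
    out["weekday"] = out["date"].dt.weekday  # Monday = 0
    # One-hot encode the categorical column into binary indicator columns.
    out = pd.get_dummies(out, columns=["family"], dtype=int)
    # The raw date is no longer needed once the calendar features exist.
    return out.drop(columns=["date"]).reset_index(drop=True)


features = engineer_features(raw)
print(features)
```

Here the duplicate row and the row with a missing category are dropped before any features are derived, so the encoder never sees incomplete records.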

After feature engineering, we split the data into training and validation sets, trained an XGBoost regression model on the training data, and tuned the hyperparameters using grid search. Finally, we evaluated the model on the validation set and calculated the root mean squared logarithmic error (RMSLE) to measure the performance of the model.
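
To make the evaluation metric concrete, here is a small sketch of how RMSLE can be computed by hand. It uses only the standard library; clipping negative predictions to zero is an assumption of this sketch (the logarithm is undefined for values at or below -1), not something stated about the original training code.

```python
import math


def rmsle(y_true, y_pred):
    """Root mean squared logarithmic error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or len(y_true) == 0:
        raise ValueError("y_true and y_pred must be non-empty and the same length")
    squared_log_errors = []
    for actual, predicted in zip(y_true, y_pred):
        # Assumption: negative predictions are clipped to zero before the log.
        predicted = max(predicted, 0.0)
        # log1p(x) computes log(1 + x), which is defined for zero sales.
        squared_log_errors.append((math.log1p(predicted) - math.log1p(actual)) ** 2)
    return math.sqrt(sum(squared_log_errors) / len(squared_log_errors))


# Hypothetical validation values, for illustration only.
actual_sales = [3.0, 0.0, 10.0]
predicted_sales = [2.5, 0.0, 12.0]
print(rmsle(actual_sales, predicted_sales))
```

Because the error is taken on a log scale, RMSLE penalizes relative differences rather than absolute ones, which suits sales data where volumes vary widely between items. In practice, scikit-learn's `mean_squared_log_error` can be used to compute the squared version of this metric.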

Streamlit App Development

To develop our Streamlit app, we started by importing the necessary libraries, loading the trained model and encoder, and defining the input and output interfaces for the app. We then defined the prediction function, which takes user inputs, preprocesses them using the encoder, and feeds them into the trained model to make a prediction.

# Import necessary libraries
import streamlit as st
import pickle
import pandas as pd
from sklearn.preprocessing import LabelEncoder
from xgboost import XGBRegressor

# Load the trained model and encoder (context managers close the files)
with open("model.pkl", "rb") as f:
    model = pickle.load(f)
with open("encoder.pkl", "rb") as f:
    encoder = pickle.load(f)

# Define the input and output interfaces for the Streamlit app
st.title("Favorita Grocery Sales Forecasting")
store_item_id = st.text_input("Store Item ID", "0_0")
date = st.date_input("Date")
onpromotion = st.selectbox("On Promotion", ["True", "False"])

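# Define the prediction function
# st.cache_data replaces the deprecated st.cache (requires Streamlit >= 1.18)
@st.cache_data
def predict_sales(store_item_id, date, onpromotion):
    df = pd.DataFrame({"store_item_id": [store_item_id],
                       # st.date_input returns a datetime.date; convert it so .dt works
                       "date": [pd.to_datetime(date)],
                       "onpromotion": [onpromotion]})
    # Split "store_item" into two columns; expand=True returns a DataFrame
    df[["store_id", "item_id"]] = df["store_item_id"].str.split("_", n=1, expand=True)
    # Assumption: the model was trained on integer store and item IDs
    df[["store_id", "item_id"]] = df[["store_id", "item_id"]].astype(int)
    df["year"] = df["date"].dt.year
    df["month"] = df["date"].dt.month
    df["day"] = df["date"].dt.day
    df["weekday"] = df["date"].dt.weekday
    # LabelEncoder.transform expects a 1-D input, so pass the Series itself
    df["onpromotion"] = encoder.transform(df["onpromotion"])
    df = df.drop(columns=["store_item_id", "date"])
    prediction = model.predict(df)
    return float(prediction[0])
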

# Call the prediction function and display the output
if st.button("Predict Sales"):
    prediction = predict_sales(store_item_id, date, onpromotion)
    st.write("Predicted Unit Sales: ", prediction)
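
One way to check the preprocessing without launching Streamlit or loading the pickled model is to factor it into a pure function. The sketch below is an illustration under assumptions: `build_features` and `promo_code` are hypothetical names, `promo_code` stands in for the encoder's output, and the integer ID columns assume the model was trained on numeric IDs.

```python
import datetime

import pandas as pd


def build_features(store_item_id: str, date: datetime.date, promo_code: int) -> pd.DataFrame:
    """Turn raw app inputs into a one-row feature frame (hypothetical helper)."""
    # "3_105" -> store "3", item "105"; split only on the first underscore.
    store_id, item_id = store_item_id.split("_", 1)
    ts = pd.Timestamp(date)
    # Column order mirrors the frame that predict_sales passes to the model.
    return pd.DataFrame({
        "onpromotion": [promo_code],
        "store_id": [int(store_id)],
        "item_id": [int(item_id)],
        "year": [ts.year],
        "month": [ts.month],
        "day": [ts.day],
        "weekday": [ts.weekday()],  # Monday = 0
    })


row = build_features("3_105", datetime.date(2017, 8, 15), promo_code=1)
print(row.to_dict("records")[0])
```

Keeping this step free of Streamlit calls means it can be exercised in a plain Python session or a unit test before being wired into the app.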


Results

Our Streamlit app allows you to input a store item ID, date, and promotion status, and receive a prediction of the unit sales for that item and store. The app preprocesses your inputs and feeds them into the trained regression model to produce the prediction.
