A Beginner's Guide to Machine Learning for IoT Controls (MLoT) Part-1
Ugochukwu Francis Okechukwu
Engineering Leader | Cloud Solutions Architect | IoT Enthusiast | Ex Google | MS in Computer Science
September 29, 2023 | youtube.com/@codewithfrancis
Simplified Introduction to the Machine Learning of Things (MLoT) with ESP32, Python & NodeJS:
Abstract
This article is a light introduction to Machine Learning (ML) and its application to the Internet of Things (IoT). It can serve as a guide for software engineers and hobbyists who are looking into applying ML in the development of smart devices. It covers some basics of ML and IoT, and provides code snippets and samples for building a simple MLoT system that can control a user's smart home cooling system. It is meant as light reading for the Code With Francis YouTube channel.
Project/Coding Resources:
Introduction
As the world becomes increasingly interconnected through the Internet of Things (IoT), billions of devices and sensors gather, transmit and process data across sectors such as smart agriculture, cities, manufacturing and homes. This creates a need for an ecosystem that can master the control complexities of these systems and enable smarter decision-making, automation, and optimization. Machine Learning (ML) enhances IoT control by using historical data to predict outcomes more accurately without being explicitly programmed to do so.
Put simply, applying ML to a smart system can make that system smarter.
In general, ML algorithms can be used to analyze large amounts of data from IoT devices to identify patterns and trends. These algorithms can then provide output (target) predictions that can be used to automate tasks, improve decision-making, and optimize system performance.
Benefits of using ML for IoT control
ML systems usually rely on models (mathematical functions), sometimes called "black boxes", to make predictions.
What is an ML model?
A machine learning (ML) model is a mathematical function that has been trained on data to perform a specific task.
ML models can be trained using a variety of algorithms.
Once a model has been trained, you can use just the model to make predictions for new data.
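To make the idea of "training, then predicting with just the model" concrete, here is a minimal sketch in plain Python. It is not the model used later in this article: the toy temperature data, the function names and the single-threshold "model" are all assumptions chosen for illustration.

```python
# Hypothetical toy data: (room temperature in deg C, fan state chosen by the user).
training_data = [(21.0, 0), (23.5, 0), (25.0, 0), (26.5, 0),
                 (28.0, 1), (29.5, 1), (31.0, 1), (33.0, 1)]


def train_threshold_model(samples):
    """Learn the temperature threshold that misclassifies the fewest samples."""
    temps = sorted(t for t, _ in samples)
    # Candidate thresholds sit halfway between neighbouring temperatures.
    candidates = [(a + b) / 2 for a, b in zip(temps, temps[1:])]

    def errors(threshold):
        # Count samples where "temperature above threshold" disagrees with the label.
        return sum((t > threshold) != bool(label) for t, label in samples)

    return min(candidates, key=errors)


def predict(threshold, temperature):
    """The trained 'model' is just a function of its learned parameter."""
    return 1 if temperature > threshold else 0


threshold = train_threshold_model(training_data)  # training step
print(threshold)                                  # the learned parameter
print(predict(threshold, 22.0))                   # prediction for new data
print(predict(threshold, 30.0))
```

After training, only `threshold` is needed to make predictions; the training data can be discarded. Real models such as a random forest work the same way, only with many more learned parameters.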
Building an ML Model
Some ML Training Algorithms
Here is a list of some ML Training Algorithms.
For more in-depth study, here are some practical courses:
Our Problem Statement
Francis :-) lives in a two-room apartment with his very furry dog Parsons. Unfortunately, Francis does not own a central air conditioning system, but has cooling fans in both rooms. It is summer, and Francis understands that Parsons will need his room cooled during the daytime on weekdays, especially while Francis is at work. In this exercise, we will design an MLoT system for Francis that can learn from his personal room fan settings ("Y") and then make control predictions based on input from the control sensors ("X") and time settings in his apartment.
System Design Assumptions
For the purpose of this tutorial, we will assume:
Simple System Design with ESP-32 and MQTT
Circuit Diagram
LED is used to represent Fan on/off state
For circuit simulator and code, visit https://wokwi.com/projects/376687507269195777, or https://wokwi.com/projects/376642091203228673
ML training code (python): colab-code-link
Run the colab link above to build and export the custom MLoT model
import gspread
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from google.colab import auth
from google.auth import default
Gather & Load Historical Data
# auth/load google credentials
auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)
# Open the Google Sheet by its URL (ensure it's publicly accessible)
sheet_url = 'https://docs.google.com/spreadsheets/d/1SLwix8SFx3VUShDSNgh8-uu1h2f-UulPTu2t5CXWRnM/edit' # Replace with your Google Sheet URL
# Authenticate and open the spreadsheet
spreadsheet = gc.open_by_url(sheet_url)
# Select a specific worksheet
worksheet = spreadsheet.worksheet("sample-data-fan-control")
# Get all values from the worksheet
values = worksheet.get_all_values()
# Convert to a Pandas DataFrame
df = pd.DataFrame(values[1:], columns=values[0]) # Assuming the first row contains column headers
Clean, Process and Explore Data
# clean up data
weekday_map = {'sunday': 0, 'monday': 1, 'tuesday': 2, 'wednesday': 3, 'thursday': 4, 'friday': 5, 'saturday': 6}
df['day'] = df['day'].map(weekday_map)
truth_map = {'FALSE': 0, 'TRUE': 1}
df['studio_motion'] = df['studio_motion'].map(truth_map)
df['dog_motion'] = df['dog_motion'].map(truth_map)
on_off_map = {'off': 0, 'on': 1}
df['studio_fan'] = df['studio_fan'].map(on_off_map)
df['dog_fan'] = df['dog_fan'].map(on_off_map)
# Function to convert a time stamp string (HH:MM:SS) to seconds
def time_to_seconds(time_str):
    hours, minutes, seconds = map(int, time_str.split(':'))
    return hours * 3600 + minutes * 60 + seconds
# Convert the 'time' column to seconds since midnight
df['time'] = df['time'].apply(time_to_seconds)
# -------------------------------------------
# separate studio fan and dog fan data
# select studio fan and dog fan state DataFrame
df_st = df[['time','day','studio_motion', 'studio_temp','studio_fan']]
df_do = df[['time','day','dog_motion', 'dog_temp','dog_fan']]
# -------------------------------------------
# separate features/Inputs (X) and targets (classes: y)
# condition analysis (target data, Class)
y_st = df_st['studio_fan']
y_do = df_do['dog_fan']
# independent variables (Features/Input)
X_st = df_st.drop(['studio_fan'], axis = 1)
X_do = df_do.drop(['dog_fan'], axis = 1)
# -------------------------------------------
# split training and test data
X_train_st, X_test_st, y_train_st, y_test_st = train_test_split(X_st, y_st, test_size=0.20, random_state=91)
X_train_do, X_test_do, y_train_do, y_test_do = train_test_split(X_do, y_do, test_size=0.20, random_state=91)
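One detail worth checking in the cleaning step: the worksheet values arrive as strings, and mapping a pandas column with a dictionary leaves a missing value (NaN) wherever a cell does not match a key exactly, for example 'Monday' instead of 'monday'. The sketch below shows the same conversions on a single row in plain Python, normalising case and whitespace first and failing loudly on anything unexpected. The sample row and the `clean_row` helper are hypothetical and only cover the studio columns.

```python
WEEKDAY_MAP = {'sunday': 0, 'monday': 1, 'tuesday': 2, 'wednesday': 3,
               'thursday': 4, 'friday': 5, 'saturday': 6}
TRUTH_MAP = {'FALSE': 0, 'TRUE': 1}
ON_OFF_MAP = {'off': 0, 'on': 1}


def time_to_seconds(time_str):
    """Convert 'HH:MM:SS' to seconds since midnight."""
    hours, minutes, seconds = map(int, time_str.split(':'))
    return hours * 3600 + minutes * 60 + seconds


def clean_row(raw):
    """Convert one raw sheet row (all strings) into numeric values.

    A KeyError or ValueError is raised for unexpected cell contents,
    instead of silently producing a missing value.
    """
    return {
        'time': time_to_seconds(raw['time'].strip()),
        'day': WEEKDAY_MAP[raw['day'].strip().lower()],
        'studio_motion': TRUTH_MAP[raw['studio_motion'].strip().upper()],
        'studio_temp': float(raw['studio_temp']),
        'studio_fan': ON_OFF_MAP[raw['studio_fan'].strip().lower()],
    }


# Hypothetical raw row with inconsistent casing and stray whitespace.
raw_row = {'time': '13:45:30', 'day': 'Monday ', 'studio_motion': 'true',
           'studio_temp': '29.5', 'studio_fan': 'On'}
cleaned = clean_row(raw_row)
print(cleaned)
```

In the pandas version, a quick `df.isna().sum()` after the mapping step is a simple way to spot cells that did not match any key.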
Model Selection, Evaluation and Training
# init and train RandomForestClassifier for Studio Fan Model
# time, day, studio_motion, studio_temp
class_weights = {0: 10, 1: 27, 2: 4, 3: 40} # Data Tuning: Assign a higher weight to classes with higher importance
model_st = RandomForestClassifier(oob_score=True, max_depth=8, random_state=90, class_weight=class_weights)
model_st.fit(X_train_st,y_train_st)
# Calculate estimated OOB (Out Of Bag) score
""" The OOB score can serve as a useful estimate of how well your random forest model is likely
to perform on unseen data without the need for a separate validation set.
However, it's still a good practice to use additional evaluation techniques
like cross-validation to assess the model's performance thoroughly. """
print(f'Studio Fan OOB SCORE: {model_st.oob_score_}')
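Where do "out of bag" rows come from? Each tree in a random forest is trained on a bootstrap sample, that is, rows drawn with replacement from the training set. Some rows are drawn more than once and others not at all; the rows a tree never saw are its out-of-bag rows, and they can be used to score that tree. For a large dataset roughly 1/e (about 37%) of the rows end up out of bag. The following sketch simulates one bootstrap sample in plain Python; the row count and seed are arbitrary assumptions.

```python
import random

random.seed(91)  # arbitrary seed so the run is repeatable

n_rows = 1000  # hypothetical training-set size

# A bootstrap sample draws n_rows row indices *with replacement*.
bootstrap = [random.randrange(n_rows) for _ in range(n_rows)]

in_bag = set(bootstrap)                            # rows this tree would train on
out_of_bag = set(range(n_rows)) - in_bag           # rows this tree never saw
oob_fraction = len(out_of_bag) / n_rows

print(f'unique in-bag rows: {len(in_bag)}')
print(f'out-of-bag fraction: {oob_fraction:.3f}')  # close to 1/e for large n_rows
```

Because every tree has its own out-of-bag rows, the forest can estimate its accuracy on unseen data without holding back a separate validation set.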
Tuning and Evaluation
# -------------------------------------------
# init and train RandomForestClassifier for Dog Fan Model
# time, day, dog_motion, dog_temp
class_weights = {0: 30, 1: 20, 2: 4, 3: 40} # Data Tuning: Assign a higher weight to classes with higher importance
model_do = RandomForestClassifier(oob_score=True, max_depth=8, random_state=91, class_weight=class_weights)
model_do.fit(X_train_do,y_train_do)
# Calculate estimated OOB (Out Of Bag) score
""" The OOB score can serve as a useful estimate of how well your random forest model is likely
to perform on unseen data without the need for a separate validation set.
However, it's still a good practice to use additional evaluation techniques
like cross-validation to assess the model's performance thoroughly. """
print(f'Dog Fan OOB SCORE: {model_do.oob_score_}')
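The comments above recommend cross-validation as an additional check. A minimal sketch with scikit-learn's `cross_val_score` could look like the following. The data here is synthetic stand-in data, not the article's Google Sheet, and the rule "fan on above 27 degrees" is an assumption made only so the example has something to learn.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(91)

# Synthetic stand-in features in the same order as the studio model:
# time (seconds since midnight), day (0-6), motion flag, temperature.
n = 200
X = np.column_stack([
    rng.integers(0, 86400, n),   # time
    rng.integers(0, 7, n),       # day
    rng.integers(0, 2, n),       # motion
    rng.uniform(18, 35, n),      # temperature
])
# Assumed rule for the synthetic target: fan is on when it is warm.
y = (X[:, 3] > 27).astype(int)

model = RandomForestClassifier(max_depth=8, random_state=90)

# cv=5 splits the data into 5 folds; each fold is used once for testing
# while the model is retrained on the other 4. Scores are accuracies.
scores = cross_val_score(model, X, y, cv=5)
print(f'fold accuracies: {scores}')
print(f'mean accuracy: {scores.mean():.3f} (+/- {scores.std():.3f})')
```

With the real data, `X_st` and `y_st` (or `X_do` and `y_do`) would take the place of the synthetic `X` and `y`. A large spread between folds is a hint that the dataset is small or unevenly distributed.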
Model Validation
# Calculate Accuracy
from sklearn.metrics import accuracy_score, classification_report
# calculate studio fan state accuracy
y_pred_st = model_st.predict(X_test_st)
accuracy_st = accuracy_score(y_test_st, y_pred_st)
print(f'Studio Fan Accuracy: {accuracy_st}')
# calculate dog fan state accuracy
y_pred_do = model_do.predict(X_test_do)
accuracy_do = accuracy_score(y_test_do, y_pred_do)
print(f'Dog Fan Accuracy: {accuracy_do}')
# -------------------------------------------
# calculate and visualize the confusion matrix
""" A confusion matrix is a table used in machine learning to evaluate the performance of a classification model, showing true positives, true negatives, false positives, and false negatives."""
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix
def plot_confusion_matrix(y_test, y_pred, label_name):
    # Generate a confusion matrix
    confusion = confusion_matrix(y_test, y_pred)
    # Calculate percentages for each cell in the confusion matrix
    total_samples = np.sum(confusion)
    confusion_percent = (confusion / total_samples) * 100
    # Plot the confusion matrix with percentages
    plt.figure(figsize=(4, 3))
    sns.heatmap(confusion_percent, annot=True, fmt='.2f', cmap='Blues', cbar=False, square=True)
    plt.xlabel('Predicted Labels : ' + label_name)
    plt.ylabel('Actual Labels')
    plt.title('Confusion Matrix (Percentages)')
    plt.show()
plot_confusion_matrix(y_test_st, y_pred_st, "Studio Fan Model")
plot_confusion_matrix(y_test_do, y_pred_do, "Dog Fan Model")
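To read the plotted matrices, it helps to see how the four cells are counted. For binary labels 0 (off) and 1 (on), scikit-learn's `confusion_matrix` puts actual labels on the rows and predicted labels on the columns, giving the layout [[TN, FP], [FN, TP]]. The sketch below counts the same cells by hand on a small made-up example; the label lists are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count the four confusion-matrix cells for binary labels 0/1."""
    pairs = list(zip(y_true, y_pred))
    return {
        'tn': sum(t == 0 and p == 0 for t, p in pairs),  # off, predicted off
        'fp': sum(t == 0 and p == 1 for t, p in pairs),  # off, predicted on
        'fn': sum(t == 1 and p == 0 for t, p in pairs),  # on, predicted off
        'tp': sum(t == 1 and p == 1 for t, p in pairs),  # on, predicted on
    }


# Hypothetical actual and predicted fan states.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

c = confusion_counts(y_true, y_pred)
accuracy = (c['tp'] + c['tn']) / len(y_true)
precision = c['tp'] / (c['tp'] + c['fp'])  # how often a predicted "on" was right
recall = c['tp'] / (c['tp'] + c['fn'])     # how many actual "on" states were found

print(c)
print(accuracy, precision, recall)
```

For a fan controller the two error types have different costs: a false negative leaves a room warm, while a false positive only wastes some electricity, so it can be worth looking at recall and not just accuracy.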
Model Download and Deployment
import joblib
from google.colab import files
# Download Studio Fan Model
joblib.dump(model_st, "studio_fan_model.joblib")
files.download("studio_fan_model.joblib")
# Download Dog Fan Model
joblib.dump(model_do, "dog_fan_model.joblib")
files.download("dog_fan_model.joblib")
With our model trained and downloaded, we can now deploy our model to our ML API service.
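The API service code itself is not shown in this article, so the following is only a sketch of the prediction step such a service might perform, under stated assumptions. The key point is that the features must be rebuilt in the same order and encoding used during training (time in seconds, day with sunday = 0, motion flag, temperature). The function names, the `reading` payload shape and the `StubModel` class are hypothetical; in a real service the model would be loaded with `joblib.load("studio_fan_model.joblib")`.

```python
from datetime import datetime

# Same column order as X_st during training.
FEATURE_ORDER = ['time', 'day', 'studio_motion', 'studio_temp']


def build_features(reading, now):
    """Turn a sensor reading and a timestamp into one feature row."""
    row = {
        'time': now.hour * 3600 + now.minute * 60 + now.second,
        # datetime.weekday() uses monday = 0; the training data uses sunday = 0.
        'day': (now.weekday() + 1) % 7,
        'studio_motion': int(bool(reading['motion'])),
        'studio_temp': float(reading['temp']),
    }
    return [row[name] for name in FEATURE_ORDER]


def predict_fan_state(model, reading, now):
    """Return 'on' or 'off' for any model object exposing .predict(rows)."""
    prediction = model.predict([build_features(reading, now)])[0]
    return 'on' if int(prediction) == 1 else 'off'


class StubModel:
    """Stand-in for the trained classifier: fan on above 27 degrees."""

    def predict(self, rows):
        return [1 if row[3] > 27 else 0 for row in rows]


now = datetime(2023, 9, 29, 13, 45, 30)        # a Friday afternoon
reading = {'motion': True, 'temp': 29.5}       # hypothetical sensor payload
features = build_features(reading, now)
state = predict_fan_state(StubModel(), reading, now)
print(features, state)
```

Keeping the feature construction in one small function makes it easy to test separately from the web framework and from the MQTT messaging.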
Running the Simple MLoT System
uvicorn main:app --reload
Conclusion
The application of Machine Learning in IoT is revolutionizing the smart-things industry. The ability to use data-driven insights to predict future trends, behaviors and outcomes, and so give customers fine-tuned personalized control and recommendations, is a game changer that can improve customers' lives. As IoT continues to evolve, the synergy between ML and IoT technologies will play a pivotal role in shaping a smarter, more connected world.
It is worth noting that challenges remain, such as data privacy and security, model development and deployment, and real-time performance; however, solving these issues unlocks the full potential of MLoT.
The possibilities for future innovation are endless, with applications in smart homes, smart cities, smart manufacturing, smart industry, smart healthcare, and more.
A world where devices and ML systems work together seamlessly could be of great benefit to society.