A Beginner's Guide to Machine Learning for IoT Controls (MLoT) Part-1
U. Francis Okechukwu

September 29, 2023 | youtube.com/@codewithfrancis

Simplified Introduction to the Machine Learning of Things (MLoT) with ESP32, Python & NodeJS:


Abstract

This article is a light introduction to Machine Learning (ML) and its application to the Internet of Things (IoT). It can serve as a guide for software engineers and hobbyists who are looking into applying ML in the development of smart devices. It covers some basics of ML and IoT, and provides code snippets and samples to build a simple MLoT system that can control a user's smart home cooling system. It is meant to be light reading for the Code With Francis YouTube channel.

Project/Coding Resources:

Introduction

As the world becomes increasingly interconnected through the Internet of Things (IoT), billions of devices and sensors gather, transmit and process data across sectors such as smart agriculture, cities, manufacturing and homes. This creates a need for an ecosystem that can manage the control complexity of these systems and enable smarter decision-making, automation and optimization. Machine Learning (ML) enhances IoT control by learning from historical data to predict outcomes more accurately without being explicitly programmed to do so.

In short, applying ML to a smart system can make that system smarter.

In general, ML algorithms can be used to analyze large amounts of data from IoT devices to identify patterns and trends. These algorithms can then provide output (target) predictions that can be used to automate tasks, improve decision-making, and optimize system performance.


Benefits of using ML for IoT control

  • Improved customer satisfaction: Integrating ML into IoT systems can offer users personalized controls and recommendations that are fine-tuned to the preferences of a user or a group of users.
  • Reduced costs: When applied properly, ML can greatly reduce an IoT system's operational, energy and preventive maintenance costs.
  • Improved productivity and system efficiency: Applying ML to IoT system optimization and control automation can greatly improve the system's productivity and efficiency.
  • Enhanced system security and safety: ML can be used to develop secure IoT systems that detect and respond to external threats in real time, thereby reducing security and safety risks.

ML systems usually rely on models (mathematical functions), sometimes called “black boxes”, to make predictions.


What is an ML model?

A machine learning (ML) model is a mathematical function that has been trained on data to perform a specific task.

ML models can be trained using a variety of algorithms.

Once a model has been trained, you can use just the model to make predictions for new data.
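
To make the idea of "a function trained on data" concrete, here is a minimal sketch in plain Python. It is not the algorithm used later in this article: it learns a single temperature threshold from made-up readings, and the function name train_threshold_model is hypothetical.

```python
# Toy training data (made-up values): room temperature in Celsius and fan state (0 = off, 1 = on)
temps = [21.0, 22.5, 24.0, 25.5, 27.0, 28.5, 30.0]
fan_on = [0, 0, 0, 1, 1, 1, 1]

def train_threshold_model(xs, ys):
    """Pick the temperature threshold that misclassifies the fewest training rows."""
    best_threshold, best_errors = None, len(xs) + 1
    for candidate in sorted(xs):
        # Count rows where "temperature >= candidate" disagrees with the recorded fan state
        errors = sum((x >= candidate) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = candidate, errors
    # The "model" is just a function that remembers the learned threshold
    return lambda x: int(x >= best_threshold)

model = train_threshold_model(temps, fan_on)
print(model(23.0), model(29.0))  # prints: 0 1
```

After training, only the returned function is needed to predict the fan state for a new temperature; the training data is no longer required.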


Building an ML Model

Steps to build an ML model

  1. Problem Definition. What am I trying to solve? What should my model do?
  2. Gather Historical Data. Gather historical datasets that are relevant to your problem statement. This data should represent the problem statement and include features "X" (independent variables, input variables) and class labels "Y" (output, target, output variable).
  3. Clean, Process and Explore Data. Handle missing data, remove outliers, convert the data into an acceptable format and then split it into training and testing data sets.
  4. Model Selection, Evaluation and Training. Select an appropriate algorithm, feed your training features (X) and labels (Y) into the selected ML algorithm and allow it to learn from the data.
  5. Tuning, Evaluation and Validation. Evaluate your model's output and its performance on the test-set data that was split out in step 3. This assesses how well the model will generalize to new data. If the result does not meet your expectations, return to step 3 or adjust the ML algorithm parameters selected in step 4. Continue the tuning and evaluation process until the performance is acceptable.
  6. Model Deployment. With an acceptable performance determined, deploy the ML model so it can make predictions with new datasets.
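
Steps 3 and 5 above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the rows are made up, the helper names split_train_test and accuracy are hypothetical, and a fixed rule stands in for a trained model.

```python
import random

def split_train_test(rows, test_fraction=0.2, seed=91):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

def accuracy(model, rows):
    """Fraction of rows where the model's prediction matches the label."""
    correct = sum(model(x) == y for x, y in rows)
    return correct / len(rows)

# Made-up (temperature, fan_state) rows: the fan is on from 26 degrees upward
rows = [(20.0 + i, int(20.0 + i >= 26.0)) for i in range(10)]
train_rows, test_rows = split_train_test(rows)

def rule(temp):
    """Stand-in for a trained model."""
    return int(temp >= 26.0)

print(len(train_rows), len(test_rows), accuracy(rule, test_rows))  # prints: 8 2 1.0
```

The test rows are never shown to the model during training, which is why their accuracy says something about how the model may behave on new data.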


Deployed ML Model

Some ML Training Algorithms

Here is a list of some ML Training Algorithms.

ML Training Algorithms

For more in-depth study, here are some practical courses:


Our Problem Statement

Francis :-) lives in a two-room apartment with his very furry dog Parsons. Unfortunately, Francis does not own a central air conditioning system, but has cooling fans in both rooms. It is summer, and Francis understands that Parsons will need his room cooled during daytime on the weekdays especially when he is at work. In this exercise, we will design an MLoT system for Francis that can learn from his personal room fan settings ("Y") and then make control predictions based on input from the control sensors ("X") and time settings in his apartment.

System Design Assumptions

Simple System Overview

For the purpose of this tutorial, we will assume:

  • We have collected sensors (features) & fan (target) data in sample-data-link
  • Data collected includes the features and targets below.
    Features: time, period of day, weekday, studio_motion (motion detected in Francis's room), dog_motion (motion detected in Parsons's room), studio_temp (Francis's room temperature in Celsius), dog_temp (Parsons's room temperature in Celsius).
    Targets: studio_fan on/off state, dog_fan on/off state.

sample data image

  • Our IoT system uses the ESP-32 microcontroller and MQTT to transmit data to and receive data from the ML API Service. To reduce the workload on the ESP-32, an MQTT Event Hub / Publisher was added to the system to forward messages from the ESP-32 to our ML API Service.
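
As a rough illustration of the message flow described above, the sketch below shows one way a sensor snapshot could be encoded as JSON before being published over MQTT, and decoded again on the receiving side. The topic name and the payload layout are assumptions made for this example; the actual project code may use different ones. Only the field names are taken from the sample data.

```python
import json

# Hypothetical topic name, used only for this illustration
SENSOR_TOPIC = "home/sensors/state"

# Field names follow the columns of the sample data
REQUIRED_FIELDS = {"time", "day", "studio_motion", "dog_motion", "studio_temp", "dog_temp"}

def build_payload(reading):
    """Serialize one sensor snapshot (a dict) as the JSON text a device could publish."""
    return json.dumps(reading)

def parse_payload(payload):
    """Decode a received message and check that every expected field is present."""
    reading = json.loads(payload)
    missing = REQUIRED_FIELDS - set(reading)
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    return reading

reading = {
    "time": "14:30:00", "day": "tuesday",
    "studio_motion": False, "dog_motion": True,
    "studio_temp": 24.0, "dog_temp": 29.5,
}
message = build_payload(reading)
print(SENSOR_TOPIC, message)
print(parse_payload(message) == reading)  # prints: True
```

Validating the fields on arrival keeps a malformed message from reaching the model with missing inputs.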

Simple System Design with ESP-32 and MQTT

Smart MLoT System Diagram

Circuit Diagram

LED is used to represent Fan on/off state

Circuit Diagram

For circuit simulator and code, visit https://wokwi.com/projects/376687507269195777, or https://wokwi.com/projects/376642091203228673


ML training code (python): colab-code-link

Run the colab link above to build and export the custom MLoT model

import gspread
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from google.colab import auth
from google.auth import default        

Gather & Load Historical Data

# auth/load google credentials
auth.authenticate_user()
creds, _ = default()
gc = gspread.authorize(creds)

# Open the Google Sheet by its URL (ensure it's publicly accessible)
sheet_url = 'https://docs.google.com/spreadsheets/d/1SLwix8SFx3VUShDSNgh8-uu1h2f-UulPTu2t5CXWRnM/edit'  # Replace with your Google Sheet URL

# Authenticate and open the spreadsheet
spreadsheet = gc.open_by_url(sheet_url)

# Select a specific worksheet
worksheet = spreadsheet.worksheet("sample-data-fan-control")

# Get all values from the worksheet
values = worksheet.get_all_values()

# Convert to a Pandas DataFrame
df = pd.DataFrame(values[1:], columns=values[0])  # Assuming the first row contains column headers        

Clean, Process and Explore Data

# clean up data
weekday_map = {'sunday': 0, 'monday': 1, 'tuesday': 2, 'wednesday': 3, 'thursday': 4, 'friday': 5, 'saturday': 6}
df['day'] = df['day'].map(weekday_map)

truth_map = {'FALSE': 0, 'TRUE': 1}
df['studio_motion'] = df['studio_motion'].map(truth_map)
df['dog_motion'] = df['dog_motion'].map(truth_map)

on_off_map = {'off': 0, 'on': 1}
df['studio_fan'] = df['studio_fan'].map(on_off_map)
df['dog_fan'] = df['dog_fan'].map(on_off_map)

# Function to convert time stamp string to seconds
def time_to_seconds(time_str):
    hours, minutes, seconds = map(int, time_str.split(':'))
    return hours * 3600 + minutes * 60 + seconds

# Convert the 'time' column to seconds since midnight
df['time'] = df['time'].apply(time_to_seconds)

# get_all_values() returns every cell as a string, so convert the temperatures to numbers
df['studio_temp'] = pd.to_numeric(df['studio_temp'])
df['dog_temp'] = pd.to_numeric(df['dog_temp'])

# -------------------------------------------
# separate studio fan and dog fan data
# select studio fan and dog fan state DataFrame
df_st = df[['time','day','studio_motion', 'studio_temp','studio_fan']]
df_do = df[['time','day','dog_motion', 'dog_temp','dog_fan']]

# -------------------------------------------
# separate features/Inputs (X) and targets (classes: y)

# condition analysis (target data, Class)
y_st = df_st['studio_fan']
y_do = df_do['dog_fan']

# independent variables (Features/Input)
X_st = df_st.drop(['studio_fan'], axis = 1)
X_do = df_do.drop(['dog_fan'], axis = 1)

# -------------------------------------------
# split training and test data

X_train_st, X_test_st, y_train_st, y_test_st = train_test_split(X_st, y_st, test_size=0.20, random_state=91)
X_train_do, X_test_do, y_train_do, y_test_do = train_test_split(X_do, y_do, test_size=0.20, random_state=91)        
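
A new sensor reading has to be encoded the same way as the training data before a model can use it. The sketch below applies the same weekday mapping and time conversion to a single reading, in plain Python. The helper name encode_reading is hypothetical, and it assumes the motion flag arrives as a real boolean rather than the 'TRUE'/'FALSE' text found in the sheet.

```python
WEEKDAY_MAP = {'sunday': 0, 'monday': 1, 'tuesday': 2, 'wednesday': 3,
               'thursday': 4, 'friday': 5, 'saturday': 6}

def time_to_seconds(time_str):
    """Convert an HH:MM:SS string to seconds since midnight."""
    hours, minutes, seconds = map(int, time_str.split(':'))
    return hours * 3600 + minutes * 60 + seconds

def encode_reading(time_str, day, motion, temp):
    """Return [time, day, motion, temp] in the column order used for training."""
    return [
        time_to_seconds(time_str),
        WEEKDAY_MAP[day.lower()],
        int(bool(motion)),  # assumes a boolean, not the strings 'TRUE'/'FALSE'
        float(temp),
    ]

print(encode_reading("14:30:00", "Tuesday", True, "27.5"))  # prints: [52200, 2, 1, 27.5]
```

Keeping the column order identical to the training DataFrame matters, because the model only sees positions, not column names, once the row is passed as a plain list.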

Model Selection, Evaluation and Training

# init and train RandomForestClassifier for Studio Fan Model

# Features: time, day, studio_motion, studio_temp
# Data Tuning: class_weight keys are target class labels (0 = fan off, 1 = fan on);
# a higher weight gives that class more importance during training
class_weights = {0: 10, 1: 27}

model_st = RandomForestClassifier(oob_score=True, max_depth=8, random_state=90, class_weight=class_weights)
model_st.fit(X_train_st, y_train_st)

# Calculate estimated OOB (Out Of Bag) score
# The OOB score can serve as a useful estimate of how well your random forest model is likely
# to perform on unseen data without the need for a separate validation set.
# However, it's still a good practice to use additional evaluation techniques
# like cross-validation to assess the model's performance thoroughly.

print(f'Studio Fan OOB SCORE: {model_st.oob_score_}')
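
The OOB score exists because each tree in a random forest is trained on a bootstrap sample: rows drawn at random with replacement. Some rows are never drawn for a given tree, and those "out of bag" rows can be used to test that tree. The plain-Python sketch below (the function name is hypothetical) draws one bootstrap sample and measures how many rows were left out.

```python
import random

def out_of_bag_fraction(n_rows, seed=0):
    """Draw one bootstrap sample and return the share of rows it never picked."""
    rng = random.Random(seed)
    # Sample n_rows indices with replacement, keeping only the distinct ones
    drawn = {rng.randrange(n_rows) for _ in range(n_rows)}
    return 1 - len(drawn) / n_rows

fraction = out_of_bag_fraction(10_000)
print(f"out-of-bag share: {fraction:.3f}")
```

For large datasets the share approaches 1/e, or roughly 37%, so each tree has about a third of the training rows available as unseen test data.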

Tuning and Evaluation

# init and train RandomForestClassifier for Dog Fan Model

# Features: time, day, dog_motion, dog_temp
# Data Tuning: class_weight keys are target class labels (0 = fan off, 1 = fan on);
# a higher weight gives that class more importance during training
class_weights = {0: 30, 1: 20}

model_do = RandomForestClassifier(oob_score=True, max_depth=8, random_state=91, class_weight=class_weights)
model_do.fit(X_train_do, y_train_do)

# Calculate estimated OOB (Out Of Bag) score
# The OOB score can serve as a useful estimate of how well your random forest model is likely
# to perform on unseen data without the need for a separate validation set.
# However, it's still a good practice to use additional evaluation techniques
# like cross-validation to assess the model's performance thoroughly.

print(f'Dog Fan OOB SCORE: {model_do.oob_score_}')
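
In scikit-learn, the keys of a class_weight dictionary are target class labels (here 0 = fan off, 1 = fan on), not feature columns. If hand-picking weights feels arbitrary, one common starting point is the "balanced" heuristic, which scikit-learn documents as n_samples / (n_classes * class_count). The sketch below computes it in plain Python on made-up labels; the function name is hypothetical.

```python
from collections import Counter

def balanced_class_weights(labels):
    """Weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}

# Made-up labels: the fan is off in 8 rows and on in 2 rows
labels = [0] * 8 + [1] * 2
print(balanced_class_weights(labels))  # prints: {0: 0.625, 1: 2.5}
```

The rarer class receives the larger weight, so mistakes on it count for more during training.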

Model Validation

# Calculate Accuracy
from sklearn.metrics import accuracy_score, classification_report

# calculate studio fan state accuracy
y_pred_st = model_st.predict(X_test_st)
accuracy_st = accuracy_score(y_test_st, y_pred_st)
print(f'Studio Fan Accuracy: {accuracy_st}')

# calculate dog fan state accuracy
y_pred_do = model_do.predict(X_test_do)
accuracy_do = accuracy_score(y_test_do, y_pred_do)
print(f'Dog Fan Accuracy: {accuracy_do}')



# -------------------------------------------


# calculate and visualize the confusion matrix
""" A confusion matrix is a table used in machine learning to evaluate the performance of a classification model, showing true positives, true negatives, false positives, and false negatives."""

import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import confusion_matrix

def plot_confusion_matrix(y_test, y_pred, label_name):
  # Generate a confusion matrix
  confusion = confusion_matrix(y_test, y_pred)

  # Calculate percentages for each cell in the confusion matrix
  total_samples = np.sum(confusion)
  confusion_percent = (confusion / total_samples) * 100

  # Plot the confusion matrix with percentages
  plt.figure(figsize=(4, 3))
  sns.heatmap(confusion_percent, annot=True, fmt='.2f', cmap='Blues', cbar=False, square=True)
  plt.xlabel('Predicted Labels : ' + label_name)
  plt.ylabel('Actual Labels')
  plt.title('Confusion Matrix (Percentages)')
  plt.show()


plot_confusion_matrix(y_test_st, y_pred_st, "Studio Fan Model")
plot_confusion_matrix(y_test_do, y_pred_do, "Dog Fan Model")        
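
To show what the four cells of a binary confusion matrix contain, the sketch below counts them by hand in plain Python and derives accuracy, precision and recall. The labels are made up for illustration and the function name is hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true negatives, false positives, false negatives and true positives."""
    tn = fp = fn = tp = 0
    for actual, predicted in zip(y_true, y_pred):
        if actual == 1 and predicted == 1:
            tp += 1
        elif actual == 0 and predicted == 0:
            tn += 1
        elif actual == 0 and predicted == 1:
            fp += 1
        else:
            fn += 1
    return tn, fp, fn, tp

# Made-up labels: 1 = fan on, 0 = fan off
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

tn, fp, fn, tp = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)
precision = tp / (tp + fp)  # of the predicted "on" rows, how many were really on
recall = tp / (tp + fn)     # of the rows that were really on, how many were found
print(tn, fp, fn, tp, accuracy, precision, recall)  # prints: 3 1 1 3 0.75 0.75 0.75
```

For a fan controller, a false negative (fan predicted off while it should be on) may matter more than a false positive, which is one reason to look beyond a single accuracy number.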

Model Download and Deployment

import joblib
from google.colab import files

# Download Studio Fan Model
joblib.dump(model_st, "studio_fan_model.joblib")
files.download("studio_fan_model.joblib")

# Download Dog Fan Model
joblib.dump(model_do, "dog_fan_model.joblib")
files.download("dog_fan_model.joblib")


With our model trained and downloaded, we can now deploy our model to our ML API service.

  • Download the ML API Service (Python): api-code-link. Please read the README file for full installation instructions.
  • Upload the downloaded models into the "models" folder.
  • Deploy the ML API Service. The API service code provided requires FastAPI; please read the README file for full installation instructions or visit https://fastapi.tiangolo.com/
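
The core of such a prediction service can be sketched without any web framework. The example below is not the code from the API service linked above: the function name, the response layout and the stand-in model classes are assumptions. It relies only on the fact that a loaded model exposes a predict method that takes a list of feature rows.

```python
class AlwaysOn:
    """Stand-in for a loaded model; a real one would come from joblib.load."""
    def predict(self, rows):
        return [1 for _ in rows]

class AlwaysOff:
    """Second stand-in model that always predicts 'off'."""
    def predict(self, rows):
        return [0 for _ in rows]

def predict_fan_states(models, features):
    """Run each room's model on its feature row and return on/off strings."""
    result = {}
    for room, model in models.items():
        prediction = model.predict([features[room]])[0]
        result[f"{room}_fan"] = "on" if int(prediction) == 1 else "off"
    return result

models = {"studio": AlwaysOff(), "dog": AlwaysOn()}
# Feature rows in training order: time (seconds), day, motion, temperature
features = {"studio": [52200, 2, 0, 24.0], "dog": [52200, 2, 1, 29.5]}
print(predict_fan_states(models, features))  # prints: {'studio_fan': 'off', 'dog_fan': 'on'}
```

In a FastAPI service, a function like this would typically be called from a route handler, with the models loaded once at startup rather than on every request.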


Running the Simple MLoT System

uvicorn main:app --reload        

Simulator Device when Connected & Running
Swagger API interface for ML API Service
When using Simulator, this API can be used to set system defaults.
This is the API accessed by the ESP-32 whenever a sensor value changes


Conclusion

The application of Machine Learning in IoT is revolutionizing the smart-things industry. The ability to use data-driven insights to predict future trends, behaviors and outcomes, and so give customers fine-tuned, personalized controls and recommendations, is a game changer that can improve customers' lives. As IoT continues to evolve, the synergy between ML and IoT technologies will play a pivotal role in shaping a smarter, more connected world.

It is worth noting that various challenges remain, such as data privacy and security, model development and deployment, and real-time performance. However, solving these issues unlocks the full potential of MLoT.

The possibilities for future innovation are endless, with applications in smart homes, smart cities, smart manufacturing, smart industry, smart healthcare, and more.

A world where devices and ML systems work together seamlessly could be of great benefit to society.




