AI and traditional forecasting for water utilities
(Header image: "Timeseries and Water", generated with MidJourney)


This article explores the differences between AI and traditional forecasting methods, highlighting their applications and effectiveness in various fields. We'll delve into personal experiences with AI, examine the current state of the technology, and discuss its potential impact on industries such as water distribution.

My history with AI

During my master's program, I was introduced to machine learning techniques for control systems (systems that manage and regulate the behaviour of other devices or processes). At first, the idea was vague to me, and questions came up: "Do we no longer have to fully model the system?" and "Can it handle situations that traditional controllers are not tuned for?" These are control-systems questions, so if they're unfamiliar, don't worry; they are not the topic here. For a lazy control engineer, however, it was very interesting.

After my studies, I started working for a pipeline inspection company with vast amounts of data ready to be used for training computer vision models. Since the product was labelled NDT (non-destructive testing: inspecting materials without damaging them) data, it was an ideal case for deep learning. Many deep learning models are supervised learning models, which means you give the model the available input data together with the desired output (the labels). The model then iterates over many learning passes (epochs) to fit the weights that map input to output. At the time, however, I couldn't work on it; it would also have made a great startup idea.
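The supervised learning loop described above can be sketched in a few lines. The snippet below is a toy illustration only (synthetic data and plain gradient descent, nothing from the actual NDT project): over many epochs, the weights are nudged until the model's output matches the labels.

```python
import numpy as np

# Toy supervised learning: fit weights w so that X @ w reproduces the labels y.
# The data is synthetic and purely illustrative.
rng = np.random.default_rng(42)
X = rng.normal(size=(100, 3))           # input data (100 samples, 3 features)
true_w = np.array([1.5, -2.0, 0.5])     # "ground truth" used to make labels
y = X @ true_w                          # desired output (the labels)

w = np.zeros(3)                         # model weights, initially zero
lr = 0.1                                # learning rate
for epoch in range(200):                # each epoch is one pass over the data
    pred = X @ w                        # current model output
    grad = X.T @ (pred - y) / len(y)    # gradient of the mean squared error
    w -= lr * grad                      # move the weights toward the labels

print(np.round(w, 2))                   # weights converge toward true_w
```

Real deep learning models do the same thing at scale: many more weights, non-linear layers, and far more data, but still the epoch-by-epoch fitting of inputs to labels.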

Luckily, Acquaint later gave me the opportunity to implement these ideas, and by then it was a bit easier to build such models (TensorFlow 1.0.0 was almost available). Back then, we called it deep learning, but it was also referred to as AI.

Fast-forward to today, and it's much easier to create these models and add real value. The definitions of AI and machine learning are now (wrongly) treated as separate things, with AI typically referring to LLMs such as ChatGPT. That is a glimpse of my history with AI and machine learning.

The field of AI

In the figure below, the definitions of different levels of AI are given:


The field of Artificial Intelligence

The field of AI:

  • Artificial Intelligence (AI) is a broad and fascinating field focused on creating intelligent machines. Intelligence is a debatable term, but that's not the topic of this article.
  • Within AI, there's Machine Learning (ML), which focuses on recognizing patterns in data. Once we spot a pattern, we can apply it to new situations, making ML incredibly powerful. We'll explore this idea more soon.
  • Deep Learning, a branch of ML, deals with unstructured data like text and images. It uses artificial neural networks inspired by our brains to make sense of this data.
  • Large Language Models (LLMs) focus on understanding and generating text. Let's dive into how they work and what makes them so intriguing.

And let's check what ChatGPT has to say about the difference between AI and machine learning:

Artificial Intelligence (AI) is a broad field focused on creating systems that can perform tasks requiring human-like intelligence. Machine Learning (ML) is a subset of AI that involves developing algorithms to recognize patterns in data and make predictions or decisions based on that data. While AI encompasses a wide range of technologies and approaches, ML specifically relies on data-driven learning to improve over time. Essentially, AI aims to simulate human cognitive functions, and ML is one way to achieve this by allowing systems to learn from experience. In summary, AI is the overall goal of creating intelligent machines, and ML is a key method for achieving that goal through pattern recognition and data analysis.

Ok, that sounds interesting: "AI aims to simulate human cognitive functions, and ML is one way to achieve this by allowing systems to learn from experience."

I am not sure that makes the difference any clearer, so I will give my two cents:

Since AI is the general field, the difference from machine learning is insignificant: whether you are building linear regression models in a bachelor's program or working at Google DeepMind on the newest LLM, you are practising in the field of AI. The applications, however, are very different, and each has its own use. Often, the simplest solution is the most elegant (like a linear regression model), as long as it does the job correctly.

The cool thing is that you can ask an LLM to build a linear regression model for you. We've reached a critical point with tools like ChatGPT: they can make your life easier by handling simple tasks, but they can also erode your understanding of how to build models yourself, and we end up with prompt engineers instead of seasoned data scientists. In my opinion, it's essential to understand the various techniques in the AI spectrum and apply them appropriately.

The water industry

Let me link this to the water industry. In water distribution today, there is an abundance of time series data available, such as:

  • sensor data
  • pipe failures
  • maintenance reports
  • etc.

We also consider pipe failures and maintenance reports to be time series data; they may be infrequent, but they can be combined with the rest. Currently, HULO mainly uses sensor data and time series forecasting to understand what happens in the distribution network, without relying on network models, as these are often incorrect or unavailable. For leak detection we use various degrees of machine learning, but not LLMs, as they are unsuitable for our needs. Deep learning models allow us to interpret data from hundreds of sensors, while using relatively simple models where possible also helps us deliver sustainable solutions for water utilities. Fortunately, we have strong Green Software advocates on our team.

In traditional forecasting, we usually forecast sensor data or water demand. This is typically done with linear regression models or their variations. These models can provide good results, but they often struggle with new events or data not included in the training set (i.e. they extrapolate poorly).
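As an illustration of that weakness, here is a minimal sketch on synthetic demand data (not a real utility's): a linear regression with daily sin/cos features fits the regular pattern well, but an unmodelled event, such as a heatwave doubling demand, produces large systematic errors.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic hourly demand: a daily sine pattern plus noise (illustrative only).
rng = np.random.default_rng(0)
hours = np.arange(24 * 14)                        # two weeks of hourly data
demand = 100 + 20 * np.sin(2 * np.pi * hours / 24) + rng.normal(0, 2, hours.size)

# Classic regression trick: encode the daily cycle as sin/cos features.
X = np.column_stack([np.sin(2 * np.pi * hours / 24),
                     np.cos(2 * np.pi * hours / 24)])
model = LinearRegression().fit(X, demand)

# In-sample, the regular pattern is captured almost perfectly...
print(model.score(X, demand))                     # R^2 close to 1

# ...but a hypothetical heatwave day with doubled demand was never in the
# training set, so the model keeps predicting the usual pattern:
heatwave = demand[:24] * 2
resid = heatwave - model.predict(X[:24])
print(resid.mean())                               # large systematic error
```

Used naively for leak detection, every hour of such an event would look like a leak-sized anomaly.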

At this point, it's important to revisit the difference discussed earlier. More advanced models can generalize better to unseen data, which is beneficial in unexpected situations such as hot weather or sports events.

Below is a figure comparing traditional forecasting to one of HULO's models. You can see that a more complex model with multivariate inputs (HULO) gives much better results than a traditional forecast model (ARIMA). ARIMA models work well for predicting water demand at production locations, but the events after 18:00 and 21:00 would cause many false positives for any water utility using such a model for leak detection.


HULO multivariate model compared to a more traditional ARIMA model

Final remarks

Finally, I would like to discuss the possibilities of AI, especially LLMs, for water utilities. LLMs offer excellent capabilities for preserving the knowledge of the retiring generation and for giving less technical people an interface to complex systems and data. There are examples where you can ask questions about hydraulic simulations, such as SWAN's lighthouse. However, I also want to highlight that many of these new interfaces are just slightly retrained ChatGPT models and do not consistently deliver the results you expect, as each water utility is unique.

As we explore these technologies, we must remain realistic about their limitations and potential. The excitement surrounding AI and LLMs is part of a more significant trend that follows the technology hype curve. This curve illustrates the various stages of public perception and adoption of new technologies.

Hype cycle of emerging technologies, 2023, source: Gartner (August 2023)

It is crucial to keep discussing data to gain knowledge, as we are currently at the peak of the hype curve for generative AI (e.g., ChatGPT). The real value is not achieved now but at later stages, as we soon enter the valley of disillusionment and we all get tired of AI (at least, that is my expectation). To reach the point of real value, we must consider the challenges we want to overcome or the outcomes we seek as an industry (such as preserving knowledge or reducing water losses). But, just as when I graduated, it all starts with handling data correctly to gain value from technologies like generative AI.

So, my takeaway for any utility is to preserve data correctly and label it accordingly, so that future AI models (possibly from HULO) can add real value to tomorrow's challenges. Thank you for reading, and I look forward to your responses to the blog and your views on how we can best use today's technologies.

Adi Lev-Tov

Freelance at Complex Change

7 months ago

As you rightly write: "The real value is not achieved now but at later stages, as we soon enter the valley of disillusionment and we all get AI tired (at least that is my expectation). To reach the point of real value, we must consider the challenges we want to overcome or the outcomes we seek as an industry (such as preserving knowledge or reducing water losses)." So if this technology is not ready yet, as you write, how can we introduce an unfinished algorithm in water? CrowdStrike was an excellent example and reminder of what happens when we release untested technology and test it in the field.
