Real-Time Predictions of Black Swan Events using ChatGPT, Transactional Machine Learning (TML), and Apache Kafka
Sebastian Maurice, Ph.D.
Global AI and Machine Learning Leader | Teacher | Inventor | Author | Blogger | Coder
It is a wet and dark afternoon here in Toronto, which makes it a perfect time to write a blog post. Logan Mintz recently asked me about predicting the "unpredictable" without knowing what you want to predict. This was an intriguing question, thanks Logan, and I could not stop thinking about it.
There has been lots of work on predicting black swan events. Many of us have probably read Nassim Taleb's book The Black Swan. This book made us aware of the unpredictable nature of events like earthquakes, wildfires, pandemics, etc. that are extremely rare but highly impactful when they occur - what we call "tail events" (of some distribution). Machine learning (ML) is not good at predicting black swan events, for the simple reason that ML relies on historical data to predict the future, and a black swan event has little or no historical precedent to learn from.
There are other recent methods for black swan prediction, such as work from Stanford University researchers and recent research from scientists in the US who use Bayesian methods to predict the unpredictable. For me, these methods advance our thinking about how AI and unsupervised methods can be used to predict the unpredictable. One issue I have with these methods is that they lack a real-time component and are complex to implement. Specifically, the question I asked myself is: "Is there a simpler way to determine or predict unpredictable events in real time, as the data is being generated, using TML, Kafka, and ChatGPT?"
The answer is maybe. One possible way to gain further insight into unpredictable events in real time is to use TML, Kafka, and ChatGPT. The method is fairly simple and runs in real time. Here is how. Let's take a healthcare use case: data from healthcare organizations on patients' health readings, specifically blood pressure, heart rate, body temperature, oxygen levels, and body weight. WE DO NOT KNOW WHAT WE ARE LOOKING FOR IN THE DATA, SO WE CANNOT USE MACHINE LEARNING FOR THE ANSWER (whatever that is). It may be that there is some disease that is trending in the population. The process is as follows:
The solution architecture is shown below; the actual implementation of it is shown in the dashboard below. While this example uses healthcare data, exactly the same methodology can be applied to almost anything: sports, finance, gambling, IoT, manufacturing, etc.
The dashboard below implements the architecture. We streamed and processed data in STEPS 1-5. We then sent all of this processed data to ChatGPT to analyse and asked it (STEP 6): "Is there a health issue in the population? If so, what are the issues? What city do majority of the patients live in? How many males with issues? how many females with issues? What languages do they speak? Is there a disease trend? How many total patients analysed? Is the trend increasing or decreasing?"
CHATGPT RESPONSE: Based on the information given, it appears that there is a health issue in the population being monitored. The issues seem to be related to respiratory rate and diastolic blood pressure. The majority of the patients appear to live in Miami, and there seem to be more males with issues than females. The languages spoken by the patients include English, Spanish, and French. There does not seem to be a disease trend, but rather a general health issue in the population. It is not stated how many total patients are being analysed, but the trend does not seem to be increasing or decreasing.
Pretty COOL!
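The STEP 6 exchange above could be sketched roughly as follows. This is a minimal illustration, not the actual TML implementation: the function names `build_prompt` and `ask`, the prompt wording, and the shape of the grouped data are all assumptions made for this example. The model client is passed in as a plain callable (`llm`) so that no particular ChatGPT client API is assumed.

```python
import json

# Opening of the question quoted above; the full question can be substituted.
QUESTION = "Is there a health issue in the population? If so, what are the issues?"


def build_prompt(grouped, question):
    """Serialize the grouped, preprocessed data and append the question.

    `grouped` is assumed to be a JSON-serializable dict, for example
    {"Miami": [{"patient_id": "p1", "heart_rate": 90}]}.
    """
    payload = json.dumps(grouped, sort_keys=True)
    return (
        "Here is preprocessed patient data, grouped by city:\n"
        f"{payload}\n\nQuestion: {question}"
    )


def ask(grouped, question, llm):
    """Send the prompt to `llm`, a hypothetical callable: prompt str -> response str."""
    return llm(build_prompt(grouped, question))


if __name__ == "__main__":
    sample = {"Miami": [{"patient_id": "p1", "heart_rate": 90}]}
    # A stand-in "model" that only reports the prompt size, for a dry run.
    print(ask(sample, QUESTION, lambda prompt: f"received {len(prompt)} characters"))
```

Keeping the model behind a callable also makes it easy to dry-run the pipeline with a stub before spending tokens on a real model.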
Now, this is not meant to solve or oversimplify the very complex quest of predicting Black Swan events; on the contrary. We are taking the complex problem of predicting Black Swan events and breaking it down into smaller components (patients), using a process that involves: 1. Working in real time, 2. Processing data from individual patients, 3. Grouping the preprocessed patients' data, 4. Sending the grouped data to AI (ChatGPT) to further analyse the "preprocessed" data, 5. Visualizing the results and repeating the process.
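To make components 2 and 3 concrete, here is a small sketch of per-patient processing followed by grouping. It is only an illustration under stated assumptions: the field names, the window size, the use of a simple average, and the functions `process_stream` and `group_by` are hypothetical, and an in-memory list stands in for the Kafka topic that would feed the loop in a real deployment.

```python
from collections import defaultdict, deque
from statistics import mean

# Illustrative field names for a subset of the readings mentioned in this post.
VITALS = ("heart_rate", "body_temp", "oxygen")


def process_stream(readings, window=3):
    """Keep a sliding window of recent readings per patient and average each vital."""
    windows = defaultdict(lambda: deque(maxlen=window))
    for reading in readings:  # stand-in for consuming messages from a stream
        windows[reading["patient_id"]].append(reading)
    return {
        pid: {v: round(mean([r[v] for r in win]), 2) for v in VITALS}
        for pid, win in windows.items()
    }


def group_by(processed, patients, key):
    """Group the per-patient summaries by an attribute such as city."""
    groups = defaultdict(list)
    for pid, summary in processed.items():
        groups[patients[pid][key]].append({"patient_id": pid, **summary})
    return dict(groups)


if __name__ == "__main__":
    readings = [
        {"patient_id": "p1", "heart_rate": hr, "body_temp": 37.0, "oxygen": 98}
        for hr in (70, 80, 90, 100)
    ]
    patients = {"p1": {"city": "Miami"}}
    print(group_by(process_stream(readings), patients, "city"))
```

The grouped output is what would then be sent to the AI in component 4. Only the most recent `window` readings per patient are kept, so memory use stays bounded as the stream grows.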
NOTE: The responses from ChatGPT can be grouped into a "SUPER" group, and this Super Group of responses can be re-sent to ChatGPT to summarize.
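One way the "SUPER" group idea might look in code is sketched below. The function name, the batch size, and the summarization prompt are assumptions for illustration. Repeating the summarization over several levels is also my extension of the note above, which only describes re-sending the grouped responses once. As before, `llm` is a hypothetical callable that takes a prompt string and returns a response string.

```python
def summarize_super_group(responses, llm, batch_size=5):
    """Summarize responses in batches, repeating until one summary remains."""
    if not responses:
        raise ValueError("need at least one response to summarize")
    if batch_size < 2:
        raise ValueError("batch_size must be at least 2 so each pass shrinks the list")
    level = list(responses)
    while len(level) > 1:
        # Each pass replaces every batch of responses with a single summary.
        level = [
            llm("Summarize these analyses:\n" + "\n---\n".join(level[i:i + batch_size]))
            for i in range(0, len(level), batch_size)
        ]
    return level[0]


if __name__ == "__main__":
    # A stand-in "model" that just reports how much text it was given.
    fake_llm = lambda prompt: f"summary of {len(prompt)} characters"
    print(summarize_super_group(["response A", "response B", "response C"], fake_llm, batch_size=2))
```

Batching keeps each request small, which matters when the number of responses grows beyond what fits comfortably in a single prompt.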
Advances in real-time processing of streaming data with Kafka, TML, and Generative AI are opening up tremendous potential for large-scale, in-memory data processing that is low-cost and high-value, for scenarios and problems that have been VERY hard to solve, let alone to find further insights into the problem's solution set.
Till next time.