Exploring MLOps Productivity & Performance - Data Digest July Edition
Snapshot
In this month’s edition:
“How I wrote a Machine Learning function using GPT4 that cut my MLOps time in HALF”
In my role as Chief Data Scientist for Spiral Data Group, I work with massive time series data sets that are used to identify and remediate issues, and to innovate, within a defined use case. A perfect example is transient detection across a water utility's vast pipe network, where identifiable data signals are grouped so that ML functions can identify and foresee transient issues.
But how do we account within the ML functions for scope extension and the new characteristics that come with it, such as location changes from a metro to a regional area or seasonal differences in a network?
Knowing that the scale and complexity of our datasets will increase over time as our network relationships mature, it was clear we needed to improve the ML model used for transient identification.
This is when I turned to GPT4 and its suite of plugins. Could its large multimodal model get me closer to a solution more quickly? Spoiler alert… It can, and the productivity benefits are incredible.
“The cost to run real-world validation tests becomes exponential”
The many benefits of Virtual Asset Testing using a Digital Twin
Considering the volume of grey water passing through a pipe network, real-world sewer blockages are rare. They’re rare enough to make ML ground truth hard to come by. No matter: we've created our own virtual asset environment, a synthetic sewer network simulation replete with pipes, pumps, sumps, sensors, 'water' flow and… blockages.
Lots of them.
These blockages give us the data we need to establish ground truths for different blockage characteristics, thereby improving our ML model. We can readily adjust the simulation to mimic different networks and scenarios.
If you need to validate a hypothesis, run a proof of concept or investigate a new type of blockage, then Virtual Asset Testing via a Digital Twin provides a go-to synthetic data test bed.
Our Chief Data Scientist, Ram, muses:
“As algorithms proliferate through the water industry, the cost to run real-world validation tests becomes exponential. Our Digital Twin enables an initial feasibility assessment at low cost, simulates real-world anomalies and offers fast turnaround times to inform a subsequent go/no-go implementation decision.”
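To make the idea of synthetic ground truth concrete, here is a minimal sketch of how a simulated sensor trace with labelled blockages could be generated. It is a toy illustration only, not Spiral Data's Digital Twin: the function name `simulate_sewer_level`, the single level sensor, the linear "backing up" of the water level and every numeric parameter are assumptions made for this example.

```python
import random


def simulate_sewer_level(n_steps=500, n_blockages=3, blockage_len=40, seed=0):
    """Return (levels, labels) for one hypothetical level sensor.

    levels: simulated water level readings (arbitrary units).
    labels: 1 where a synthetic blockage is active, else 0 (the ground truth).
    All values here are illustrative assumptions, not real network data.
    """
    segment = n_steps // n_blockages
    if segment < blockage_len:
        raise ValueError("series too short for the requested blockages")

    rng = random.Random(seed)
    # Baseline level of 1.0 with a little Gaussian sensor noise.
    levels = [1.0 + rng.gauss(0.0, 0.05) for _ in range(n_steps)]
    labels = [0] * n_steps

    # One blockage per equal-sized segment, so events never overlap.
    for i in range(n_blockages):
        start = i * segment + rng.randint(0, segment - blockage_len)
        for offset in range(blockage_len):
            # Assumed behaviour: the level rises linearly while blocked.
            levels[start + offset] += 0.5 * (offset + 1) / blockage_len
            labels[start + offset] = 1
    return levels, labels


levels, labels = simulate_sewer_level()
print(sum(labels), "of", len(labels), "samples are labelled as blockage")
```

Because the blockages are injected deliberately, every sample comes with a known label, which is what makes a simulated network useful for training and validating a detection model when real events are scarce. Changing the parameters is a stand-in for adjusting the simulation to mimic a different network or scenario.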
AI regulation needs the lens of enablement, not disruption
Federal government’s intention to regulate AI may use risk classification
A number of countries, including the US, EU nations, the UK and Canada, are ahead of Australia in their legislation or regulation of AI. This ‘fast-follower’ approach by Australia could provide an opportunity for sensible regulatory requirements, but it will need industry consultation to ensure that a ‘risk-based’ approach doesn’t smother innovation.
From our experience, the goal should be to enable scenarios such as continuity of critical infrastructure.
Switching From Cloud to Briefcase
Spiral Data’s feature-rich IoT to AIML Platform is now available offline
This month marks the arrival of the ‘Briefcase’ (Defence-level security compliant, fully isolated environment, offline-only) version of our IoT to AIML Platform, following several years of cloud-based extensibility.
Deployed via the rugged AWS Snowball server, it’s ideal for use cases where connectivity is unreliable or network isolation is paramount, bringing IoT to AIML processing far closer to the ‘edge’. Remote areas, highly sensitive deployments and open sea environments can take advantage of the same MLOps feature stack as those connecting to the Cloud.
Snippets & Shortcuts
Our team’s pick of the talking points from the Industry 4.0 headlines:
Realising the massive potential of MLOps in your organisation?
Like what you’ve read or keen to see something else?