59 Machine learning v curiosity
Machine learning in the water industry

The image above was used by Ofwat to represent innovation in the UK water industry. If you look carefully you will notice that none of the gears mesh together properly so the machine won’t work. Maybe not what they intended.

I am finally getting round to reading the book “A Strategic Digital Transformation for the Water Industry”, published last year by the International Water Association. It is available at this link. Well done to the editors Oliver Grievson, Timothy Holloway and Bruce Johnson for putting together such a comprehensive review of what is and should be being done to transform the water industry. If anything it is too comprehensive at 119 pages and I am still ploughing my way through it. However, one figure jumped out at me and linked in with some work that I did recently, hence this blog. There may be more blogs on other ideas from the book in the future.

Analysing treatment works flow records

Back in Episode 22 of the blog I talked about analysing compliance with required Flow to Full Treatment (FFT) and referenced a good article in The Guardian on a citizen science approach to this. The IWA book showed another example (see the figure below).

[Image: Figure 2.5 from the IWA book]

The book states: “Figure 2.5 shows four clear peaks which grow over a period of three years showing the gradually worsening infiltration into the sewer environment.”

This is true, but there is actually a lot of non-artificial intelligence in extracting that information from the graph, and even then we are making some significant assumptions to state that the cause is worsening infiltration. Even if it is, we do not know if it is due to deteriorating condition of the sewers or climate change increasing the level of the water table.

To be fair to the editors, they do say that this is a simple example of the use of data rather than an example of digital transformation, and they qualify the assumptions: “With additional information, such as rainfall, geology, and the performance of the system, this can be used to predict where the source of the problems within the sewer network is.”

We could perhaps use machine learning or artificial intelligence to provide those insights for us. Or can we just use human curiosity?

I recently did some analysis of treatment works flow records that provided some additional insight into what is going on, but using some simple data hacks rather than complex data science.

The data

I started with five years of daily flow data, similar to that shown above. I also had daily rainfall depths for the same period. It was a bit less obvious what the patterns were, as you can see from the graph below.

[Image: graph of the daily data]

I found that there was too much noise in the daily data to easily make sense of it, so I did my investigations by aggregating to monthly data.

Baseflow

The first thing that I looked at was the average dry weather flow through the treatment works. The official definition of this in the UK is the daily flow that is exceeded for 80% of the time. That is, we ignore the lowest 20% of values, as they may be due to monitor error or operational issues, but we take the lowest of the rest of the values as the dry weather flow. This is a really useful definition as it doesn’t actually involve proving that it is dry weather by matching up rainfall data.

This calculation gave a DWF of 93.5.
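The percentile definition can be sketched in a few lines of Python. This is only an illustration of the idea, not the regulator's exact procedure: the function name `q80_dry_weather_flow` is made up here, and it assumes a plain list of daily flow totals with gaps and bad readings already removed.

```python
def q80_dry_weather_flow(daily_flows):
    """Return the flow that is exceeded on roughly 80% of days.

    Assumption: daily_flows is a non-empty list of daily flow totals
    in consistent units.
    """
    if not daily_flows:
        raise ValueError("need at least one daily flow value")
    ordered = sorted(daily_flows)
    # Ignore the lowest 20% of values, then take the lowest of the rest.
    skip = len(ordered) // 5
    return ordered[skip]


# Made-up example values, not data from the analysis described here.
example = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
print(q80_dry_weather_flow(example))  # prints 3 (the values 1 and 2 are ignored)
```

No rainfall data is needed anywhere in this calculation, which is the attraction of the definition.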

Seasonal variation

A trick that I have used for a long time to help understand infiltration into sewerage systems is to do this same calculation for each month rather than over the whole year. This shows if there is a seasonal variation in baseflow. If there is no variation then the average of the monthly values is the same as the annual value. If there is seasonal variation then the average of the monthly values is higher than the annual value.
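A rough sketch of the month-by-month version, again in plain Python. The helper names `q80` and `monthly_q80` and the record layout of `(date, flow)` pairs are assumptions made for this example, and the numbers are invented to show the effect.

```python
from collections import defaultdict
from datetime import date


def q80(values):
    """Lowest value left after ignoring the lowest 20% of values."""
    ordered = sorted(values)
    return ordered[len(ordered) // 5]


def monthly_q80(records):
    """Group (day, flow) pairs by calendar month and take the Q80 of each.

    Assumption: records is an iterable of (datetime.date, flow) pairs.
    Returns a dict keyed by (year, month).
    """
    by_month = defaultdict(list)
    for day, flow in records:
        by_month[(day.year, day.month)].append(flow)
    return {month: q80(flows) for month, flows in sorted(by_month.items())}


# Made-up records: ten days of higher winter flow, ten of lower summer flow.
records = [(date(2023, 1, d), 100 + d) for d in range(1, 11)]
records += [(date(2023, 7, d), 50 + d) for d in range(1, 11)]

monthly = monthly_q80(records)                 # one value per month
annual = q80([flow for _, flow in records])    # single value for all the data
mean_of_monthly = sum(monthly.values()) / len(monthly)
print(monthly, annual, mean_of_monthly)
```

With seasonal data like this invented set, the mean of the monthly values comes out above the single annual figure, which is the signal to look for.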

The results for this analysis are shown in the graph below compared to the straight line value from the annual analysis. This shows a large seasonal difference in baseflow, suggesting that the catchment suffers from a lot of infiltration in the wet months of the year.

[Image: graph of monthly baseflow compared with the annual value]

I then assume that the lowest monthly value of baseflow is the true base wastewater flow and that the rest is infiltration. This gives the graph below.

[Image: graph of base wastewater flow and infiltration by month]

Beware that this assumption that the lowest baseflow value is the true wastewater flow may not be correct. There could be an operational issue affecting the month with the lowest value, although in this example there are other months with very similar values so that is unlikely. Also those dry months could be exhibiting exfiltration with some of the wastewater leaking out of the sewers (see Episode 55). I will ignore that issue for now as it is almost impossible to take it into account.
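The split between base wastewater flow and infiltration is simple arithmetic, sketched below. The function name `split_infiltration` and the month labels are hypothetical, and the values are invented. It carries the same caveat as the text: it treats the lowest monthly baseflow as the true wastewater flow.

```python
def split_infiltration(monthly_baseflow):
    """Split monthly baseflow into a constant foul flow plus infiltration.

    Assumption: the lowest monthly baseflow is the true base wastewater
    flow, so everything above it in other months is infiltration.
    monthly_baseflow maps a month label to that month's baseflow.
    """
    foul_flow = min(monthly_baseflow.values())
    infiltration = {
        month: baseflow - foul_flow
        for month, baseflow in monthly_baseflow.items()
    }
    return foul_flow, infiltration


# Made-up values for illustration only.
foul, infiltration = split_infiltration({"Jan": 150.0, "Apr": 110.0, "Jul": 90.0})
print(foul, infiltration)  # prints 90.0 {'Jan': 60.0, 'Apr': 20.0, 'Jul': 0.0}
```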

Peak flow – slow response

The official definition of the peak flow that a wastewater treatment works has to be able to process is based on a completely different definition of flow in dry weather. This is the maximum recorded flow on a dry day (less than 0.25 mm of rainfall) following another dry day. (It would be so much easier if this also used a percentile definition but with a much higher percentile than for the DWF.)

As I do have the rainfall data for this location I can calculate this maximum dry day flow and add it to the baseflow. This is shown as green in the graph below. This is not direct runoff as it occurs on a dry day, but it is not base infiltration as it is not included in the baseflow. This is slow response to rainfall on previous days.

[Image: graph with the slow response component added in green]
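The dry day rule can be sketched as follows. The record layout of `(date, flow, rainfall in mm)` and the name `max_dry_day_flow` are assumptions for this example, and the data is invented. The 0.25 mm threshold is the one quoted above. A day with no record for the previous day is skipped, which is a choice made here rather than part of the official definition.

```python
from datetime import date, timedelta

DRY_DAY_RAIN_MM = 0.25  # threshold quoted in the text


def max_dry_day_flow(records):
    """Maximum flow on a dry day that follows another dry day.

    Assumption: records is an iterable of (day, flow, rain_mm) tuples,
    one per day. Returns None if no day qualifies.
    """
    records = list(records)
    rain_by_day = {day: rain for day, _flow, rain in records}
    dry_flows = []
    for day, flow, rain in records:
        previous_rain = rain_by_day.get(day - timedelta(days=1))
        if previous_rain is None:
            continue  # no rainfall record for the day before
        if rain < DRY_DAY_RAIN_MM and previous_rain < DRY_DAY_RAIN_MM:
            dry_flows.append(flow)
    return max(dry_flows) if dry_flows else None


# Made-up records for illustration only.
example = [
    (date(2023, 3, 1), 100.0, 0.0),  # no previous day on record: skipped
    (date(2023, 3, 2), 120.0, 0.0),  # dry after dry: counts
    (date(2023, 3, 3), 300.0, 5.0),  # wet day: excluded
    (date(2023, 3, 4), 200.0, 0.0),  # dry, but follows a wet day: excluded
    (date(2023, 3, 5), 130.0, 0.1),  # dry after dry: counts
]
print(max_dry_day_flow(example))  # prints 130.0
```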

Peak flow – fast runoff

The final component is to show the maximum flow through the works including wet days. This gives the pink area in the graph below. This does not add very much to the peak flow. This is partly because this catchment is notionally largely separate with limited areas contributing runoff, and also perhaps because there may be overflows and storm tanks lopping off the short term peak response to rainfall.

[Image: graph with the fast runoff component added in pink]

The overall results give a snapshot of the response of the system.

Foul flow: 35%

Seasonal infiltration: 35% (significant issue)

Slow response: 15% (less of a problem than infiltration)

Fast response: 15% (largely separate system)
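The post does not spell out how these percentages were derived, so the following is only one plausible way to get shares of this kind. It assumes the four components are stacked month by month, summed over all months, and divided by the total. The function name `component_shares` and the input numbers are invented for illustration.

```python
def component_shares(foul_flow, monthly_baseflow, monthly_dry_peak, monthly_peak):
    """Percentage share of each stacked flow component over all months.

    Assumption: each argument after foul_flow is a dict keyed by the
    same month labels. Layers are clamped so none can go negative.
    """
    totals = {
        "foul flow": 0.0,
        "seasonal infiltration": 0.0,
        "slow response": 0.0,
        "fast response": 0.0,
    }
    for month, peak in monthly_peak.items():
        base = monthly_baseflow[month]
        dry_peak = max(monthly_dry_peak[month], base)
        totals["foul flow"] += foul_flow
        totals["seasonal infiltration"] += base - foul_flow
        totals["slow response"] += dry_peak - base
        totals["fast response"] += max(peak - dry_peak, 0.0)
    grand_total = sum(totals.values())
    return {name: 100.0 * value / grand_total for name, value in totals.items()}


# Made-up single month, not the data behind the figures above.
shares = component_shares(
    foul_flow=40.0,
    monthly_baseflow={"Jan": 60.0},
    monthly_dry_peak={"Jan": 80.0},
    monthly_peak={"Jan": 100.0},
)
print(shares)
```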

Conclusions

The results of the citizen science project reported by The Guardian and the analysis that I have set out here show that there is a lot that we can understand about the operation of our drainage and wastewater systems by relatively simple analysis of existing data.

Neither of these used machine learning or artificial intelligence, just non-artificial curiosity. Can we give water company staff the time and the incentives to be more curious?

Anthony Fernihough

Associate Director at AtkinsRéalis

1 year ago

George Walter Clapp - this may be of interest

Leo Kiernan

turning data into actionable insight

1 year ago

Domain specific knowledge is hard won and invaluable.

Andrew Scott

Co-owner and developer at Meteor Communications (Europe) Ltd

1 year ago

Just for context, the latest Deep Learning Neural Network that I have built (on my budget) has 123 million 'Neurons', the human brain is thought to have 86 billion Neurons, plus Neuroscience is still rapidly developing and feeding into the development of Data Science. Curious humans will be required for a long time yet, at least until Quantum computers become mainstream.

David Brydon

Trade effluent consultancy

1 year ago

As someone who has a long history with "AI" in the water industry I can confirm that there's a lot of guff talked about the subject. Martin's blog hits the nail on the head. When I first started using neural networks (1993), I was told that 75% of the effort in developing them was basic data analysis, 15% was setting up the software (it was much more laborious then), 5% developing the models and 5% testing them. That's still more or less true. Many times I go to develop a model and find the answer in the basic data analysis. The term "AI" first fell out of fashion in the 1960s when everyone realised the models simply had no intelligence. That killed research into neural networks for many, many years. We're going through the same hype cycle now but this time the models are more complex and the hype will probably (hopefully) regulate the "AI" world rather than kill it. There's always been a tendency to throw a machine learning model at some data and hope for the best but the really important bit is that last 5% - testing the models to make sure they can generalise accurately. If they can't then you have a useless model, which brings us back to making sure the basic data analysis is correct in the first place.
