Everything you ever wanted to know about Big Data, Analytics, IoT (and Security)

(but were afraid to ask)

Recently I was asked to explain Big Data and Analytics to some kids at a coding/technology group I am involved in (CoderDojo).

I used the following real-world example from my own home. It is a simple explanation of what Big Data is, why it's different from what we had before (small data, if you like), why analytics are important, and what the potential dangers might be.

Consider the humble electricity meter. In the olden days - well, OK, this is still the current situation for nearly everyone - you have something like this connected inline to your electricity supply to your home.

Once per billing period, a nice person from the electricity vendor comes to your home and reads your meter. In my area, the billing period is 2 months. So once per 2 months you get a bill. This bill tells you how much electricity you have used in the previous 2 months.

Well, actually, also in my area, the electricity company doesn't bother reading your meter once per 2 months. They actually read it once per 4 months, and every second bill is just an estimate. So actually, you find out once per 4 months how much electricity you consume. 3 data points per year.

This is not big data. This is teeny little data.

Now, because I am a big nerd - ask anyone - I decided some time ago that this was suboptimal. So using freely available technology and open source software of various types, a soldering iron, some arduinos and a raspberry pi (see? nerd) I made a variety of home monitoring probes, including a current sensor for my main electrical feed.

Now I can see my electricity consumption in realtime, with a reading taken every 15 seconds.

So instead of 3 data points per year, I get 4 readings per minute. 2.1 million data points per year. Still small data, but now we're getting somewhere.
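The arithmetic behind those figures is easy to check. Here is a minimal sketch, assuming a 365-day year and the 15-second interval mentioned above (the function name is just for illustration):

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60  # assumes a non-leap year


def readings_per_year(interval_seconds: int) -> int:
    """Number of readings in a year when sampling every interval_seconds."""
    return SECONDS_PER_YEAR // interval_seconds


smart = readings_per_year(15)  # one reading every 15 seconds
manual = 3                     # one real meter reading every 4 months

print(smart)            # 2102400
print(smart // manual)  # 700800
```

So the home-made meter produces roughly 700,000 times as many data points as the manual readings.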

So what, you might ask?

Well, a week of electricity consumption from my house looks like this:

A bit more interesting.
Let's zoom in a bit: one day of electricity consumption looks like this:

So we can see here that there are several significant consumers of electricity, but also three very high, short peaks during the day which are major consumers of power.

Hmm ... wonder what that is? Let's overlay it with data from another sensor which measures the depth of water in a cold water storage tank (Internet Of Things, remember: many sensors, all connected, all the time):

So the peak power usage also corresponds to a strong demand for water.
This peak power (and water) user is an electric power shower.
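Spotting this by eye works for one house, but the same overlay can be automated. The following is only a sketch: the 7000 W threshold, the 5 cm drop and the sample readings are all made up for illustration, and are not taken from my sensors.

```python
def find_peaks(power_w, threshold_w):
    """Return (start, end) index pairs for runs of readings above threshold_w.

    The end index is exclusive.
    """
    peaks, start = [], None
    for i, p in enumerate(power_w):
        if p > threshold_w and start is None:
            start = i
        elif p <= threshold_w and start is not None:
            peaks.append((start, i))
            start = None
    if start is not None:
        peaks.append((start, len(power_w)))
    return peaks


def water_dropped(depth_cm, start, end, min_drop_cm):
    """True if the tank depth fell by at least min_drop_cm across the peak."""
    before = depth_cm[max(start - 1, 0)]
    after = depth_cm[min(end, len(depth_cm) - 1)]
    return before - after >= min_drop_cm


# Invented readings, one per sample interval, both sensors aligned in time.
power = [400, 450, 9000, 9100, 8900, 500, 420, 3000, 3100, 400]
depth = [60, 60, 58, 55, 52, 52, 53, 53, 53, 54]

for start, end in find_peaks(power, threshold_w=7000):
    print(start, end, water_dropped(depth, start, end, min_drop_cm=5))
# prints: 2 5 True
```

A high power peak that coincides with a falling tank level is a good candidate for a shower; a similar peak with no change in water level would point to some other appliance.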

Now, I know I have an electric power shower. I know it uses a lot of electricity. I am also dimly aware that I share my home with some teenagers, prone to taking showers at odd times of the day. So still you ask, so what?

Well, consider if this data were not available only to me - which is the case today, since I made my own smart meter.

Suppose I had a smart meter installed by my electricity company, and the data about my electricity consumption was flowing to them, rather than to me.

And suppose that everyone else in my city, or country, also has the same smart meters installed and the electricity company sees their detailed energy consumption data too.

So instead of my 2 million electricity data points per year (plus my other sensors, so let's say 40 million data points per year), they are collecting this from each of the 1.5 million households in Ireland. That's 60 trillion data points per year. Or suppose this is the UK (30 million households) - that's 1200 trillion data points per year. Or the US (120 million households) : nearly 5000 trillion data points per year. Now we are starting to approach something like Big Data.
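Those totals are just multiplication, using the rough household counts quoted above and the assumed 40 million data points per household per year. A quick sketch to check them:

```python
POINTS_PER_HOUSEHOLD = 40_000_000  # rough per-household figure from the text
TRILLION = 10**12

# Approximate household counts as quoted in the text.
HOUSEHOLDS = {
    "Ireland": 1_500_000,
    "UK": 30_000_000,
    "US": 120_000_000,
}


def total_trillions(households: int) -> int:
    """Data points per year, in trillions, for the given number of households."""
    return households * POINTS_PER_HOUSEHOLD // TRILLION


for country, count in HOUSEHOLDS.items():
    print(country, total_trillions(count))
# prints: Ireland 60, then UK 1200, then US 4800
```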

And what can they do with this data?

Well suppose I am a vendor of that modern convenience, the electric power shower. Armed with just the electricity consumption data, it is trivial to determine which households already have instant electric power showers installed. And which do not. This information would be valuable to me, since I could target my marketing and advertising spend at just those households which do not have power showers. Therefore I would be prepared to purchase this information from the electricity company.
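To show just how trivial this kind of targeting could be, here is a rough sketch. The threshold, the minimum number of days, the household names and the readings are all assumptions invented for the example; a real analysis would need to be tuned against real meter data.

```python
SHOWER_THRESHOLD_W = 7000  # assumed: few other home appliances draw this much
MIN_DAYS = 3               # assumed: ignore one-off spikes


def has_power_shower(daily_max_w):
    """Guess whether a household has a power shower from its daily peak draw."""
    days_over = sum(1 for w in daily_max_w if w >= SHOWER_THRESHOLD_W)
    return days_over >= MIN_DAYS


# Hypothetical households with invented daily maximum readings for one week.
households = {
    "house-a": [9000, 8800, 3100, 9100, 9050, 2900, 9000],
    "house-b": [3000, 3200, 2800, 3100, 2900, 3000, 3300],
}

targets = [name for name, data in households.items() if not has_power_shower(data)]
print(targets)  # ['house-b']
```

The households in `targets` are the ones the shower vendor would want to advertise to.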

Or suppose my analysis is a bit smarter. Suppose I look for households who have such power showers, and where there was a daily pattern of use, but all of a sudden I no longer see any power shower usage while the other electricity consumption in the house stays normal. I might deduce from this that their shower had developed a fault, and that shower maintenance/repair/replacement services would very likely be needed in those households. That information would be even more valuable - provided it could be delivered in a timely fashion.

It would not be valuable if it took 2 weeks to analyse and produce that report, since it is reasonable to assume that most households would have arranged for a repair of the faulty unit in the meantime. But if it were available within a day or so, and so "just in time", it would be commercially very valuable.
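The "shower stopped, everything else normal" pattern could be checked daily with something like the sketch below. The function name, the 3-day window and the 25% tolerance are assumptions chosen for illustration, and the sample figures are invented.

```python
from statistics import mean


def likely_shower_fault(shower_events, other_kwh, recent_days=3, tolerance=0.25):
    """Flag a household whose regular shower use stopped while other use stayed normal.

    shower_events: showers detected per day, oldest first.
    other_kwh: non-shower consumption per day, same length and order.
    """
    if len(shower_events) <= recent_days or len(shower_events) != len(other_kwh):
        return False
    past_events = shower_events[:-recent_days]
    recent_events = shower_events[-recent_days:]
    past_kwh = other_kwh[:-recent_days]
    recent_kwh = other_kwh[-recent_days:]

    was_regular = all(n > 0 for n in past_events)
    has_stopped = all(n == 0 for n in recent_events)
    baseline = mean(past_kwh)
    still_normal = abs(mean(recent_kwh) - baseline) <= tolerance * baseline
    return was_regular and has_stopped and still_normal


events = [2, 3, 2, 2, 3, 0, 0, 0]
print(likely_shower_fault(events, [10, 11, 9, 10, 10, 10, 11, 9]))  # True
print(likely_shower_fault(events, [10, 11, 9, 10, 10, 2, 2, 2]))    # False
```

The second case shows why the rest of the consumption matters: if everything drops at once, the family has probably just gone away, and nobody needs a shower repair.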

This is what Analytics is: looking in data for patterns from which valuable inferences can be drawn and valuable actions can be determined, in a "reasonable" time, where "reasonable" depends on the data and the action.

And of course, if the shower itself was a connected device, the shower could just tell someone that it had a fault ... and remove even that elementary guesswork. But you could still keep the guesswork, while the population of showers installed was being "smartened" up over the normal replacement cycle.

I just picked on the shower there, but there are other consumers of electricity obvious from my monitoring. If I correlate those with other data from sensors in my house, and compare to other households across the country, I can probably figure out to a high degree of accuracy what those other appliances are, too. And have similarly commercially valuable data. Time to replace that appliance with a more energy efficient one? Appliance getting to the expected end-of-life? All commercially useful and valuable information.

So what's the problem? Isn't this all grand and convenient? Apart from my electricity vendor knowing some details about my family cleanliness, whether I need a new kettle or not, or whether my fridge is an older model, what's the harm?

Well, consider that it is also entirely trivial to determine with pretty much 100% accuracy from the pattern of power and water consumption of a house if it is currently occupied.

Can you imagine someone who might be prepared to pay for a realtime, guaranteed accurate to the nearest hour, country wide map of houses where the owners were away for the afternoon or the weekend? Yep, me too.

So the very creation of this detailed data opens the door for the abuse of the information in various ways - some annoying, and some illegal: abuses which were simply not possible before, because the information was not available. Data about electricity consumption at a granularity of 3 data points per year is only interesting at a macro scale (e.g. for electricity generation or network planning). Data about electricity consumption at a micro scale is much more powerful, and valuable. And open to abuse.

Security of the data and control of access to it becomes very, very important: even for data as "harmless" as how much energy a home consumes.

Now consider what you might be able to find out by analysing the flood of data generated by all the mobile phones in a typical household.

So, in a nutshell, there you go. Big Data is what you get when you put a whole load of little data together. And that big data potentially contains far more interesting data than you might imagine at first glance. Analysing this data is a huge opportunity. Securing this data must always be a key consideration.

Giving away your own data freely is something you might want to consider a bit more carefully.

John Griffiths

Commercial Strategy | Senior executive leadership | General management | Product management | Marketing | Investor

9 years ago

Liam, great summary, straightforward and very clear.

Brendan de Bruijn

Key Partner and Go To Market Lead at RDK Management

9 years ago

Nice article and also makes me feel better about my decision not to install a power shower in my new abode.

Julian Wood

CEO at grq consulting ltd

9 years ago

Great write up Liam. Thanks for sharing and hope all is well on "the adjacent isle".

Mélissa Chouikrat-Coyne, CISSP (she/her)

Solutions Consulting/ Presales Leader SaaS

9 years ago

Great stuff here, real life hack. Funny, I was listening to France Inter this morning: Evgeny Morozov's latest pessimistic (but still well worth looking into) analysis on that very subject. All in French, though!

Matthew Murray

Experienced IT Professional

9 years ago

Interesting post Liam, very well explained...


