A beginner's guide to stated uncertainty
We live all our lives with uncertainty. We are aware that many things can and will affect our lives daily. Will it rain? Will those roadworks make me late? Some things are of course quite likely, such as rain and the inevitable roadworks. Others are thankfully very rare, such as earthquakes or being struck by lightning. So, though we often just accept uncertainty as a fact of life, we can take precautions to mitigate its impact as long as we understand the risks: check the traffic warnings, for example, or avoid being outside with an umbrella in a thunderstorm.
In manufacturing you may see an uncertainty statement made on a certificate or as a specification. Here in very simple terms is what that statement means.
Let’s take a typical journey to work as an example. Jack’s journey to work usually takes 20 minutes. So if Jack is asked how long his journey to work is, he might say:
“The answer is 20 minutes”
But is it? If everything goes against him (the traffic is heavy, he gets stuck behind a learner driver, he hits all the red lights and so on), he will take longer than 20 minutes, let’s say 25 minutes. On the other hand, if everything goes his way (traffic is light, he gets all green lights), then he takes 15 minutes.
He should therefore amend his statement:
“The answer is 20 +/- 5 minutes”
But is it? We said earlier that some things are more likely than others. In a hundred journeys, Jack’s journey time will probably only be affected by common causes of variation like those described above. But what if a lorry shed its load and blocked the road, with the diversion taking an additional 12 minutes, or his car was hit by that lightning we mentioned earlier and broke down? That would completely change how we describe the variation, which 'normally' would be only 5 minutes either way. If you take that thinking to its extreme, what if an earthquake swallowed him and his car? What if aliens attacked? What if, what if, what if? Realism and common sense must now come to the fore. Firstly, we may not even be able to compile or comprehend the full list of what could go wrong with Jack’s journey. Secondly, taking into account all those ‘unlikely events’ would mean Jack would never come to a conclusion as to how long he should allow for his journey in order to arrive on time.
We could try to quantify in some way the likelihood of each of those causes of variation and calculate the odds for each possible or imagined event on our extensive list. Alien attack: where would you start with that one?! But be warned, you could be embarking upon a lifetime’s work, and by the time you completed it, cars would probably be obsolete. It is better to look at it from another perspective. What if you could capture the things that are likely to happen most of the time (common causes) and leave out the things that are far less likely (special causes), but still account for the fact that they could happen in extreme circumstances? But where do you draw the line?
The challenge, as we have said, is knowing all the possible causes of variation. Surely, if Jack is planning his journey to work, what he wants to know is how long it takes him to get to work on average and what allowance he should make for common causes. Let’s do a simple experiment. If Jack makes a note of the length of his journey to work for the first 10 weeks, that will be 50 data points (five working days a week): the times it actually took him. If you then create a chart of those times, you get a graphical view of the spread.
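As a rough illustration of that experiment, the Python sketch below tallies 50 journey times and prints a simple text chart of the spread. The figures are hypothetical, invented only to resemble the kind of log Jack might keep (8 journeys of exactly 20 minutes, one each at 15 and 25 minutes); they are not real measurements, and the names `journey_times`, `tally` and `text_chart` are illustrative assumptions.

```python
from collections import Counter

# Hypothetical log of 50 journey times in whole minutes (an assumption for
# illustration): 8 journeys of 20 minutes, one each at 15 and 25 minutes.
journey_times = (
    [15] * 1 + [16] * 2 + [17] * 5 + [18] * 6 + [19] * 7 + [20] * 8
    + [21] * 7 + [22] * 6 + [23] * 5 + [24] * 2 + [25] * 1
)


def tally(times):
    """Count how many journeys took each number of minutes."""
    return Counter(times)


def text_chart(times):
    """Return one chart line per minute value, with one '#' per journey."""
    counts = tally(times)
    lines = []
    for minutes in range(min(counts), max(counts) + 1):
        lines.append(f"{minutes:>3} min | {'#' * counts[minutes]}")
    return lines


for line in text_chart(journey_times):
    print(line)
```

Each row is a journey time and each `#` is one journey, so the longest row sits at 20 minutes and the rows shrink towards 15 and 25 minutes.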
We can see from this that the data is balanced; it looks pretty much as we would expect. Jack's journey took exactly 20 minutes on 8 occasions, but there was only one occurrence at each of the extremes, 5 minutes either side of that (15 and 25 minutes). But what if one of those "special causes", like a broken-down bus, had happened in those 50 journeys?
We can see now that, though the mass of the data looks "normal", we have this one outcast, representing a special cause. I have drawn some basic graphics for you, but they are only visual representations of a statistical tool that allows us to calculate the average journey time. We can go on to calculate the ‘standard deviation’, represented by the Greek letter sigma (σ), which in simplistic terms is a number that quantifies the spread in a "normal" set of data and lets us match that spread to a percentage likelihood. For example, if we had measured 100 journeys, about 68 of them would fall within +/- 1 standard deviation of the average, about 95 of the 100 within +/- 2 standard deviations, and about 99.7% within +/- 3 standard deviations.
So now we have a way of describing what we expect as normal, and even to what extent we wish to allow for less common events. This can be expressed as a percentage confidence in our assumptions. If we measured more than 50 journeys, we might start to see other, less frequent events which are still part of a normal distribution, on a chart expanded beyond our +/- 5 minutes.
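To make the average and standard deviation concrete, here is a minimal Python sketch using the standard library's `statistics` module. The 50 journey times are hypothetical figures invented for illustration, not real measurements, and `fraction_within` is an illustrative helper name. With only 50 journeys the shares will only roughly match the 68% and 95% figures expected of a true normal distribution.

```python
import statistics

# Hypothetical log of 50 journey times in whole minutes (an assumption for
# illustration, not real data).
journey_times = (
    [15] * 1 + [16] * 2 + [17] * 5 + [18] * 6 + [19] * 7 + [20] * 8
    + [21] * 7 + [22] * 6 + [23] * 5 + [24] * 2 + [25] * 1
)

mean_time = statistics.mean(journey_times)   # average journey time
sigma = statistics.stdev(journey_times)      # sample standard deviation


def fraction_within(times, centre, half_width):
    """Share of journeys whose time lies within centre +/- half_width."""
    inside = [t for t in times if abs(t - centre) <= half_width]
    return len(inside) / len(times)


print(f"average: {mean_time:.1f} min, standard deviation: {sigma:.2f} min")
for k in (1, 2, 3):
    share = fraction_within(journey_times, mean_time, k * sigma)
    print(f"within +/- {k} sigma ({k * sigma:.2f} min): {share:.0%}")
```

For this made-up data set the average is 20 minutes and the standard deviation is about 2.3 minutes, with 68% of journeys inside one standard deviation and 96% inside two.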
From this Jack can amend his statement:
“The answer is 20 +/- 4.5 minutes with 95% confidence”
Of course, the original +/- 5 minutes was an assumption. Now that we have actual data and the maths to calculate it properly, we could make a much more defined statement, with the true average of the data as the time to work and the +/- being a calculated number rather than my rough example of 4.5 minutes. That number will run to some decimal places of minutes and will be close, but not exactly equal, to the gap between the average and his shortest or longest journey times. Why did I choose 95% in my last example and not 99.7%? Most calibration certificates, and most common uses of this model, settle on 95% as the confidence level for a quoted uncertainty value.
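A calculated version of Jack's statement might look like the Python sketch below. It takes the average of the logged times and quotes +/- 2 standard deviations, which covers roughly 95% of a normal distribution; metrologists call this multiplier a coverage factor, usually written k. The journey times are hypothetical figures invented for illustration, and `stated_uncertainty` is an illustrative name rather than an established function.

```python
import statistics

# Hypothetical log of 50 journey times in whole minutes (an assumption for
# illustration, not real data).
journey_times = (
    [15] * 1 + [16] * 2 + [17] * 5 + [18] * 6 + [19] * 7 + [20] * 8
    + [21] * 7 + [22] * 6 + [23] * 5 + [24] * 2 + [25] * 1
)


def stated_uncertainty(times, k=2):
    """Return (average, half-width), where half-width is k standard deviations.

    k=2 corresponds to roughly 95% of a normal distribution and k=3 to
    roughly 99.7%.
    """
    average = statistics.mean(times)
    half_width = k * statistics.stdev(times)
    return average, half_width


average, half_width = stated_uncertainty(journey_times, k=2)
statement = f"{average:.1f} +/- {half_width:.1f} minutes (about 95% confidence)"
print(statement)
```

Passing `k=3` instead would widen the +/- figure to match the 99.7% level, at the cost of Jack allowing more time than he usually needs.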
Let’s hope Jack gets to work on time despite the uncertainty!
If you want to know more than this introduction covers, please contact me through LinkedIn or via our web page www.coventry.ac.uk/metrology to ask about our short courses.