Bayes' Rule and Terrorist Videos

[Image: Rev. Thomas Bayes, inventor of Bayesian rules of inference]

In the wake of the Christchurch massacre, there is considerable discussion of the merits, or otherwise, of Artificial Intelligence (AI) in the automatic filtering of objectionable content.

Personally, I do not think that these methods have a high probability of success. The reason for my viewpoint has to do with my understanding of Bayesian Probability.

One might think, at the outset, that because AI methods promise a "quantum leap" in the level of computational intelligence, the prospect of automatic content filtering ought to be a shoo-in. Surely, if computers can reliably recognize speech and perform automatic translation, it ought to be easy to detect fake news, hate speech and terrorist videos.

Never say never, but I think the prospect is unlikely.

To understand why, I think it is helpful to recognize that AI is fundamentally about a form of reasoning which is based upon probabilistic inference, a form of pattern recognition.

In simple terms, if a certain pattern is commonplace, for a certain class of situation, then the presence of the pattern allows one to infer the presence of a category.

If you see the distinctive pattern of a cheetah then you exclaim "Cheetah!".

So far, so good. So easy.

Or is it?

Ask yourself why a cheetah has such a distinctive pattern, up close. Why indeed?

Answer... Camouflage. Up close, the presence of a cheetah is obvious. In among the tall grasses of the savanna, the cheetah all but disappears.

This is because the particular shades and mottled nature of the coat accurately mimic the play of sunlight reflecting from a field of swaying grass. You simply would not spot it.

Are humans any good at camouflage? Sure thing... in any adversarial contest:

  1. War
  2. Trading
  3. Social Media

If you want to earn extra coin on YouTube just splice together a few old movie trailers for a new reel of upcoming attractions. Splice something real at the front, put in a catchy title, and soon you have an entire fake movie trailer business paid for by advertising.

Are people doing this? Sure... you don't spend much time on YouTube do you?

Does extra intelligence - meaning computation - help here?

In my opinion, maybe. It is a big maybe that depends on the rat cunning of your opponent.

As a seasoned trader, I rate my own rat cunning and that of my opponents.

They are hard to beat. The best way, in my experience, is to get them to buy drinks :-)

So... if raw intelligence or computational power is not the secret then what is?

Knowing the odds is what I say... which is where Bayesian probability comes in.

This is an area where I feel well qualified...

I did my PhD in Theoretical Physics extending Bayesian reasoning to Quantum Mechanics.

That was my baby, although for some reason it is now called QBism.

(Reference 75 of that Wikipedia article is me, even though I did it first.)

Enough of the self-promotion :-)

I will argue that a deeper understanding of Bayesian probability, and its relationship to reliable (or unreliable) communication in information theory, is essential to achieve any kind of business cut-through to the real promise of AI.

The simple fact, as shown by our camouflage examples, is that any sensory intelligence is hostage to the distribution of background noise: to have any chance of making an accurate categorical assignment, it must pick the pattern out of that background. In simple business settings these are Yes/No questions:

  1. Will this client default on a loan or not?
  2. Can I trust this employee with privileged information?
  3. Should I accept this online date offer with someone I don't previously know?

You get the picture. Simple questions where the bad outcome is low-probability but high-risk.

Bayesian probability provides a mathematically sound framework for reasoning about imprecisely quantified "facts" - meaning those things which are properly inferences about what might be true, made on the basis of observing the things we do know.

The distribution of relations between the things we know and the things we would like to know but do not actually know is the key to how well we can know anything.

While many would contend that we are living through a "digital revolution" I am inclined to think differently about where the minds of humanity are at right now.

Personally, I think that the Western World is going through something of a philosophical crisis of identity about what we can actually know for a fact. This may sound deep, but I do not think it is that deep at all. It is really a statement about whether we view a truth as a mathematical certainty (probability one) or merely a high conviction (probability 90%+). Traditional logic, with black and white categories, may lead us to false convictions on uncertain data. Ask any doctor what I mean, in the context of diagnosing disease, and they will tell you. Pretty much every mole looks bad, but only some are melanomas.

Not every philosophical tradition emphasizes the black or white of Cartesian logic. In the Eastern tradition, where education was primarily in the liberal arts and government, the Confucian scholar was ever reminded that black is white - coexistent in the differing and ever changing social and political viewpoints of the administered or governed.

Opposites can co-exist, in such a philosophical frame, because they are properly points of view referred to possibly different reference frames of experience and psychology.

Is there one shared experience for all sentient beings?

No, of course not.

Is there one shared set of Natural physical law?

Yes, it would seem so.

The distinction between these different qualities of confidence in human affairs is made rather clear in the honorary coat-of-arms of legendary physicist Niels Bohr.

At the heart of his chosen insignia is the Yin-Yang symbol of eastern philosophy. Bohr was not the inventor of the "quantum" as a concept in physics, but he promoted a much deeper understanding of how much we can know in the Natural world. Quantum mechanics limits the process of observation itself, such that the mere act of observing a physical system can change its behavior. This was a new idea to those trained in Cartesian rationalism.

How do we move from here back to Bayesian probability and reliable inference?

Bohr was fond of aphorisms... my favorite is:

“You can recognize a small truth because its opposite is a falsehood. The opposite of a great truth is another truth.”

In short, most "truths" in society are small truths. They do have opposites, but we rarely have full confidence that they are true. On the contrary, great truths are those beliefs in which the opposite may well be as enlightening. There are few of these, and they have a tendency to be abstract - such as "Do two parallel lines ever meet?" The answer was "no" to Isaac Newton and "yes" to Albert Einstein. Each is a great truth of geometry. As it happens, it would seem that our world, the real world, says "yes" to Einstein and "no" to Newton.

When you make this small change in grounding assumption, you get an entirely different world view. Such situations are inherently interesting, but rarely of practical import.

The real social impact of the revolution I am alluding to, in probabilistic logic, comes when we confront the circumstances of real decisions based on uncertain data.

In the real world, decisions based on evidence naturally have borderline cases. When the jury is out on a murder trial, will the decision be "guilty" or "not guilty"? The practice of law recognizes shades of grey - "guilty beyond reasonable doubt" is the jury instruction.

These are not the mathematically precise rules of Boolean Logic. They are properly the weighing of evidence according to known, assumed, or perceived tendencies.

Hence the conduct of such reasoning is ever prone to doubt, and the magnitude of the doubt may not be fully quantifiable. This is not a limitation of intelligence. When understood in the proper manner, it is an essential limitation of the power of evidence.

If there is insufficient evidence in a murder trial the jury will properly acquit the defendant.

That may seem entirely unjust, if the defendant actually did commit the crime, but it is an acknowledgment of the limitations of evidence and the error of false conviction.

In short, practical wisdom is inherently about judgment under uncertainty.

Of course, decision making in business is judgmental. When a customer applies for a loan they expect to be told "yes" or "no" to the loan application. If you are 95% confident they will not default, the solution is not to offer them 95% of the requested loan principal.

Either you make a loan or you don't make a loan.
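That point can be made concrete in a few lines of Python. The 0.95 threshold below is an invented figure for illustration, not any lending standard:

```python
# Hypothetical sketch: a probabilistic model estimates repayment,
# but the business decision remains binary.

def approve_loan(p_repay: float, threshold: float = 0.95) -> bool:
    """Collapse a continuous confidence estimate into a yes/no decision."""
    return p_repay >= threshold

print(approve_loan(0.97))  # True: make the loan
print(approve_loan(0.90))  # False: decline outright, not "lend 90% of it"
```

However sophisticated the model, the output of the business process is a hard yes or no.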

Which brings us to our main theme...

When somebody somewhere starts a live stream on Facebook should you let them?

It's a vexing question and I am glad (personally) that I do not ever need to answer it. In my view, and I will explain why in a moment, the right question is "Should you live stream?". Would it cause great social harm, if live streams were delayed transmissions?

I won't answer that, but leave it to the reader to make up their own mind.

The question I address is different, and framed on the assumption that someone, somewhere, considered it a wise business decision to allow live streaming, anywhere, anytime and by literally anyone, on this crazy spinning ball called Planet Earth.

Let's assume that position is non-negotiable.

Our glorious leader has decreed that "live streaming" is non-negotiable: just deal with it.

Now suppose we have an infinite amount of computing power, and a host of really brilliant software engineers. They have the best kit ever and live data to play with!

Ask now a different, and I maintain, more pertinent question.

How good do engineers need to be to tell the difference between an ordinary garden-variety Hollywood action movie, a first-person-shooter video game, a legitimate news report from a war zone, and a piece of manufactured propaganda, hate speech or terrorist video?

We might think, given all the progress on human language processing, that this ought to be a simple problem. After all, screening for porn videos is fairly effective, right?

Remember that social truths are somewhat gray. There is a tiny proportion of persons on Planet Earth who think that racially or politically motivated violence is okay. Mostly we would call them criminals, or in the more heinous cases - terrorists. If you happen to be a terrorist, then being a terrorist is okay by you, and you are gonna be a terrorist.

However, the number of terrorists is a pretty small proportion of the human race.

In probability theory, we call this the base rate effect.

When you are considering phenomena that are very rare, like certain forms of cancer, the probability of a false positive can be very high even for an accurate test. The basic issue in practical business decision making is this:

The rarer a category is among a natural population, the lower its base rate of occurrence. The lower the base rate, the higher the discriminatory accuracy needed to avoid false positives.
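A short, hypothetical calculation makes the point. Assume a test that is 99% accurate in both directions (sensitivity and specificity both 0.99), applied to a condition with a base rate of 1 in 10,000 - all numbers invented for illustration:

```python
# Base-rate effect: how believable is a positive result from an
# accurate test when the condition itself is very rare?

def positive_predictive_value(base_rate, sensitivity, specificity):
    """P(condition | positive test), via Bayes' rule."""
    p_pos_given_cond = sensitivity
    p_pos_given_no_cond = 1.0 - specificity
    # Total probability of a positive result, over both sub-populations
    p_pos = (p_pos_given_cond * base_rate
             + p_pos_given_no_cond * (1.0 - base_rate))
    return p_pos_given_cond * base_rate / p_pos

ppv = positive_predictive_value(base_rate=1 / 10_000,
                                sensitivity=0.99,
                                specificity=0.99)
print(f"{ppv:.4f}")  # about 0.0098: fewer than 1% of positives are real
```

The 1% false-positive rate swamps the 0.01% base rate, so almost every alarm is false.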

That seems a little defeatist, but it is the mathematics, pure and simple.

I daresay most readers are already there with my key point, but here is the math.

In Bayesian probability theory, the key idea is to represent probabilistic reasoning via a conditional probability. This is the concept that we consider a Conclusion to be arrived at on the basis of Evidence. Uncertain conclusions are stated as probabilities:

    P(Conclusion | Evidence)

In words, this is the probability we attach (degree of belief) to the Conclusion given the presented Evidence. Of course, we are uncertain. Therefore, we accept that:

    P(Conclusion | Evidence) < 1

The Evidence is the thing we choose to accept. However, on the basis of Evidence with only an indirect relationship to our conclusion, we could well be wrong.

There is always an element of reasonable doubt.

How does this interact with the natural distribution of evidence by category?

Here we get to the heart of how present "AI" systems actually work. They are based on the statistical correspondence between humanly recognized Categories (Cheetahs) and the statistical occurrence of Features (Spots). Observing spots on a live cat makes us more confident that we are looking at a Cheetah, or maybe an Ocelot ... or ...

Bayes' rule powers our confidence in such reasoning. Mathematically, we say:

    P(Category | Feature) P(Feature) = P(Feature | Category) P(Category)

That mathematical rule is simply a tautology. It says that we have two ways to calculate the same thing, namely the joint probability of a Category (Cheetahs) in a population of objects with a given observable Feature (Spots):

    P(Category, Feature) = P(Category | Feature) P(Feature) = P(Feature | Category) P(Category)

So far so good. We have said the same thing two different ways. This is useful since in the real world all we see are objects with Spots, and we have to decide if they are Cheetahs. This is where Bayes' rule finds practical application.

Abracadabra, shazam... by the power of high-school algebra we find:

    P(Category | Feature) = P(Feature | Category) P(Category) / P(Feature)

If we know how common the Feature is for a Category, and we know how likely the Feature is, and the Category, then we can work out how likely the Category is given the Feature.

Or, in simple terms, if we see an animal with spots: "Is it a Cheetah?"

This is pretty much how machine learning operates to achieve discriminatory intelligence. Take a bunch of photos of animals in Africa and label them Cheetahs or not Cheetahs. Then have the algorithm work out the balance of probabilities that any given animal is a Cheetah.
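As a toy sketch of that counting (the six labelled examples below are invented purely for illustration), here is the cheetah calculation done by tallying frequencies:

```python
# Invented training data: (has_spots, is_cheetah) pairs.
observations = [
    (True, True), (True, True),      # spotted cheetahs
    (True, False),                   # a spotted ocelot
    (False, True),                   # a spotless cheetah
    (False, False), (False, False),  # unspotted non-cheetahs
]

n = len(observations)
n_cheetah = sum(1 for _, c in observations if c)
p_cheetah = n_cheetah / n                                   # base rate P(Cheetah)
p_spots = sum(1 for s, _ in observations if s) / n          # P(Spots)
p_spots_given_cheetah = (
    sum(1 for s, c in observations if s and c) / n_cheetah  # P(Spots | Cheetah)
)

# Bayes' rule: P(Cheetah | Spots) = P(Spots | Cheetah) P(Cheetah) / P(Spots)
p_cheetah_given_spots = p_spots_given_cheetah * p_cheetah / p_spots
print(p_cheetah_given_spots)  # 2 of the 3 spotted animals are cheetahs: 2/3
```

A real system estimates the same quantities from millions of images and thousands of features, but the arithmetic in the last line is the same.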

Where can it all go wrong?

Now that we have some math, it ought to be obvious. We could assume that all Cheetahs have spots. Oops... there are actually ... out there... genuine spotless Cheetahs:

That would be what you call a false negative. Spots, no. Cheetah, yes.

Then there are Ocelots (obvious) and actresses in movies with spots (Cameron Diaz):

That would be a false positive. Spots, yes. Cheetah, no.

Of course, we are oversimplifying. The probability of meeting Cameron Diaz, in full movie makeup, on the African Savannah is low, but not zero. You get the picture.

Thankfully, the occurrence of such movies is low.
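For concreteness, the four outcomes can be tallied in a few lines; the predictions below are invented so that each outcome appears once:

```python
# Invented predictions paired with ground truth, one of each outcome.
examples = [  # (predicted_cheetah, actually_cheetah)
    (True, True),    # true positive: a spotted cheetah, correctly flagged
    (True, False),   # false positive: spots, yes; cheetah, no
    (False, True),   # false negative: the spotless cheetah slips through
    (False, False),  # true negative: no spots, no cheetah
]

false_positives = sum(1 for pred, actual in examples if pred and not actual)
false_negatives = sum(1 for pred, actual in examples if actual and not pred)
print(false_positives, false_negatives)  # 1 1
```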

But what about filmed gun violence? As a Hollywood genre, terrorist plots are not uncommon, and filmed gun violence is rampant.

How is the robot supposed to tell the difference between them?

How hard might that be given that Hollywood employs stylized violence to simulate reality?

Consider further a legitimate news reel from a war zone ...

Back to the math...

The problem is that the same features, Guns and Violence, appear both in the category "Entertainment" and in the category "Terrorist Video".

If the object of the terrorist video is widespread viewership, then will it not attempt camouflage to achieve widespread distribution, in original or altered form?

Suppose we have the feature Guns in Video (Guns) with the binary category Terrorist Video (TV) or Not Terrorist Video (Not TV). The Bayesian math tells us that:

    P(Not TV | Guns) = P(Guns | Not TV) P(Not TV) / P(Guns)

Here we are aiming to decide whether the video is not a terrorist video, given the guns and whatever else we saw in the video, versus the alternate conclusion (block it).

The probability of the alternative conclusion is:

    P(TV | Guns) = P(Guns | TV) P(TV) / P(Guns)

Notice that the probability of guns appearing in a video, in general, is common to both expressions. Further, the likelihood of Guns in a terrorist video or a non-terrorist video about terrorists is probably about equally high. The only discriminator is the base rate.

This is a mathematical trick, but a nice way to see the issue is to set a design target for the discriminator system: let's see what happens if the target is a 50/50 decision engine:

    P(TV | Guns) = P(Not TV | Guns) = 1/2

We set them both equal to see the consequences for the other relationships:

    P(Guns | TV) / P(Guns | Not TV) = P(Not TV) / P(TV)

Since we arbitrarily set the two conclusions to be equally likely, this is properly a statement of how informative our chosen feature "Guns" needs to be to tell the two cases apart. You can see that, mathematically, it is the ratio of the base rates.

In simple terms, if actual terrorist videos with guns in them are 10,000 times less common than garden-variety entertainment videos with guns, bodies and deranged voices, then we will need our "feature detector" to include features that are 10,000 times more likely to be present in a real terrorist video than anything the entertainment industry serves up.
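In code, the break-even condition is just the ratio of the assumed base rates (the 1-in-10,001 figure below is purely illustrative):

```python
# From the 50/50 condition, P(Guns | TV) / P(Guns | Not TV) must equal
# P(Not TV) / P(TV) -- the ratio of the base rates.

def required_likelihood_ratio(base_rate_tv: float) -> float:
    """How discriminating the feature must be to reach a 50/50 posterior."""
    return (1.0 - base_rate_tv) / base_rate_tv

# Assumption: 1 genuine terrorist video per 10,001 gun-containing videos
print(round(required_likelihood_ratio(1 / 10_001)))  # 10000
```

The rarer the category, the larger this ratio grows, with no upper bound.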

Call me a pessimist, but I think that is optimistic.

Remember the lesson of the cheetah in the long grass.

The actual real terrorist simply wants to commit a real crime and then have it viewed by as many unsuspecting people as possible. What better way than to make it look like the video is entertainment and not an actual real piece of live footage?

Bang... there is a shock! I just sat through a real terrorist video.

Do things like that already happen? Well, just go read on the Internet.

I hope this article may go some way to explain why AI should properly be assessed as a new way to perform an old, and necessarily fraught, social task: the making of wise decisions.

Wherever doubt, uncertainty and an adversarial element of deception are present there is no guarantee that intelligence alone will deliver confidence. Some things are just uncertain.
