Artificial Intelligence #26: What could the next decade of AI look like?


It's autumn here in the UK. Soon, I start teaching my two courses - Digital Twins: Enhancing Model-based Design with AR, VR and MR, and Artificial Intelligence: Cloud and Edge Implementations.

Image source: University of Oxford

This week, I also shared our vision on #digitaltwins on the MathWorks site: Digital Twins and the Evolution of Model-based Design.

Last week, I discussed my misgivings about the idea of data-driven vs. model-driven AI. This week, we extrapolate that question and ask a much broader one: “Will AI breakthroughs this decade be based on last decade’s AI developments, or will we see new directions of AI research in this decade?”

Undoubtedly, the last decade, 2010 to 2020, was a game-changer for AI research. It was based on the current deep learning models, characterised by parallelised networks of relatively simple neurons with non-linear activations, trained by adjusting the strengths of their connections. Using this model, we find the rules for a function f(x) that maps domain x to y when the rules are hierarchical and (typically) the data is unstructured. To do this, we need a large number of labelled examples. The labels are at a higher level of abstraction (e.g., whether the image is a cat or a dog). The algorithm can then discern the features that comprise the object (e.g., a cat has fur, whiskers, etc.). This is the essence of deep learning, called representation learning, and is common knowledge for data scientists.
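To make this concrete, here is a minimal sketch (pure NumPy, toy data of my own choosing, not from the article): a tiny network of simple neurons with non-linear activations learns the XOR mapping purely from labelled examples, by adjusting its connection strengths.

```python
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0., 0.], [0., 1.], [1., 0.], [1., 1.]])  # inputs x
y = np.array([[0.], [1.], [1.], [0.]])                  # labels f(x) = XOR

W1 = rng.normal(0, 1, (2, 8)); b1 = np.zeros(8)   # input -> hidden weights
W2 = rng.normal(0, 1, (8, 1)); b2 = np.zeros(1)   # hidden -> output weights

for _ in range(10000):
    h = np.tanh(X @ W1 + b1)                 # learned internal representation
    p = 1 / (1 + np.exp(-(h @ W2 + b2)))     # predicted probability of label 1
    d_out = p - y                            # cross-entropy gradient at output
    d_hid = (d_out @ W2.T) * (1 - h ** 2)    # backpropagate through tanh
    W2 -= 0.2 * h.T @ d_out; b2 -= 0.2 * d_out.sum(0)
    W1 -= 0.2 * X.T @ d_hid; b1 -= 0.2 * d_hid.sum(0)

print((p > 0.5).astype(int).ravel())         # learned f(x) for the four inputs
```

The "rules" live entirely in the learned weights: nothing about XOR was hand-coded.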

Especially in the latter part of the last decade, three developments have accelerated this model:

a) Transformer-based models

b) Reinforcement learning, and

c) Generative adversarial networks.
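Of these three, the transformer's core operation is scaled dot-product attention. A minimal NumPy sketch (the shapes and random values are purely illustrative):

```python
import numpy as np

def attention(Q, K, V):
    """Scaled dot-product attention, the core of transformer models:
    every query is compared with every key, and the values are mixed
    according to the resulting softmax weights."""
    scores = Q @ K.T / np.sqrt(Q.shape[-1])        # query-key similarities
    w = np.exp(scores - scores.max(-1, keepdims=True))
    w = w / w.sum(-1, keepdims=True)               # softmax over the keys
    return w @ V, w

rng = np.random.default_rng(0)
Q = rng.normal(size=(3, 4))        # 3 queries of dimension 4
K = rng.normal(size=(5, 4))        # 5 keys
V = rng.normal(size=(5, 4))        # 5 values
out, w = attention(Q, K, V)
print(out.shape)                   # one mixed value vector per query
```

Each row of `w` is a probability distribution over the keys, which is what lets the model learn which parts of the input to attend to.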

And they continue to amaze us daily – for example:

DeepMind's AlphaFold for protein folding; DeepMind's meta-algorithm, "the one algorithm to rule them all", i.e., a deep learning model that can learn how to emulate any algorithm, generating an algorithm-equivalent model that can work with real-world data; DeepMind's AI that predicts almost precisely when and where it is going to rain; and now Megatron-Turing NLG from NVIDIA and Microsoft, a large language model that claims to exceed GPT-3.

Impressive as these are, all the examples above share some crucial properties:

  • They suit narrowly defined problems.
  • You can generate your own data (e.g., AlphaGo) with practically endless experimentation.
  • The cost of experimentation is low.
  • There is plenty of data available (e.g., images, text, speech).
  • The models are black boxes.
  • The risk of failure is low (e.g., advertising).

Also, the common element here is that all intelligence (rules) is derived from the data alone.

The other extreme is that the rules are symbolic, i.e., decided by humans.

Those were the early days of AI, which ultimately led to the AI winter.

However, I believe that this decade will be all about techniques that can interject expert intelligence into the algorithm (though not the same as the symbolic approaches of the early days of AI).

One such interesting case is some work by Max Welling. I have referred to this work before, but here I am referring to the use of generative models to build counterfactual worlds. This could work where we have a problem domain with too many exceptions, i.e., a very long tail of situations that do not show up in the dataset used to model the problem.

To put the problem in context

According to Max Welling

  • As per the bias-variance trade-off, you do not need to impose a lot of human-generated inductive bias on your model when you have sufficient data. Instead, you can "let the data speak."
  • However, when you do not have sufficient data available, you will need to use human knowledge to fill the gaps.
  • Most existing techniques work well with interpolation problems, i.e. when estimating data points within the known data points. However, they do not work when we need to extrapolate, i.e., you enter a new input domain where you have very sparse data, and your trained model will start to fail.
  • If the input domain is very simple, you can easily build an almost perfect simulator. Then you can create an infinite amount of data.
  • But that approach does not work for complex domains where there are too many exceptions.
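The interpolation-vs-extrapolation point above can be sketched in a few lines of NumPy (the function and polynomial degree are my own arbitrary choices): a model fitted on one input domain is accurate inside it and fails badly outside it.

```python
import numpy as np

# Fit a degree-7 polynomial to noisy samples of sin(x) on [0, pi].
rng = np.random.default_rng(0)
x_train = np.linspace(0, np.pi, 50)
y_train = np.sin(x_train) + rng.normal(0, 0.01, x_train.size)
coeffs = np.polyfit(x_train, y_train, 7)

x_in = np.linspace(0.1, np.pi - 0.1, 100)        # inside the training domain
x_out = np.linspace(2 * np.pi, 3 * np.pi, 100)   # far outside it
err_in = np.abs(np.polyval(coeffs, x_in) - np.sin(x_in)).max()
err_out = np.abs(np.polyval(coeffs, x_out) - np.sin(x_out)).max()
print(err_in, err_out)   # interpolation error is tiny, extrapolation error huge
```

The model has not learned "sine"; it has learned a curve that happens to match sine where it saw data, which is exactly why extrapolation fails without injected knowledge.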

The above defines the problem, i.e., how to model a world for which you have very little information.

Some definitions

  • You need to understand the difference between discriminative and generative models.
  • You also need to understand counterfactuals. Counterfactual conditionals are conditional sentences that discuss what would have been true under different circumstances, i.e., paths that might have been.
  • Inverse generative models are models that ask the question: "What are the possible behaviors that can generate the aggregate dynamics?" Unfortunately, I could find very few links to explain this – but THIS link is good.
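To make the discriminative-vs-generative distinction concrete, here is a hedged sketch on synthetic data (equal class priors assumed; the data and parameters are mine, not from the article): a generative classifier models p(x | y) per class, so it can both classify via Bayes' rule and sample brand-new data points, which a purely discriminative model cannot.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic two-class data: class 0 centred at (-2,-2), class 1 at (+2,+2).
X0 = rng.normal(-2, 1, (200, 2))
X1 = rng.normal(+2, 1, (200, 2))

# Generative model: fit a Gaussian p(x | y) for each class.
mu0, mu1 = X0.mean(0), X1.mean(0)
var = np.concatenate([X0 - mu0, X1 - mu1]).var()   # shared spherical variance

def log_lik(x, mu):
    # Log-likelihood of x under a spherical Gaussian (constants dropped).
    return -((x - mu) ** 2).sum(-1) / (2 * var)

def classify(x):
    # Bayes' rule with equal priors: pick the class that explains x better.
    return (log_lik(x, mu1) > log_lik(x, mu0)).astype(int)

print(classify(np.array([[-2.0, -1.5], [2.0, 1.5]])))
new_point = rng.normal(mu1, np.sqrt(var))   # the same model can *generate* data
```

A discriminative model would learn only the boundary p(y | x); the last line, sampling a new point, is the capability that generative models add.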


Image source: Pixabay, showing paths that might have been (counterfactuals)


What is proposed

According to Max Welling

The basic idea is to use generative models to build counterfactual worlds.

The rationale

  • The world operates in the “forward, generative, causal direction."
  • We need relatively few parameters to describe this world: the laws of physics are surprisingly compact to encode.
  • This world is organized modularly, consisting of factors and actors that can be approximately modelled independently.
  • In this world, one event causes another event according to the stable laws of physics.

How it could work

As I understand it, what is being proposed by Max Welling is:

  • Use generative models to build counterfactual worlds.
  • Generative models are far better at generalising to new, unseen domains.
  • Causality allows us to transfer predictors from one domain to the next quickly. For example, accidents are correlated with black cars in the Netherlands but perhaps with red cars in the US. Using color as a predictor does not generalize, but a causal factor such as male testosterone levels will generalize very well to predict accidents.
  • Generative models allow us to learn from a single example because we can embed that example in an ocean of background knowledge.
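The black-car example above can be sketched as a toy structural causal model. Everything here – variable names, functional forms, the "risk" stand-in for the causal factor – is my illustrative assumption, not Welling's actual setup; the point is only that the colour-accident correlation flips between simulated "worlds", while the causal predictor transfers.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 10000

def simulate_world(p_black_given_risk):
    """Toy structural causal model: risk-taking causes accidents, while
    car colour is merely correlated with risk, differently per country."""
    risk = rng.uniform(0, 1, N)
    accident = (rng.uniform(0, 1, N) < 0.5 * risk).astype(float)
    black = (rng.uniform(0, 1, N) < p_black_given_risk(risk)).astype(float)
    return risk, black, accident

# "NL": risk-takers favour black cars; "US": risk-takers avoid them.
risk_nl, black_nl, acc_nl = simulate_world(lambda r: 0.2 + 0.6 * r)
risk_us, black_us, acc_us = simulate_world(lambda r: 0.8 - 0.6 * r)

corr = lambda a, b: np.corrcoef(a, b)[0, 1]
print(corr(black_nl, acc_nl), corr(black_us, acc_us))   # sign flips per world
print(corr(risk_nl, acc_nl), corr(risk_us, acc_us))     # stable across worlds
```

A model trained on colour in one world fails in the other; a model built on the causal mechanism works in both, which is exactly the transfer argument.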

Relationship to human thinking

  • Humans have a remarkable ability to simulate counterfactual worlds that will never be but can exist in our minds.
  • We can imagine the consequences of our actions by simulating the world as it unfolds under that action.
  • We can also deduce what caused current events by simulating possible worlds that may have led to it.
  • This ability depends on our intuitive understanding of physics and/or psychology.
  • When you need to generalize to new domains, i.e., extrapolate away from the data, you use the generative model.
  • As you are collecting more (labeled) data in the new domain, you can slowly replace the inverse generative model with a discriminative model.
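One way to read the last two bullets is as a simple blending schedule (the schedule and its constant are my illustrative assumptions, not part of Welling's proposal): trust the generative, knowledge-based model when labelled data is scarce, and shift weight to the discriminative model as labels accumulate in the new domain.

```python
def blended_prediction(p_generative, p_discriminative, n_labeled, k=50.0):
    """Convex combination whose weight alpha = k / (k + n_labeled) decays
    from 1 (no labels: trust background knowledge) towards 0 (many labels:
    trust the data-driven model). k is an arbitrary illustrative constant."""
    alpha = k / (k + n_labeled)
    return alpha * p_generative + (1 - alpha) * p_discriminative

# With no labels we rely entirely on the generative model; with thousands
# of labels the discriminative model dominates.
print(blended_prediction(0.9, 0.2, n_labeled=0))      # pure generative
print(blended_prediction(0.9, 0.2, n_labeled=5000))   # close to 0.2
```

Any monotone decay in `alpha` would express the same idea; the constant `k` just sets how much labelled data it takes to start trusting the new domain.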


Analysis

  • I like this approach from Max Welling.
  • I discuss it here to show how rich model-based thinking can be, and how the dichotomy between data-driven vs. model-driven will limit our thinking.
  • It is important to note that this is not the symbolic thinking of the early days of AI (humans setting the rules / expert systems).
  • Also, we are not saying that data-driven approaches have reached their limit – on the contrary, as my examples show, the developments from RL, large language models, and others have just started.
  • But I am saying that the next decade could be about incorporating human insights into AI models. This is useful because AI could then apply to a class of problems that are not being considered currently.
  • Finally, even if you are a practising data scientist, these ideas are not within your usual scope of work. But they may well be soon.

If you want to work with me, see my courses at the #universityofoxford:

  • Digital Twins: Enhancing Model-based Design with AR, VR and MR, and
  • Artificial Intelligence: Cloud and Edge Implementations.


Gary Brooks

CTO at iProtectU

3y

Really interesting. I see AI starting to use a range of techniques that may come from multiple sources (directions may be a better word). One will be the use of models or tools: whilst you could teach an ANN to add up, it is simpler to teach it to use a calculator. Certain parts of image recognition can be passed to specialist tools. Eventually these tools will themselves be generated by an AI; the output may or may not be a "neuron" architecture. Another direction will be the use of AI to generalise – which has direct links to the above discussion of counterfactuals. Whilst I suspect that the generalisation will itself produce benefits, it is likely, in my opinion, that this will be driven by the need to understand decisions coming out of an AI, but an effect of this will be to facilitate the next stage that I foresee: an ecosystem of partially independent ANNs connected through "generalisation trunks", with the ability to plug in tools to solve specific parts of a problem.

Dr. PG Madhavan

Digital Twin maker: Causality & Data Science --> TwinARC - the "INSIGHT Digital Twin"!

3y

Ajit, a very thoughtful summary on your part . . . I agree 100% with the Max Welling "Problem in Context" section! "However, when you do not have sufficient data available, you will need to use human-knowledge to fill the gaps" – this is where physics-based models can give a leg up. I know your focus here is not on IoT; we have the opposite problem – we have a surfeit of data! When there is lots of data, using models as *initial conditions* is a good idea, knowing full well that all models are approximations. There is a latent feeling that since physics-based models use partial and non-linear differential equations, etc., they are accurate; but these equations are approximations of reality with drastic simplifying assumptions (which almost never hold in practice). Coming to counterfactuals, the advantage is that AFTER data are collected, you can do what-if analysis without doing more experiments – this is what got the Nobel prize this year for Imbens and cohorts! Of course, counterfactuals do not exist without knowing the causal effect. Sorry to insert my recent work here, but the following may illuminate some of your points further in an IoT context: "Causality & Counterfactuals – Role in IoT Digital Twin"; https://www.dhirubhai.net/pulse/causality-counterfactuals-role-iot-digital-twin-dr-pg-madhavan

Olaf de Leeuw

Data Scientist - Dataworkz

3y

Interesting read again. I'm looking forward to your newsletter every week. I would like to share another possible answer to the main question of this week: “Will AI breakthroughs this decade be based on last decade’s AI developments, OR will we see new directions of AI research in this decade?” An interesting topic is the research on Spiking Neural Networks. All deep learning solutions that have been developed so far are really amazing, but they do have another thing in common: it takes a lot of energy to run all the computations due to their continuous nature, for example during training. At CWI, researchers are now looking at the possibilities of Spiking Neural Networks, which are much closer to the way our human brain works. At this moment they have built SNNs whose performance (in terms of accuracy) approaches the performance of the best ANNs, while the SNNs are much more energy-efficient. This is promising and can be very useful in, for instance, always-on devices. https://www.cwi.nl/news/2021/energy-efficient-ai-detects-heart-defects

Khurram Shehzad Quraishi (PhD,ChE,PE,LA,Cord ILO,DChE)

|Energy, Environment & Sustainability|Biochemical & Bioprocess Engg|CO2 Capturing|LCA|ESG|ISO-9001, 14001, 45001, 27001|QHSE|

3y

Dr. MARYAM ZAFFAR

Md. Ashikur Rahman

Digital Marketer & Graphic Designer, YouTube Expert, SEO Professional at Fiverr, Upwork

3y

Good
