The one tip to decrease AI agent complexity and increase explainability
AI is getting more powerful every day, and we see it in our daily lives solving tougher and tougher business and consumer problems. If with great power comes great responsibility, then with great AI capabilities comes great complexity.
In turn, complexity breeds increased training costs and challenges in designing effective AI architectures. As models get more complex, they require more advanced AI design skills and more training data.
In addition, as AI models become more complex, they become harder and harder to explain. This increases the black-box effect, where it is virtually impossible for the user or the developer to know how the model reaches its conclusions.
To help reduce AI complexity and increase its explainability, our teams of experts from different domains work with customers. We help them limit the AI to its minimal necessary scope, a kind of "Minimum Viable Product" agile engineering strategy, and then apply techniques upstream of the AI model, such as input preprocessing.
The AI conundrum: Abstracting complexity through brute force
Deep Neural Networks (DNNs), i.e., AI in the context of this article, can model complex non-linear relationships by learning what output should result from a given input. With both advances in DNN architectures (from the Perceptron to deep convolutional networks, LSTMs, Transformers, and more) and the increase in cloud-computing power, it is possible to keep climbing abstraction levels by outsourcing this complexity to a more complex neural network.
For instance, let's imagine you want to teach an AI how to play foosball. You could feed raw images of the table and its 22 players directly into a reinforcement learning model. You would thereby abstract away everything relating to individual player and ball positions, speeds, and accelerations: the first layers of the DNN would extract those characteristics from the captured images automatically.
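To make this brute-force baseline concrete, here is a minimal sketch of what feeding raw pixels to the agent implies. The frame size, grayscale assumption, and function name are illustrative choices for this article, not taken from any specific RL library:

import numpy as np

# Illustrative frame size (1080p, single grayscale channel); an assumption.
FRAME_H, FRAME_W = 1080, 1920

def raw_observation(frame: np.ndarray) -> np.ndarray:
    """Flatten one raw camera frame into the policy's input vector.

    With this design, the DNN's first layers must learn on their own
    where the bars, players, and ball are, and how fast they move.
    """
    assert frame.shape == (FRAME_H, FRAME_W)
    return frame.astype(np.float32).ravel()

print(FRAME_H * FRAME_W)  # 2,073,600 inputs per frame, i.e. "about 2 million"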
Similarly, as presented at Tesla's AI Day in August 2021, Tesla engineers built an AI that creates a 3D model of the road and its obstacles by feeding the raw images from the car's eight cameras directly into the network.
However, this "brute force" approach is not necessarily the best one in many business scenarios. It requires more computing resources, more training data, and more advanced DNN architectures. It also increases the black-box effect of AI, where no one can explain how the AI reaches its decisions.
How do we circumvent this, when possible?
Designing pre-processor models to build better DRL-trained AI agents
Although Tesla explained during its AI Day why this approach made total sense for its use case, there is often a way to de-complexify an AI agent.
For instance, in the foosball example, one could imagine building a preprocessing stage (AI-based or simple image segmentation) to extract the position of each player (in practice, the position of 4 bars per side carrying 1 to 4 players each) and of the ball. A simple piece of preprocessing code between the image processing and the DNN could then compute the speed and acceleration of each. With this approach, the foosball-playing AI's inputs shrink to 9 × 6 = 54 variables: 4 bars per side plus one ball makes 9 objects, times their respective positions (x, y for the ball; y and the rotational angle for a bar), speeds, and accelerations (two values each). One could even argue that, from a DNN standpoint, since all players on a bar share the same angle, the concept of player could be dropped entirely in favor of the bar.
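As a rough illustration of that preprocessing stage, here is a hedged sketch in Python. The segmentation step is stubbed out, the frame rate is an assumption, and the state layout simply follows the enumeration above (9 objects × 6 variables = 54 inputs); this is not the actual production code:

import numpy as np

DT = 1 / 60  # assumed camera frame interval (60 fps)

def extract_positions(frame: np.ndarray) -> np.ndarray:
    """Stub for the segmentation stage (AI-based or classic image processing).

    Should return a (9, 2) array: (y, rotation angle) for each of the
    8 bars, and (x, y) for the ball."""
    raise NotImplementedError  # real code would locate bars and ball here

def build_state(prev2: np.ndarray, prev1: np.ndarray, curr: np.ndarray) -> np.ndarray:
    """Finite-difference speeds and accelerations from three consecutive
    position snapshots, yielding the 54-value input vector."""
    speed = (curr - prev1) / DT                  # first difference
    accel = (curr - 2 * prev1 + prev2) / DT**2   # second difference
    return np.concatenate([curr, speed, accel], axis=1).ravel()  # shape: (54,)

With three buffered frames, build_state(frames[-3], frames[-2], frames[-1]) would produce the full 54-value observation on every tick.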
This approach immediately reduces the input from a 1080×1920 image, about 2 million DNN inputs (four times more for a 4K image), to only 54. And that only accounts for the input layer: removing AI-based image processing also significantly reduces the number of hidden layers the DNN needs.
One can replicate this approach in many processes to simplify an AI agent's architecture and training. Let's take another, more business-friendly example: optimizing aircraft landing patterns to support air traffic controllers. As in the foosball example, one could imagine feeding in raw radar images and building a massive DNN to propose landing options to the controller.
However, this may lead to two main issues. First and foremost, creating a suitable dataset or, in the case of reinforcement learning-based training of a Bonsai brain, an appropriate simulator would be extremely complex. It would require logging potentially thousands of hours of real-life images and the human decision taken for each one. Not only would this be daunting, expensive, and error-prone, but the result would also apply to a single airport configuration: the approach can't scale across airports. The second issue is that, because of its complexity, this AI "black box" (aircraft pun intended) would be impossible for controllers to interpret. There would be no way to understand why the AI advised one action over another.
The alternative is to decouple the different elements of this problem and preprocess the inputs to minimize the complexity of the AI agent.
In this situation, for instance, one could use separate models (ML or not) to assess each element of the problem in parallel, such as aircraft positions and trajectories, runway availability, or weather conditions.
Then, based on all these parallel assessments and on where a plane is in its approach, build a Project Bonsai brain for each separate phase: queue management, runway assignment, and final landing directions.
Those models could then themselves be supervised by a rule-based safety engine with hard-coded procedures that the brains, i.e., the AI agents, cannot overrule.
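Pulling these pieces together, here is a hedged sketch of how the decomposition could be wired: one brain per phase, supervised by a rule-based safety engine. The phase names follow the example above, but the interfaces, state fields, and safety rule are hypothetical placeholders, not the Project Bonsai API:

from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Advice:
    action: str
    source: str  # which brain (or the safety engine) decided, for explainability

# Hypothetical stand-ins for three separately trained Bonsai brains.
def queue_brain(state: Dict) -> str:
    return "hold_pattern"  # placeholder decision

def runway_brain(state: Dict) -> str:
    return "assign_runway_27L"  # placeholder decision

def landing_brain(state: Dict) -> str:
    return "clear_to_land"  # placeholder decision

BRAINS: Dict[str, Callable[[Dict], str]] = {
    "queue_management": queue_brain,
    "runway_assignment": runway_brain,
    "final_landing": landing_brain,
}

def safety_engine(state: Dict, proposal: Advice) -> Advice:
    """Hard-coded procedures the AI agents cannot overrule (illustrative rule)."""
    if state.get("runway_occupied") and proposal.action == "clear_to_land":
        return Advice(action="go_around", source="safety_engine")
    return proposal

def advise(state: Dict) -> Advice:
    """Route the plane to the brain for its current approach phase,
    then pass that brain's suggestion through the safety engine."""
    brain = BRAINS[state["phase"]]
    proposal = Advice(action=brain(state), source=state["phase"])
    return safety_engine(state, proposal)

# Example: a plane on final with the runway still occupied gets overruled.
print(advise({"phase": "final_landing", "runway_occupied": True}))

Because each brain's inputs and scope are narrow, an odd suggestion can be traced back to a single, small model, which is exactly the explainability benefit discussed next.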
The explainability bonus
In addition to making the AI agent easier and faster to design and train, input preprocessing has another, non-negligible side benefit: it allows for more focused AI agents, and it enables better explainability for each of them.
In the above example, if an AI agent only manages queues based on plane positions, it will be far easier to explain its behavior (and spot odd behaviors) than if it were part of a massive, all-inclusive model.
How to start defining the right input preprocessing strategies
Knowing what to preprocess and what to embed in your AI agent is not a science; it's an art. But it's an art that only experts in Data Science, Machine Learning, and Deep Reinforcement Learning truly master.
(This article was originally published on the Neal Analytics blog.)