Safety and AI
Images from: Wikimedia, Creative Commons.

You got into a pre-WW1 flying machine at your own risk. Feats of technology that they were, they were also arbitrary killing machines, prone to sudden mechanical and structural failure. The folks who flew them were cut from the same cloth; after all, where could you get training when you were the first to try something?

AI twenty years ago had the same flavor: experts only, sudden unexplained failure. It really wasn't safe for work (despite the best efforts of everyone); Alvey and the Fifth Generation were confirmed failures, and KBS were for the birds. When things went wrong, everyone looked sad and moved to another project. Luckily AI people didn't work at height or with kinetic energy. But who knew that AI planning was in general NP-complete, and that the heuristics of Graphplan would fail for this particular set of cases around the structure of Ethernet networks? Who indeed? Little was learned because little was understood, but only those who were in the mix got hurt, civilians were not involved, and every smash-up was an experience for all concerned.

Roll forwards to now, and what's changed?

We can do more: we can solve problems many orders of magnitude larger than before, and we've got deep networks and the data to use them. But like the airship builders of the 1930s we don't have the engineering practices or professionalism to deal with them, and, like the R101 and the Hindenburg, we're loading passengers onto our platforms who don't have a clue what they're getting into or where they're going. And some of the applications that people are building will make those passengers dependent on the technology that we are using.

Which would be fine if we used processes and people able to underwrite the technology in the way that the people and processes which evolved in WWII were able to underwrite the new systems of civil air transport in the fifties and sixties of the last century. When de Havilland Comets started breaking up in mid-air, an intellectual and institutional system was available to step in and investigate (see: https://aerospaceengineeringblog.com/dehavilland-comet-crash/). I would like to ask all those who are using AI in safety-critical applications, applications that affect people's health or prosperity, some questions (a sketch of what answering the first two might involve follows the list).

  1. Can you definitively say where all the AI models are in your organisation?
  2. Can you prove and articulate the design and implementation process of these models?
  3. Do you understand what the interactions between these models are?
  4. Is the evidence that you are using to make a claim of effectiveness for these models convincing?
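
Questions 1 and 2 imply a concrete bookkeeping discipline. As a purely illustrative sketch, not a standard or an existing tool (the record type and every field name below are assumptions), an organisation that could answer them would be able to produce something like this for every model it runs:

```python
# A minimal, hypothetical model-inventory record; the field names are
# illustrative assumptions, not a standard. The point is only that answering
# questions 1 and 2 means producing something like this, on demand, for
# every model in the organisation.
from dataclasses import dataclass, field

@dataclass
class ModelRecord:
    name: str                 # unique identifier within the organisation
    owner: str                # the accountable person or team
    deployed_in: list         # the systems or products the model serves
    training_data: str        # provenance of the data used to build it
    design_docs: str          # where the design and implementation record lives
    evaluation_evidence: str  # the evidence behind any claim of effectiveness
    upstream_models: list = field(default_factory=list)  # models whose outputs it consumes

# The inventory is then a queryable collection of such records: enough to say,
# in the event of an accident, what was running, who built it, and why.
inventory = [
    ModelRecord(
        name="churn-predictor-v3",
        owner="retention-team",
        deployed_in=["crm", "email-campaigns"],
        training_data="warehouse snapshot, 2018-Q2",
        design_docs="wiki/models/churn-v3",
        evaluation_evidence="reports/churn-v3-eval.pdf",
        upstream_models=["customer-segmenter-v1"],
    ),
]
```

Note that the upstream_models field is where question 3 starts to bite: it is the part of such a record that is hardest to fill in.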

Mature and sensible organisations will have good answers to 1 and 2, although they may need to audit and assemble the practices of their engineers to get themselves into a position that would stand scrutiny in the event of an accident. But I believe strongly that the answers for 3 and 4 are always no. Understanding these interactions in organisational systems is an open question, and the way that we evaluate and demonstrate AI has been repeatedly found to be flawed. AI models that behave well on carefully collected and constructed test sets almost never behave as well when they get into production, and the folk processes that are used to manage post-production improvement are not models of science.
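
To make question 4 concrete, here is a minimal sketch of the kind of feedback loop that would let an organisation check whether an effectiveness claim made on a test set still holds in production. It is illustrative only: the class, names, and thresholds are assumptions, not an established tool.

```python
# A minimal sketch of post-deployment evidence gathering: compare the accuracy
# claimed on an offline test set with a rolling accuracy computed from
# production predictions whose true outcomes are later confirmed.
# All names and thresholds here are illustrative assumptions.
from collections import deque

class RollingAccuracyMonitor:
    def __init__(self, claimed_test_accuracy, window_size=1000, tolerance=0.05):
        self.claimed_test_accuracy = claimed_test_accuracy
        self.tolerance = tolerance
        self.outcomes = deque(maxlen=window_size)  # 1 = correct, 0 = incorrect

    def record(self, predicted, confirmed):
        """Log one production prediction whose true label has been confirmed."""
        self.outcomes.append(1 if predicted == confirmed else 0)

    def rolling_accuracy(self):
        """Accuracy over the most recent confirmed predictions, or None if empty."""
        if not self.outcomes:
            return None
        return sum(self.outcomes) / len(self.outcomes)

    def claim_still_holds(self):
        """False once production accuracy falls materially below the offline claim."""
        acc = self.rolling_accuracy()
        return acc is None or acc >= self.claimed_test_accuracy - self.tolerance

# Usage: feed confirmed outcomes back as they arrive and alert on divergence.
monitor = RollingAccuracyMonitor(claimed_test_accuracy=0.92)
monitor.record(predicted=1, confirmed=1)
monitor.record(predicted=1, confirmed=0)
if not monitor.claim_still_holds():
    print("Production performance no longer supports the offline claim.")
```

The design choice that matters is the confirmation step: without a route for true outcomes to flow back from production, no claim of effectiveness can be evidenced at all.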

Until these things change, the AI community is building airships. We've got a choice: either we change our practice and address these fundamental issues rapidly, or we simply wait for the accident that stops everything in its tracks.

Moritz Platt

Capital Markets Technology Manager at Google

5y

Great to hear your talk touching on safe and verifiable AI at BCS yesterday. It will be interesting to see what laws and regulations will emerge with AI going mainstream.

David Barnes

Award Winning Content Consultant at Quadmark

6y

Great article. It does feel like the start of a revolution -- great progress, but a lot of destruction along the way. We will need clear regulation for safe and healthy AI systems (like we have in airplanes) as well as a code of ethics for AI practitioners, so everybody knows if they're approaching an ethical boundary.

Nicolas Wykes

Deliver Connected Solutions for the Office of the CFO, Sales Lead Accounting Hub, Senior Account Manager

6y

Here are some ideas I think would help organisations address the four questions you mention in your article.

Questions 1 and 2 are closely linked to the AnalyticOps concept: https://bit.ly/2AxxhL4 With a metadata-driven process, standardized ingestion templates, and customized data science packages in R/Python, these kinds of standardization do not limit the data scientist but simply standardize their workflow during, or just prior to, the operationalization steps in the CI pipeline.

For question 3 I would like to share this article: https://arxiv.org/pdf/1807.07404.pdf It is an interesting approach to quantifying the impact of single data point changes and how significantly these types of data points influence model outputs. By exposing the variance in machine learning models (in this case a recommender system using word2vec), the authors highlight that they are actively researching ways to go through the data and modelling for the sole purpose of identifying those key locations (what they refer to in their publication as "information density") for feature selection, to optimize model accuracy and impact.

And finally, to address question 4: while typical measures reside in a training/test data set where a confusion matrix (i.e. true positives, false positives, etc.) would be generated, appropriate simulation techniques also offer a way to perform "what-if" analyses. In particular, for supervised learning techniques, a constant feedback loop of scored observations (think fraud detection, and whether or not a triggered fraud event was actually fraudulent) provides an ability to understand a running accuracy of the models in near real time.

I understand the subject is more complex than these short answers, but I hope you will find them interesting and that they give you food for thought.

Daniel Gilks

Research Programme Lead - Networks Infrastructure | BT Accomplished Engineer

6y

Great piece Simon. Important to see the transition from considering how systems enable AI, or even 'do some AI because it's exciting', to a broader view of what a well-engineered, secure, and genuinely beneficial system which learns would look like.
