Killing an Ant with HAL 9000

I hear a lot of pitches from firms selling AI solutions, from two-man operations all the way up to huge players like Microsoft and Google. One thing that has struck me, especially in the last few weeks, is that a lot of them are using AI to solve problems that clever algorithms can solve, and in many cases already solved years ago.

A Regular Expression of Interest

AI-based tools bring a lot to the table, but at their heart, they create a repeatable process for doing something. The model or network or ruleset they create is a function that takes in some parameters and produces an output. When we know a relationship exists but we don't know what it is, or how to express it mathematically, that is a godsend. Especially in areas where machine learning techniques traditionally excel, such as classification, astounding results can be achieved.

I am seeing AI used on solved problems, though. (Names and products have been changed to protect the misguided but innocent.) Acme.ai brought me a tool that searched extracted text for certain types of information. One of the centerpieces of their sales pitch was the ability to find North American phone numbers. They explained that they could find numbers expressed as 7 digits, 10 digits, or even 11 digits (including the +1 country code). I relatively quickly realized that all of the information types they could detect were things that are well-structured when they appear in documents: phone numbers, ZIP codes, driver's license numbers, etc. All things that could be searched for using relatively basic regular expressions.
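To give a sense of how little machinery this takes, here is a minimal sketch of a regular expression for the 7-, 10-, and 11-digit formats described above. The pattern and the name `find_phone_numbers` are my own illustration, not anything Acme.ai showed me, and the sketch assumes separators are limited to spaces, dots, and hyphens. A production pattern would need more care around extensions and edge cases.

```python
import re

# Hypothetical pattern for illustration only.
# Assumes separators are a space, dot, or hyphen.
PHONE_RE = re.compile(
    r"""
    (?<!\d)                   # not immediately preceded by a digit
    (?:
        (?:\+?1[\s.-]?)?      # optional country code, only with an area code
        \(?\d{3}\)?[\s.-]?    # area code, optionally in parentheses
    )?
    \d{3}[\s.-]?\d{4}         # seven-digit local number
    (?!\d)                    # not immediately followed by a digit
    """,
    re.VERBOSE,
)


def find_phone_numbers(text):
    """Return every substring of text that looks like a North American phone number."""
    return [match.group(0) for match in PHONE_RE.finditer(text)]


sample = "Call 555-1234 or (212) 555-1234 or +1 212-555-1234."
print(find_phone_numbers(sample))
```

The digit lookarounds keep the pattern from picking seven digits out of the middle of a longer number, which is the sort of rule a trained classifier has to learn from examples and a regular expression simply states.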

I dug in on this, because I was certain they weren't just trying to sell me something I could do with an IBM 7000-series from 1968. As I dug in, I learned these well-meaning chaps were indeed tokenizing their documents and had, at great effort, meticulously trained models up to 99.9% accuracy for each category of data they were searching for. They were very good classifiers. But they were an early 21st Century sledgehammer being swung at a mid-20th Century ant.

It's Easy to Train on a Math Problem

There's a growing population of companies and teams out there that are building AI systems based around problems that have more traditional solutions. In many cases, these problems are being chosen precisely because they have traditional solutions. Training ML models takes vast quantities of data to use as examples, and data is painful and expensive to collect, clean, and utilize in bulk. Choosing a problem whose answers you can compute independently takes this pain away. If I want to train an AI to tell me if a word is seven characters long, I can create huge amounts of data very easily to train and test on, and I can quickly train that model up to dazzling levels of accuracy. I can say on the side of my product that it is AI-enabled. At the end of the day, though, I have just created a string length function with a surprisingly high error rate.
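As a rough sketch of why this kind of problem is so tempting, here is the exact solution next to a generator that uses it to label as much training data as anyone could want. The names `is_seven_characters` and `make_labeled_examples` are hypothetical, and the choice of random lowercase strings between 1 and 12 characters long is an assumption made purely for illustration.

```python
import random
import string


def is_seven_characters(word):
    """The traditional solution: exact, instant, and needs no training."""
    return len(word) == 7


def make_labeled_examples(count, seed=0):
    """Build (word, label) pairs, with labels computed by the exact function.

    Assumes random lowercase strings of length 1 to 12 are acceptable
    training examples; a real pitch would likely use real words.
    """
    rng = random.Random(seed)
    examples = []
    for _ in range(count):
        length = rng.randint(1, 12)
        word = "".join(rng.choice(string.ascii_lowercase) for _ in range(length))
        examples.append((word, is_seven_characters(word)))
    return examples


training_data = make_labeled_examples(100_000)
print(len(training_data))
```

Every label in that training set comes from `len(word) == 7`, so the best a model trained on it can do is match a one-line function that is already correct every time.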

Butter Knives for Butter, Chain Saws for Oak Trees

There seems to be a lot less pure snake oil in the AI industry than there was five years ago, but as the tools have gotten into the hands of more and more entrepreneurs and amateur developers, we're seeing more and more naive uses of AI for problems that are just as well, or better, solved with traditional data processing techniques. Machine learning and AI open up new horizons in computation, but when we're solving a new challenge as developers, we should always stop to ask whether a simpler, cheaper, and faster solution is available with our older tools.

(Image by Tom Cowap)

Michael Sarlo, EnCE

Chief Innovation Officer & President of Global Investigations and Cyber Incident Response Services at HaystackID

1 year ago

Truly insightful, John Brewer, and very true! Knowing what tools and strategies to use to overcome data-centric challenges, even if those tools use algorithms we’ve seen since the dawn of modern computing, is always going to be more effective than just throwing snazzy tech at a problem. The human approach is critical!
