Fundamental things about Artificial Intelligence every business leader needs to know.
Michael May
Technical due-diligence on pre-revenue companies. Technology scout for early-stage investors.
Artificial intelligence, robotics, and autonomous vehicles are everywhere in the news today. Some people make them out to be inevitable, if not miraculous, drivers of the coming economy. That’s not surprising given how they are marketed. If you’ve never worked on the technical aspects of these things, they can be confusing and intimidating, and the marketing folks exploit your confusion to make everything sound like the best thing since sliced bread. Let me help take some of the mystery out of these devices with a brief tutorial. To help you make choices about adopting this new technology, we’ll consider how things marketed as “AI-Enabled” or “Smart” work, and what their limitations are.
The vast majority of these smart things are built around a technical field called “pattern recognition,” a combination of computer science and mathematics. Pattern recognition works by feeding a computer data about a thing (or things) and asking the computer to guess some attribute of that thing that would be useful to you. The thing can be pretty much anything: a person, a group of people, a driving route, an investment strategy, a customer list, an airplane’s flap settings, you name it. The data can be pretty much anything too: color, weight, purchase cost, a robot arm’s speed limit, time of day, account number, name, birthday, email address, you name it. The last part is the attribute you’d find useful, and, you guessed it, it can be pretty much anything too: the best person to date, the quickest driving route that also saves gas, which neighborhood will have the most crime, the purchasing pattern of an average customer, the correct flap setting for landing the airplane, the best marketing strategy for selling smart robots, etc.
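To make the data / thing / attribute split concrete, here is a minimal sketch in Python, using the common scikit-learn library. The customer numbers are invented purely for illustration: the things are customers, the data is what you know about each one, and the attribute the algorithm guesses is whether a new customer will buy.

```python
# A minimal pattern recognition sketch. The "things" are customers, the data
# is what we know about each one, and the attribute to guess is "will they buy?".
# All numbers are invented for illustration.
from sklearn.tree import DecisionTreeClassifier

# Data about each thing (customer): [age, annual income in $k, prior purchases]
X = [
    [25, 40, 0],
    [34, 72, 2],
    [51, 120, 5],
    [29, 55, 1],
    [62, 95, 4],
]
# The attribute we already know for past customers: did they buy? (1 = yes)
y = [0, 0, 1, 0, 1]

model = DecisionTreeClassifier(max_depth=2)
model.fit(X, y)                      # the algorithm searches for a pattern

new_customer = [[45, 88, 3]]
print(model.predict(new_customer))   # the guess about the attribute
```

Every limitation discussed below comes down to what goes into the data, which things the rows represent, and what attribute you ask the algorithm to guess.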
Precisely how all this works is the subject of many academic computer science and math papers. When the algorithms and computers are working well, they can do amazing things: recognize people in front of an autonomous car, improve your diet, or tell you something about your customers you never knew. However, certain things can always go wrong because of the fundamental nature of pattern recognition, regardless of how the technical folks implemented the algorithms behind any particular AI product. Let’s take a closer look at these limitations so you can make better decisions about bringing these algorithms into your business.
First, the data. Obviously, if your data is plain wrong, then the pattern recognition algorithm has no hope of working. It is a garbage-in, garbage-out process. However, the problem with data is more subtle than the input simply being wrong. Sometimes you want pattern recognition algorithms to make sense of humongous, messy data sets that a person can’t make sense of. That’s perfectly fine, but you need to ensure the data is sufficient to answer the question you are asking. If the data is correct but not sufficient, you run the risk of the algorithm reaching irrelevant conclusions. For example, assume you want to avoid traffic jams, so you feed your autonomous car’s algorithm the average vehicle speeds on the possible routes ahead. Then your algorithm steers you right into a traffic jam. This may very well happen if the traffic speed data was based on 15-minute averages and a new wreck was just 3 minutes old. In this case, the data was correct, but it had limitations that weren’t obvious: a 15-minute average can’t reveal a wreck that happened only a few minutes ago. The lesson is to always ask questions about your data to understand how it pertains to the question: How sure are you it can answer the question? Do you understand its limitations? Given the data, what could quickly tip you off that the algorithm’s guess about the attribute was wrong? If you don’t understand the technical aspects of the problem, get help.
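Here is a tiny sketch, with made-up speed numbers, of why a 15-minute average can look healthy while the last few minutes tell a very different story:

```python
# Made-up per-minute average speeds (mph) on a route. A wreck happens at
# minute 13, so the last 3 minutes are slow, but the 15-minute average
# still looks like moving traffic.
speeds_last_15_min = [62, 61, 63, 60, 62, 64, 61, 63, 62, 60, 61, 63, 12, 8, 5]

fifteen_min_avg = sum(speeds_last_15_min) / len(speeds_last_15_min)
last_3_min_avg = sum(speeds_last_15_min[-3:]) / 3

print(f"15-minute average: {fifteen_min_avg:.0f} mph")  # ~51 mph: looks passable
print(f"Last 3 minutes:    {last_3_min_avg:.0f} mph")   # ~8 mph: a jam
```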
Second, the thing about which you are guessing. Often, but not always, the pattern recognition algorithm will take your data and the attribute you want to guess, and then select a thing for you. When the thing to reason about seems simple and obvious, you can be a victim of your own confirmation bias, or of an algorithm that didn’t have enough data to point you to a less obvious, but better, solution. At the other extreme, when the thing is not simple or obvious, the data may lead to guesses that are irrelevant to your actual problem.
As an example of missing the less obvious solution, let’s say you are lagging your competitors in luxury car sales, so you feed an algorithm data about your sales staff, your customers, and your sales rates. The algorithm may decide to guess about customer income because it finds that increased luxury car sales correlate with higher income. But if you only fed the algorithm your own company’s data, it would miss that, in reality, the biggest problem is your sales force far underperforming your competitors’ sales staff. The real problem is your sales force’s skill compared to your competitors’, but you’ve forced the algorithm to ignore that as a possible thing, so it is no surprise that it ends up guessing about customer income. Note that the data is correct and the algorithm made a correct guess: more affluent customers do mean more sales. However, by limiting the data you’ve shrunk the set of solutions the algorithm can consider and missed something that was an even bigger factor.
In another example of selecting the right thing to guess about, consider the opposite case, where there is no simple or obvious solution to your problem. Let’s say you are a CEO hired to turn around advertising sales for a social media company in a highly competitive environment. The media landscape is changing rapidly and you need all the help you can get, so you start feeding an artificial intelligence algorithm all the data about your company and the business environment that you can get your hands on. The pattern recognition algorithm quickly tells you to throw more congratulatory employee parties, since they are highly correlated with big advertising contracts. Oh boy, is the algorithm in trouble!
The correlation of congratulatory festivities with big sales is not causation, of course. After reflecting on this absurd result, you’re tempted to hard-wire the algorithm to only make guesses about things that preceded a sale being made. However, if you did that, the algorithm could miss a pattern of junior employees quitting about six months after selling successful ad campaigns. A competitor is poaching your staff! Note that it doesn’t matter whether it was the algorithm that selected congratulatory festivities to guess about or the HR manager who always considers employee morale first. In complicated problems, guesses get better with more data and more time to analyze it. The lesson is to let the algorithm work its pattern recognition magic by sorting through the mounds of data, but to take its guesses with a grain of salt until they prove relevant.
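One way to let the algorithm sort through the data without hard-wiring it is to test correlations at several time lags, not just at the moment of sale. A rough sketch, with invented monthly figures in which staff departures trail big ad sales by about six months:

```python
# Invented monthly figures: ad contracts closed and junior-staff departures.
# Shifting departures back in time shows the correlation peaking at a
# six-month lag: departures tend to follow big sales.
ad_sales   = [3, 9, 4, 10, 2, 8, 3, 9, 5, 11, 4, 9]   # contracts closed per month
departures = [1, 0, 1, 0, 1, 0, 2, 5, 2, 6, 1, 5]     # junior staff leaving per month

def correlation(xs, ys):
    """Pearson correlation of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)

# Compare departures against sales shifted by 0 to 6 months.
for lag in range(7):
    sales_slice = ad_sales[:len(ad_sales) - lag] if lag else ad_sales
    dep_slice = departures[lag:]
    print(f"lag {lag} months: correlation {correlation(sales_slice, dep_slice):.2f}")
```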
Lastly, consider the attribute you are asking the algorithm to guess. The algorithm is always tasked with optimizing something (i.e., finding the attribute value that most likely answers your question); that’s why they call it “pattern recognition.” When optimizing, two things can go wrong. Number one is the sensitivity of the optimization to small variations in the data, something similar to the butterfly effect. For example, let’s say you deploy a security system that uses facial recognition. You test it in Seattle for a whole year and it works fine. Then, when you install it in San Diego, it fails 50% of the time. The reason: there are sun-shadow angles in San Diego that never occur in Seattle. This is similar to not having the right data to begin with, but harder to avoid. Lighting is an obvious factor to consider in facial recognition; however, exploring every lighting permutation is difficult. The lesson is to ask the same question while varying the input data to see how sensitive the answers are to changes. You’ll never cover every possible variation, but more is better.
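A rough way to probe that sensitivity is to re-ask the same question many times while jittering the inputs and counting how often the answer flips. In the sketch below the “model” is just a stand-in threshold rule, not a real face recognizer, but the probing idea carries over:

```python
import random

# Stand-in for a trained model: a simple threshold rule on one input value.
# A real face recognizer is far more complex, but the probing idea is the same.
def model(brightness):
    return "match" if brightness > 0.5 else "no match"

random.seed(0)
baseline = 0.52                       # a test image the model classifies as "match"
trials = 1000
flips = 0
for _ in range(trials):
    jitter = random.gauss(0, 0.05)    # small lighting variation
    if model(baseline + jitter) != model(baseline):
        flips += 1

print(f"Answer changed on {flips / trials:.0%} of slightly varied inputs")
```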
The other thing that can go wrong concerns how long your question stays relevant. In other words, how does the problem change over time? For example, let’s say you have a delivery company and you ask an algorithm to schedule your drivers and their routes to maximize morale and package drop-offs. You put in the predicted customer orders (AI is very good at guessing these), the drivers’ desired schedules, and live local traffic. The algorithm chugs for a few minutes and out come a schedule and a set of routes. This works for several weeks with few hiccups, until one day half the drivers are calling the manager to complain about traffic and about working overtime to drop off all the packages. What happened? School started.
Note that in this last case, the data was correct when it was input, but then incorrect once school started. So, isn’t this a data problem? No, because any change in conditions could make the original question irrelevant. Gas prices may rise so far that you have to reason about fuel ahead of morale and drop-off rates just to stay in business; that would be a change in the most relevant thing. The data, the thing(s), and the attribute are always related, and tracking how each of them changes over time ensures you can address changes in the data, the complexity (i.e., dimensionality) of the problem, and the requirements together.
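A practical hedge is to keep scoring the algorithm’s guesses against what actually happened and to flag when recent errors drift away from the historical baseline. A minimal sketch with invented daily delivery-error numbers:

```python
# Invented daily prediction errors (average minutes late per package).
# The schedule works for a few weeks, then school starts and errors jump.
daily_errors = [3, 4, 2, 5, 3, 4, 3, 2, 4, 3, 3, 4,   # summer weeks
                11, 14, 12, 15, 13]                    # after school starts

baseline = sum(daily_errors[:12]) / 12        # error level the schedule was built for
recent_window = daily_errors[-5:]             # most recent week of results
recent = sum(recent_window) / len(recent_window)

if recent > 2 * baseline:
    print(f"Drift alert: recent error {recent:.1f} vs baseline {baseline:.1f} - time to re-run")
else:
    print("Schedule still tracking conditions")
```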
In this last case of school starting, you could let the algorithm chug again for a short while with new data, and the new output should be fine. However, algorithms can sometimes take days to “retrain” and make new guesses when conditions change. In fields where change is constant and time is of the essence, like stock trading, the algorithms are essentially retraining continuously. That takes a ton of computing power, data storage, and bandwidth to keep the data and the analysis up to date. The lesson is to understand what resources you need to keep your algorithms current and the consequences if they fall behind. Posting new driving schedules a day late carries far less regret than a client losing a billion dollars over a two-second delay.
In summary, three things are always part of the pattern recognition algorithms that underlie smart things: the data, the thing(s) to guess about, and the attribute of that thing(s). The algorithms use the data to find the most likely answer (guess) to your question. Remember, it is a garbage-in, garbage-out process, and even when no garbage goes in, a true guess from the algorithm doesn’t mean it has solved your problem: the guess about the attribute could be correct, but about a thing that is irrelevant to your question. Second, mind how complicated the problem is. If it seems simple but isn’t, the algorithm might just confirm your biases about its simplicity. If it seems complicated, always do a sanity check on the algorithm’s guess. However, complex problems are where pattern recognition does lead to those “magically” correct guesses, so make sure there is enough data and you give the algorithm enough time. Lastly, problems always change with time, so the right pattern to recognize will change too, and you’ll have to make sure the guesses stay current.
You can now use what you have learned to test those “AI-enabled” and “Smart” marketing folks. Can they explain whether the right data, the right questions, and the right pace of change are incorporated into the fancy devices they are trying to sell you?