How Natural Language Processing is Shaping the Future of Search
Natasha Ellard
The world of search is always evolving. That's how Google has managed to maintain its dominant position as the planet's search engine of choice. Search engines that failed to innovate lost ground, which is why today we find ourselves in a landscape with plenty of choice but only one true player, with Microsoft's Bing a very distant second (its global share remains below 9%).
As search evolves, different technologies are involved in serving us the results we crave. One such technology that has exploded in value and popularity in recent years is natural language processing (NLP), which is used in some form by all the major search engines and also appears in a whole host of everyday environments, even those where we're not staring at a screen.
NLP is everywhere, but just how advanced are the machines right now, where did it all begin, and what's next?
So What is NLP and How Does it Work?
In basic terms, NLP is a way for machines to process user queries so that they can understand and react to the natural language we humans use when interacting with one another.
We see it commonly used in AI-powered voice assistants like Alexa and Google Assistant because we want these devices to support us in a manner that mimics a human assistant. Whilst we've become accustomed to typing out commands to computers in their own language, with hands-free voice commands we want the machines to communicate with us on our terms.
It works by converting the words we use in natural language into machine-readable language, and it has become more and more reliable over many years of processing and interpreting the way humans interact with one another. As computer processing power has improved, the ability to "learn" how people talk to one another has accelerated dramatically. Even so, correctly identifying context and meaning from a given voice instruction still requires a phenomenal amount of computing power, which is why our Alexa devices or phone assistants won't work without an internet connection. The devices themselves can't compute the interpretation quickly enough, so the query is sent to huge data centres to be processed, and the response is sent back to your device, seamlessly.
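To make the idea of "converting natural language into machine-readable language" concrete, here is a deliberately toy sketch of the first steps such a system might take: normalise the text, split it into tokens, strip filler words, and match what remains against known intents. The stopword list, intent table, and function names are all hypothetical illustrations, not any real search engine's or assistant's pipeline (real systems use statistical models, not hand-written rules).

```python
import re

# Hypothetical filler words to discard when reducing a query to its keywords.
STOPWORDS = {"the", "is", "a", "an", "of", "please", "hey", "what", "who",
             "how", "current", "in", "on"}

# Hypothetical intent table: if all words in a pattern appear in the query's
# keywords, we guess that intent.
INTENTS = {
    frozenset({"weather"}): "get_weather",
    frozenset({"president", "switzerland"}): "lookup_fact",
    frozenset({"turn", "lights"}): "smart_home",
}

def parse_query(utterance: str) -> dict:
    """Reduce a natural-language utterance to keywords and a guessed intent."""
    tokens = re.findall(r"[a-z]+", utterance.lower())   # normalise + tokenise
    keywords = {t for t in tokens if t not in STOPWORDS}
    for pattern, intent in INTENTS.items():
        if pattern <= keywords:  # every pattern word is present
            return {"intent": intent, "keywords": sorted(keywords)}
    return {"intent": "web_search", "keywords": sorted(keywords)}

print(parse_query("Who is the current president of Switzerland?"))
# → {'intent': 'lookup_fact', 'keywords': ['president', 'switzerland']}
```

Notice that the question and the bare keywords "Swiss President" collapse to roughly the same machine-readable structure, which is exactly why the heavy lifting in real systems is inferring intent, not just stripping words.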
This is beneficial because it feels more natural to ask a question of a device as you would of another person, instead of adapting the way we talk to spit out combinations of words we think will yield the desired results. To a search engine, the term "Swiss President" would be processed as a query about the identity of the current president of Switzerland, but we'd never bark "Swiss President" at another person and expect a coherent response. "Who is the current president of Switzerland?" is how you might pose this question to a friend or colleague, and it's also the best way to get the desired answer from your digital assistant (at the time of writing it's Ignazio Cassis, in case you were wondering).
Where and When Did NLP Start?
Being able to process natural language is certainly not a new phenomenon but the technology that underpins it has had to catch up with the demand. Sci-fi stories from as far back as the 1920s would feature humans conversing with computers long before we'd even mastered getting them into our homes!
Early attempts at artificial intelligence and voice recognition software were pretty laughable, and it wasn't until Apple unveiled its Siri digital assistant embedded in iPhones back in 2011 that the wider world began to take the idea of interacting naturally with computers more seriously. Of course, being first to market has been both a blessing and a curse for Siri, which can be proud of kicking off the era of home AI integration but is also commonly considered the least reliable of the main digital assistants on the market today. Amazon's Alexa came along in 2014 and put Siri to shame, whilst Google's blandly named Google Assistant followed 18 months later and improved further still on the groundwork laid by the earlier assistants.
Somewhere along the line Microsoft unveiled their own AI powered digital assistant branded Cortana but it never really took off and the company appears to have shifted away from trying to rival the Amazon, Apple and Google powered equivalents. The Cortana app has been discontinued and its integration with Windows has been scaled back in recent editions.
NLP in Search
The digital assistant boom is the most obvious example of where NLP is popularly used today but these devices provide functionality way beyond serving search queries, from scheduling appointments and setting alarms to controlling devices in the home such as lights and heating. They'll perform searches for you too, but if we want to analyse NLP's place in search alone we need to go back further.
Perhaps the earliest example of an NLP-powered search engine is MIT's START, which has been online since 1993 (predating Google by five years). It bills itself as "the world's first web-based question answering system", and its homepage lists a selection of example queries you may wish to ask it, such as "show me a map of Denmark" and "How is the weather in Boston today?" The fact that these queries could be served more quickly by simply typing "Denmark" or "Boston weather" into Google highlights how far machine learning has come in serving our queries efficiently.
However, there are still some notable differences between START and modern-day search engines. Primarily, START seeks to serve up answers to specific questions and nothing else. If you feed it a query it understands, it responds with the black-and-white answer, without suggesting anywhere for further reading. One example is the input "What cities are within 250 miles of the capital of Italy?", to which it responds with a list of eight Italian cities along with each one's distance from Rome (which it also correctly determined is Italy's capital). Try the same query in Google and it returns 322 million results, but these are all links to websites featuring statistical information, and the factual answer to the question is nowhere to be found.
Another early example of an NLP-focused search engine is Ask Jeeves, now known as ask.com. Launched just before Google, it popularised the idea of asking search engines questions in natural language, but over time it lost ground to Google and abandoned its own web crawling in 2010 to focus solely on answering straightforward questions. Its search engine functionality is now outsourced (most likely to Google, although this has never been confirmed) and its question-answering ability now seems pretty limited. In fact, the input "how many ounces in a pound" returned a page of 10 search ads sandwiching a selection of links to sites that may or may not address the query. The same query fed to Google returns the correct answer immediately.
NLP in Search Today
As already discussed, it's easy to find examples of NLP being used to serve search queries in Google but there's still a big difference between the functionality of your Google Assistant and Google search on your desktop or mobile. In part this is because most people interact differently with Google when they're typing into its search bar as opposed to when they speak to it.
Consider the following example. Ask your Google Assistant "how long do we have until Christmas" and it will give you the number of days remaining before 25th December. If you follow up by asking "what day of the week will that be?", it will tell you that the 25th of December falls on a Sunday this year. The second query makes no sense by itself, but in the context of the first, Google quickly understands what information you're after.
Let's try the same thing on Google's website. The first query returns the same answer, but following it with a question about the day of the week does not yield the contextually appropriate response. Google instead determines that you want to know what day of the week it is today. The two queries are treated as completely independent of one another.
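The assistant's trick above is simply carrying state between turns: "that" in the follow-up must be bound to the date the first answer produced. Here is a minimal, hypothetical sketch of that idea (the class and method names are invented for illustration, and this is not how Google Assistant is actually implemented):

```python
import datetime

class Assistant:
    """Toy conversational agent that remembers the last date it mentioned."""

    def __init__(self, today: datetime.date):
        self.today = today
        self.last_date = None  # conversational context carried between turns

    def days_until_christmas(self) -> int:
        christmas = datetime.date(self.today.year, 12, 25)
        if christmas < self.today:                       # already passed?
            christmas = datetime.date(self.today.year + 1, 12, 25)
        self.last_date = christmas                       # remember for follow-ups
        return (christmas - self.today).days

    def day_of_week_of_that(self) -> str:
        # Resolve "that" against stored context rather than defaulting to today,
        # which is effectively what desktop search does when context is dropped.
        target = self.last_date or self.today
        return target.strftime("%A")

a = Assistant(datetime.date(2022, 11, 1))
print(a.days_until_christmas())   # → 54
print(a.day_of_week_of_that())    # → Sunday (25 Dec 2022 falls on a Sunday)
```

If `last_date` were never stored, the follow-up would fall back to today's date, which is precisely the "completely independent queries" behaviour described above.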
But does Google search need to process queries like this in the same fashion as its voice assistant? After all, we've been conditioned to ask our search engines questions using keywords rather than forming sentences. Just typing "Christmas Day" will quickly return the day of the week Christmas falls on this year, and that's considerably less effort than phrasing it as a question in natural language. But barking "Christmas Day" at your Alexa will simply send it into a panic. There's zero context to the query, and Alexa needs to understand your intent, whereas search engines are better able to determine it from minimal input.
NLP Into the Future
Google, Bing et al. will continue honing how they deal with queries presented in natural language, and as younger generations become used to conversing directly with their devices, instead of pandering to the way machines have traditionally best understood us, we will likely see far more integration of NLP across the wider web.
One example is andisearch.com which claims to be the "next generation" of search, using an AI assistant to answer complex questions in a conversational fashion. So effectively it's behaving like your digital voice assistant but with a text input interface. Nothing too radical there. However where it might have a leg up on the competition is in its privacy credentials. By pledging to be free from ads and tracking, it could be seen as a more intelligent rival to DuckDuckGo or Mojeek.
That notion of intelligence, however, is called into question by searching "who is the US president" in Andi, to which it responds "the president of the United States is the head of state and head of government of the United States of America". Technically correct, but it doesn't actually answer the question of who currently holds the role. If you follow up with "how tall is he" (as you might in a genuine conversation), it comes back with information on the Gateway Arch in St Louis, Missouri. A work in progress, then.
Do we need to get used to conversing with machines as we do with one another? Perhaps. But for now, typed inputs work best in the fashion we're used to: dumping the fewest possible terms into a search bar, with no effort to speak as we would with other people.