Soon, we will have a bot for everything

Imagine communicating with machines by writing or speaking in our natural language. What might once have been fiction has become a reality with the emergence of conversational applications, or chatbots. And soon, we will have a bot for everything.

“By 2020, autonomous software agents outside of human control will participate in five percent of all economic transactions.” – Gartner

2016 saw an eruption of chatbots and conversational systems, a disruption driven by the simultaneous growth of messaging platforms and by progress in, and easier access to, artificial intelligence (not to mention APIs). In this article, we will look at how a chatbot works, simplify the concepts surrounding it, and hopefully inspire you to build bots of your own.

How does a Chatbot work?

Let’s first understand the difference between traditional and conversational applications. A traditional application such as a mobile app or website works in a point-and-click fashion. Its interfaces are built on blocks of elements with which users can interact via limited actions (e.g., click, type, touch, or swipe). This arrangement is extremely convenient and efficient for a computer as there are finite interaction points, often in a sequence. Developers can, therefore, write code for each finite set of interactions very quickly.

That being said, there are also some challenges presented by traditional applications. First, the user must understand the flow required to get the work done. While many flows are commonly used and seemingly simple, specific business domains might necessitate user training. Second, if additional requirements get added, then new user interface (UI) elements and interactions must be introduced.

Conversational applications, on the other hand, take commands from the user in natural language. The example illustrated below is a simplified, multi-turn chatbot interaction for buying groceries.

While many may argue that chatbot interaction may be cumbersome as users have to type what they want instead of simply clicking a few times, the statistics on messaging platforms say otherwise.

“Users around the world are logging in to messaging apps to not only chat with friends but also to connect with brands, browse merchandise, and watch content. What were once simple services for exchanging messages, pictures, videos, and GIFs have evolved into expansive ecosystems with their own developers, apps, and APIs.” – Business Insider

We haven’t quite made the full switch from traditional to conversational applications. For now, while speech recognition and natural language understanding continue to evolve, we will see many hybrid interfaces that make the best of both worlds.

How do we make an application conversational?

A conversational application's primary aim is to translate natural language into user intent. The intent, in this context, is the command the user intends to execute. The conversational app can either be rule-based or actions-based.
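
As a rough, illustrative sketch (my own, not from any specific product), a rule-based approach can be as simple as a hand-written mapping from keyword patterns to intents; the actions-based approach described next replaces such rules with learned NLP models. The patterns and intent names below are hypothetical.

import re

# A minimal rule-based intent resolver: hand-written patterns map phrases to intents.
RULES = [
    (re.compile(r"\b(hi|hello|hey)\b", re.I), "salutation"),
    (re.compile(r"\b(buy|order|purchase)\b", re.I), "place_order"),
    (re.compile(r"\bcancel\b", re.I), "cancel_order"),
]

def resolve_intent(utterance):
    """Return the intent of the first matching rule, or a fallback."""
    for pattern, intent in RULES:
        if pattern.search(utterance):
            return intent
    return "unknown"

print(resolve_intent("Hello, I want to order some milk"))  # -> "salutation" (first matching rule wins)

The weakness is visible immediately: a greeting and an order in the same sentence are resolved purely by rule order, which is the kind of brittleness the actions-based approach tries to overcome.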

Actions-Based

Natural language processing (NLP) is the class of artificial intelligence (AI) algorithms that enables a computer to understand human language and process commands. Actions-based conversational applications use NLP to convert human language into the bits and bytes a machine can act on.

Consider the lifecycles of human beings. Children are taught by feeding them information. As they grow, their interactions with the environment continue to develop their intelligence. NLP works in a similar fashion. Initialized with a set of training data, the AI builds upon its learnings via usage and interaction. 

Breaking down a Chatbot

From this point forward, we will focus on the application of natural language processing in a chatbot and the key concepts applied by current bot platforms and software development kits (SDKs). The objective is to become aware of the ecosystem and quickly start building chatbots of your own.

Conversational Channels

Conversational channels are like the eyes, ears, and mouth of a chatbot. The most common conversation interfaces are currently text and voice as they allow interaction via natural language. Also, chat platforms have become popular mostly through our mobile devices and desktops, which are ideal for text interfaces. Therefore, while multiple interfaces in addition to voice and text exist, we will focus primarily on these two.

Text-Based Channels

These are simple chat platforms that allow you to communicate with bots via text, which their NLP algorithms can directly consume. These interfaces can be completely custom-built as mobile, desktop, or web applications, or they can be integrated with existing messaging platforms. A few key examples of text-based messaging platforms include WhatsApp, Slack, Tropo, Line, Kik, and Facebook Messenger. These chat platforms provide web-based API hooks for transmitting the text received via their chat interfaces to a chatbot service. If an existing platform serves as the interface, the chatbot itself can be developed and exposed as an API that the platform calls, as sketched below.
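
As an illustration of that hand-off (my own sketch, not any platform's actual payload format), a chatbot exposed as a web API might receive the platform's webhook calls like this. The route path, payload field, and chatbot_reply helper are hypothetical placeholders.

from flask import Flask, request, jsonify

app = Flask(__name__)

def chatbot_reply(text):
    """Hypothetical stand-in for the chatbot core described later in this article."""
    return f"You said: {text}"

# The messaging platform would be configured to POST each incoming message to this URL.
@app.route("/webhook", methods=["POST"])
def webhook():
    payload = request.get_json(force=True)   # the exact payload shape varies by platform
    user_text = payload.get("message", "")   # hypothetical field name
    return jsonify({"reply": chatbot_reply(user_text)})

if __name__ == "__main__":
    app.run(port=5000)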

Voice-Based Channels

Voice or speech interfaces like the Amazon Echo allow users to converse with bots by simply speaking. Since the chatbot only understands communication in text format, an additional interface is needed to convert speech into text (and text back into speech when the chatbot responds). A few notable platforms that provide speech-to-text and text-to-speech services include Google’s Speech API, Amazon’s Voice Service, IBM’s Watson Speech API, Microsoft’s Azure Speech API, and API.AI.
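
The end-to-end flow can be sketched as a simple pipeline: audio in, text to the bot, text back out, audio to the user. The speech_to_text and text_to_speech functions below are placeholders for whichever of the services above is actually chosen; this illustrates the shape of the pipeline, not any vendor's API.

def speech_to_text(audio_bytes):
    """Placeholder: call the chosen speech-to-text service here."""
    raise NotImplementedError

def text_to_speech(text):
    """Placeholder: call the chosen text-to-speech service here."""
    raise NotImplementedError

def chatbot_reply(text):
    """Placeholder for the chatbot core described in the next section."""
    return "Sure, I can help with that."

def handle_voice_request(audio_bytes):
    user_text = speech_to_text(audio_bytes)   # 1. speech becomes text the NLP layer can consume
    reply_text = chatbot_reply(user_text)     # 2. the chatbot core works out intent and replies in text
    return text_to_speech(reply_text)         # 3. the reply is synthesized back into speech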

The Chatbot core

Let’s dissect a chatbot’s inner workings. How does it understand language, intelligently process commands, and respond as naturally (read: as humanly) as possible? Every time a user communicates with the chatbot, he/she has the “intent” of asking a question or giving a command. Natural language processing (and the algorithms supporting it) is responsible for figuring out that intent based on the inputs the chatbot receives.

General English Language

Chatbots can be taught general, spoken English (or any other language) by giving them predefined training data. For example, “hello,” “greetings,” or “hi” are all understood as an intent of “salutation.”
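
A minimal sketch of this idea, assuming a scikit-learn-style classifier and a handful of hand-made training phrases (both my own illustration, not what any particular bot platform ships with):

from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Tiny, illustrative training set: phrases labelled with the intent they express.
training_phrases = ["hello", "hi there", "greetings", "good morning",
                    "bye", "goodbye", "see you later", "talk to you soon"]
training_intents = ["salutation"] * 4 + ["farewell"] * 4

# Bag-of-words features plus Naive Bayes is enough to show the principle.
model = make_pipeline(CountVectorizer(), MultinomialNB())
model.fit(training_phrases, training_intents)

print(model.predict(["hey, good morning"])[0])   # expected: "salutation"
print(model.predict(["ok bye for now"])[0])      # expected: "farewell"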

Domain Specific Language

Unlike generic vocabulary, vocabulary specific to a business can be interpreted differently. Let’s take “I am planning to travel to New York” as an example. The phrase can be interpreted by an airline service as the intent to book a “flight,” while a hotel would interpret it as intending to book a “hotel room.”

Ideally, we would want a chatbot to be very open-ended and have conversations with much wider contexts. Since these kinds of open domain bots are quite complex, most of the bots today are dedicated to specific businesses or domains.

Let’s assume that the chatbot is focused on the “airline domain” and is connected to the business API of the airline’s booking system online. Based on the “travel” verb, the chatbot understands that the user intends to travel and knows that it needs to call the API “searchTravelOptions” before it can book any flights. 

To make a successful API call, the chatbot also needs to identify the parameters required to complete the operation. NLP applies a concept of named entity recognition, which enables the bot to associate the parameters with known information like place, time, date, etc. For example, “New York” can be associated with either “Destination” or “Source” based on the named entity training data with which the chatbot is pre-loaded. Similarly, if the user had given a date, then it could be associated with either “Travel Date” or “Return Date.” To evaluate these possibilities, the chatbot uses prepositions such as "from" or "to" to accurately identify an entity. For example, “from location” signifies the source, while “to location” signifies the destination.
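
The preposition-based disambiguation described above can be sketched with plain pattern matching. Real bot platforms train statistical named entity recognizers instead; the gazetteer, regular expressions, and slot names here are purely illustrative.

import re

# Illustrative gazetteer of known place entities.
KNOWN_PLACES = ["New York", "London", "Paris", "San Francisco"]
PLACE_PATTERN = "|".join(re.escape(p) for p in KNOWN_PLACES)

def extract_travel_entities(utterance):
    """Use the prepositions 'from' and 'to' to decide whether a place is the source or the destination."""
    entities = {}
    from_match = re.search(rf"\bfrom\s+({PLACE_PATTERN})", utterance, re.I)
    to_match = re.search(rf"\bto\s+({PLACE_PATTERN})", utterance, re.I)
    if from_match:
        entities["source"] = from_match.group(1)
    if to_match:
        entities["destination"] = to_match.group(1)
    return entities

print(extract_travel_entities("I am planning to travel from London to New York"))
# -> {'source': 'London', 'destination': 'New York'}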

Dialogue

In the aforementioned example, not all entities are provided to the bot in a single sentence. Bot platforms are, therefore, equipped to construct “dialogues,” or series of exchanges, in order to complete a process. As you can see below, the chatbot continues the dialogue with the user until all the information necessary for a “searchTravelOptions” API call is gathered.
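
A minimal sketch of such a dialogue loop, reusing the illustrative extract_travel_entities helper above and the hypothetical searchTravelOptions call from earlier: the bot keeps asking follow-up questions until every required slot has a value.

# Each required slot is paired with the follow-up question used to fill it.
REQUIRED_SLOTS = {
    "source": "Where will you be travelling from?",
    "destination": "Where would you like to go?",
    "travel_date": "What date would you like to travel on?",
}

def run_travel_dialogue(extract_entities, slots=None):
    """Ask follow-up questions until all parameters for searchTravelOptions are gathered."""
    slots = dict(slots or {})
    while True:
        missing = [name for name in REQUIRED_SLOTS if name not in slots]
        if not missing:
            break
        answer = input(f"Bot: {REQUIRED_SLOTS[missing[0]]}\nYou: ")
        extracted = extract_entities(answer)
        if missing[0] not in extracted:
            extracted = {missing[0]: answer.strip()}   # fall back to the raw answer for this slot
        slots.update(extracted)
    print(f"Bot: Searching travel options for {slots} ...")
    # searchTravelOptions(**slots)   # hypothetical business API call
    return slots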

Context

In the example conversation above, the chatbot is aware of the user’s current location based on the mobile device’s GPS and assumes it as the source. The chatbot can use information gathered from mobile devices to determine physical context (like location, speed, etc.), and it can also use saved user data (such as class preference) to determine domain context. Context awareness and the ability to derive entity information make a chatbot more aware and human.
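
Continuing the sketch above, context can be modelled as slot values the bot already knows before the dialogue even starts, so it only asks about what is still missing. The context sources shown are illustrative.

def gather_context(user_profile, device_location=None):
    """Pre-fill slots from context the bot already has, so it asks fewer questions."""
    known = {}
    if device_location:                         # physical context, e.g. from the phone's GPS
        known["source"] = device_location
    if "preferred_class" in user_profile:       # domain context from saved user preferences
        known["travel_class"] = user_profile["preferred_class"]
    return known

# e.g. run_travel_dialogue(extract_travel_entities, slots=gather_context(profile, gps_city))
print(gather_context({"preferred_class": "economy"}, "San Francisco"))
# -> {'source': 'San Francisco', 'travel_class': 'economy'}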

Unsupervised & Supervised Learning

The identification of intent and entities enables the chatbot to know which API to call, what data to fetch, and which parameters to pass. The pre-loading and classification of Common Vocabulary, Domain Specific Vocabulary, Named Entities, and Domain Specific Entities can, therefore, be deemed the chatbot’s unsupervised “learning” processes.

However, there will be many instances when the chatbot will not be able to accurately translate a user’s phrase into intent. For this, all bot platforms allow developers to review missed translations and manually label these phrases with their appropriate intents. With this process, the chatbots learn from their mistakes or “lack of knowledge” in a supervised environment. In the example below, the bot cannot associate “whazzup” with any intent and has asked the developer to assign it the appropriate one.
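
A minimal sketch of that supervised feedback loop, assuming the intent classifier also reports a confidence score: low-confidence utterances are queued for human review, and once a developer labels them they join the training data for the next retraining run. The threshold and helper names are illustrative.

review_queue = []      # utterances the bot could not confidently map to an intent
training_data = []     # (phrase, intent) pairs used to (re)train the model

def handle_utterance(text, classify, threshold=0.6):
    """Route confident predictions onward; queue everything else for human review."""
    intent, confidence = classify(text)
    if confidence < threshold:
        review_queue.append(text)      # e.g. "whazzup" would end up here
        return "fallback"
    return intent

def label_reviewed_utterance(text, intent):
    """A developer assigns the correct intent; the example joins the training set."""
    if text in review_queue:
        review_queue.remove(text)
    training_data.append((text, intent))
    # retrain_model(training_data)     # hypothetical retraining step

label_reviewed_utterance("whazzup", "salutation")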

How does a Chatbot respond?

Now that we have discussed how a chatbot processes natural language, let’s discuss how chatbots respond to questions or commands in a manner as close to a natural, human response as possible. There are multiple algorithms and models that allow a chatbot to determine its responses, but we will touch on only one approach: the retrieval-based model.

Retrieval-Based Model

A predominant approach because it is easy to implement, the retrieval-based model pairs a command or question with a predefined response. The response can be static or selected from a predefined set of responses based on rules or persona information (that of the user interacting with the chatbot). While well-chosen canned responses can make a bot seem smart, the truth is that its replies are limited to a finite set.
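
A minimal sketch of a retrieval-based responder: each intent maps to a small set of canned replies, and a simple rule (here, a stored persona preference) picks which one to use. The intents, replies, and persona field are all illustrative.

import random

# Canned responses per intent, with a formal and a casual variant of each.
RESPONSES = {
    "salutation": {"formal": ["Good day. How may I assist you?"],
                   "casual": ["Hey! What can I do for you?", "Hi there!"]},
    "farewell":   {"formal": ["Thank you for contacting us. Goodbye."],
                   "casual": ["Catch you later!", "Bye!"]},
}

def respond(intent, persona):
    """Pick a predefined reply for the intent, styled by the user's persona information."""
    style = "casual" if persona.get("tone") == "casual" else "formal"
    options = RESPONSES.get(intent, {}).get(style, ["Sorry, I did not understand that."])
    return random.choice(options)

print(respond("salutation", {"tone": "casual"}))   # e.g. "Hey! What can I do for you?"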

So, how can we make chatbots more perceptive? The more context a chatbot has, the more intelligent it can become. The chatbot can begin to select responses based on the user’s mood, physical context, or linguistic context. Services like IBM’s Watson Tone Analyzer and Personality Insights can be used to gather this user context and change the style or flow of the dialogue accordingly.

Let’s build Chatbots!

There is an enormous list of available chatbot ecosystems and platforms, along with many tutorials that can help those looking to build chatbots. These platforms are very simple and easy to use, and do not require vast amounts of artificial intelligence knowledge. 

The chatbot and machine intelligence space is expanding at a rapid pace and has to be taken seriously by organizations and individuals alike. In the near future, AI and machine learning will shift from being the domain of a closed community to touching every sphere of our lives. Being aware of this space will be as important as knowing how to operate a smartphone.
