Developing conversation apps using Actions on Google and Dialogflow
Saravanan Gnanaguru
Founder - CloudEngine Labs | Chief Technology Evangelist | HashiCorp Ambassador | AWS Community Builder | DevOps Cloud SRE Practitioner | Tech Blogger | Speaker | Mentor | Platform Engineering Expert | AWS | Azure | GCP
Introduction
This article is intended for developers looking to start building chat-bot apps using Actions on Google.
What is Actions on Google?
Google Actions, Google assistant and Google Home
Actions on Google are chat-bot applications developed for Google Assistant. They can be triggered simply by saying the chat-bot application's "invocation phrase".
Google Assistant is freely available on Android smartphones running Android OS 6 and above. It is also available for Apple devices running iOS 11 and above.
In addition, Google Home devices ship with the Google Assistant engine built in, so the chat-bot applications we develop can be invoked on Google Home devices as well.
Thus, Google Actions also serve as applications for Google Home smart-home devices.
Any developer can come up with a unique idea for a chat-bot application and build it using the Actions on Google console and the Dialogflow console.
Components of Actions on Google and Dialogflow
Creating voice applications with Dialogflow involves the components and concepts below.
Invocation Phrase
The invocation phrase is an important component when creating voice chat-bots. The phrase needs to be unique, since it is what triggers the conversation with your voice application.
There are two types of invocation: explicit and implicit.
Explicit Invocation
This is the simplest type of invocation: the user explicitly tells Google Assistant to use your Action by name.
Let us take the example of an Action named "Computer Facts".
Trigger Phrase
Google Assistant Actions can be triggered using a phrase such as:
"OK Google, talk to ..."
"Hey Google, speak to ..."
followed by the invocation phrase (for example, "Computer Facts"). This is an explicit invocation.
Implicit Invocation
Every voice chat-bot serves a purpose. For example, if you create a bot for ordering food, chances are the user will ask:
"OK Google, order me a pizza"
Or
"OK Google, let's order a healthy salad"
Asking the Assistant for a specific task without naming the bot, as in this case, is called "implicit invocation".
Google has a developer platform called Dialogflow. It helps in creating more efficient and interesting voice Action applications.
Dialogflow and its Components
Intents
Intents are the placeholders for the entire conversation. We can have any number of meaningful intents in a conversation, and intents can be linked via contexts.
In Dialog-flow, the basic flow of conversation involves these steps:
1. The user giving input
2. Your Dialogflow agent parsing that input
3. Your agent returning a response to the user
To define how conversations work, you create intents in your agent that map user input to responses. In each intent, you define examples of user utterances that can trigger the intent, what to extract from the utterance, and how to respond.
The intent is the major component of a Dialogflow conversation between the user and the chatbot.
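To make the three-step flow above concrete, the sketch below simulates, in plain Node.js outside of Dialogflow, how training phrases map user input to an intent and its response. The intent names, phrases, and exact-match logic are illustrative only; Dialogflow's real natural-language matching is machine-learned, not string comparison.

```javascript
// Simplified illustration of intent matching. Dialogflow's actual NLU is far
// more sophisticated; this sketch only mirrors the conceptual flow:
// user input -> parse/match -> response.
const intents = [
  {
    name: 'order.coffee',
    trainingPhrases: ['order me a coffee', 'i want a coffee', 'get me a latte'],
    response: 'Sure, one coffee coming up!'
  },
  {
    name: 'order.pizza',
    trainingPhrases: ['order me a pizza', 'i want a pizza'],
    response: 'Okay, ordering a pizza for you.'
  }
];

// Step 1: the user gives input. Step 2: the agent parses (matches) it.
// Step 3: the agent returns the matched intent's response.
function matchIntent(userInput) {
  const normalized = userInput.trim().toLowerCase();
  for (const intent of intents) {
    if (intent.trainingPhrases.includes(normalized)) {
      return { intent: intent.name, response: intent.response };
    }
  }
  // No match: fall back, as Dialogflow's Default Fallback Intent would.
  return { intent: 'fallback', response: "Sorry, I didn't get that." };
}

console.log(matchIntent('Order me a coffee').response);
```

Note how unmatched input falls through to a fallback, which is the role Dialogflow's Default Fallback Intent plays in a real agent.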
Training phrases (Sample utterances in Alexa skill)
Training phrases are the utterances users speak to the chat-bot, for example, asking a chat-bot to order a coffee.
As application developers, we need to add as many training phrases as possible to increase the chance of our application being discovered when a user says, "Order me a coffee."
The reason is that many similar applications serve the purpose of ordering a coffee, so our training phrases need to be distinctive to improve the chance of our application being discovered.
These training phrases are what enable "implicit invocations".
Responses
Responses are the replies given by the chat-bot when a user asks the application to perform a specific task.
As the developer of the application, we need to think of the most likely combinations of chat-bot responses to user queries.
Google Dialogflow supports adding both static responses and dynamic responses.
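As a point of reference, when a webhook returns a dynamic response, the Dialogflow v2 API expects a JSON body whose `fulfillmentText` field carries the reply. The helper below is a hypothetical sketch of building such a body; only the `fulfillmentText` field name comes from the Dialogflow webhook response format, and the greeting logic is invented for illustration.

```javascript
// Build a Dialogflow v2 webhook response body with a dynamic greeting.
// The hour is hard-coded so the example is deterministic; a real webhook
// would compute it (or look data up from a database) at request time.
function buildResponse(userName) {
  const hour = 10; // pretend "current hour" for a reproducible example
  const greeting = hour < 12 ? 'Good morning' : 'Hello';
  return {
    fulfillmentText: `${greeting}, ${userName}! Your order is confirmed.`
  };
}

console.log(JSON.stringify(buildResponse('Alex')));
```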
Contexts
Contexts represent the current state of a user's request and allow your agent to carry information from one intent to another. You can use combinations of input and output contexts to control the conversational path the user takes through your dialog.
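In a webhook response, output contexts appear under an `outputContexts` array, each entry carrying a session-scoped name, a `lifespanCount`, and optional `parameters` that later intents can read. The sketch below shows the shape; the project and session IDs and the `order-followup` context name are placeholders, not values from any real agent.

```javascript
// Attach an output context to a webhook response so a later intent in the
// same session can read the order details. IDs below are placeholders.
function responseWithContext(session, orderItem) {
  return {
    fulfillmentText: `Added ${orderItem}. Anything else?`,
    outputContexts: [
      {
        // Context names are scoped to the session:
        // projects/<project>/agent/sessions/<session>/contexts/<name>
        name: `${session}/contexts/order-followup`,
        lifespanCount: 5, // context stays active for the next 5 turns
        parameters: { item: orderItem }
      }
    ]
  };
}

const res = responseWithContext(
  'projects/my-project/agent/sessions/abc123', 'a large pizza');
console.log(res.outputContexts[0].name);
```

An intent configured with `order-followup` as an input context would then only match while this context is alive, which is how contexts steer the conversational path.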
Entities (Slots in Alexa skill)
Entities are Dialogflow's mechanism for identifying and extracting useful data from natural language inputs.
While intents allow your agent to understand the motivation behind a particular user input, entities are used to pick out specific pieces of information that your users mention — anything from street addresses to product names or amounts with units. Any important data you want to get from a user's request will have a corresponding entity.
Below are the types of entities:
1. System Entities
Dialogflow is equipped with numerous system entities, which allow agents to extract information about a wide range of concepts without any additional configuration. For example, system entities are available for extracting dates, times, and locations from natural language inputs.
2. Developer Entities
If you need to extract information about concepts beyond those covered by Dialogflow's system entities, you can define your own developer entity types. For example, a brand might create an entity type to recognize its unique set of product names.
3. Session Entities
It is also possible to define entity types that apply only to a specific conversation. For example, you might create an entity type to represent the time-sensitive options available to a particular user when making a booking. These are called session entity types.
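On the webhook side, the entity values Dialogflow extracts arrive under `queryResult.parameters` in the request body. The sketch below pulls a hypothetical `coffee-type` developer entity and a `number` system entity out of a sample request; the entity names and sample utterance are illustrative, while the `queryResult.parameters` path follows the Dialogflow v2 webhook request format.

```javascript
// Fragment of a Dialogflow v2 webhook request: `parameters` holds the
// entity values the agent extracted from the user's utterance.
const sampleRequest = {
  queryResult: {
    queryText: 'order two cappuccinos',
    parameters: {
      'coffee-type': 'cappuccino', // hypothetical developer entity
      'number': 2                  // built-in @sys.number system entity
    }
  }
};

// Pick the extracted values out of the request for business logic.
function extractOrder(request) {
  const params = request.queryResult.parameters;
  return { drink: params['coffee-type'], quantity: params['number'] };
}

console.log(extractOrder(sampleRequest));
```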
Fulfillment
As discussed earlier, Dialogflow responses can also be dynamic. User-defined dynamic responses are implemented with fulfillment code. Internally, fulfillment webhooks run Node.js code deployed as Google Cloud Functions (serverless) in your Google Cloud account.
Fulfillment is code that's deployed as a webhook that lets your Dialogflow agent call business logic on an intent-by-intent basis. During a conversation, fulfillment allows you to use the information extracted by Dialogflow's natural language processing to generate dynamic responses or trigger actions on your back-end.
Most Dialogflow agents make use of fulfillment. The following are some example cases where you can use fulfillment to extend an agent:
- To generate dynamic responses based on information looked up from a database.
- To place orders based on products a customer has asked for.
- To implement the rules and winning conditions for a game.
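Putting these pieces together, below is a minimal, dependency-free sketch of a fulfillment handler: it takes a parsed Dialogflow v2 request, routes it by intent display name, and returns a response body. A real deployment would wrap this in an HTTPS Cloud Function, often via the `dialogflow-fulfillment` or `actions-on-google` Node.js libraries; the intent names and ordering logic here are illustrative assumptions, not part of any real agent.

```javascript
// Route a Dialogflow v2 webhook request to per-intent business logic.
// Intent display names and responses are illustrative only.
const handlers = {
  'order.pizza': (params) =>
    `One ${params.size || 'medium'} pizza on the way!`,
  'order.status': () => 'Your order will arrive in 20 minutes.'
};

function handleWebhook(requestBody) {
  const intentName = requestBody.queryResult.intent.displayName;
  const params = requestBody.queryResult.parameters || {};
  const handler = handlers[intentName];
  const text = handler
    ? handler(params)
    : "Sorry, I can't help with that yet.";
  // Dialogflow v2 expects the reply in `fulfillmentText`.
  return { fulfillmentText: text };
}

// Simulate an incoming request for the 'order.pizza' intent.
const body = {
  queryResult: {
    intent: { displayName: 'order.pizza' },
    parameters: { size: 'large' }
  }
};
console.log(handleWebhook(body).fulfillmentText);
```

The intent-to-handler map mirrors the "intent-by-intent" dispatch described above, which is also how the `WebhookClient.handleRequest(intentMap)` pattern in the `dialogflow-fulfillment` library is organized.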
How to get started with Dialogflow quickly?
Anyone with a Google account can get started with Google Actions and Dialogflow quickly.
Google has excellent documentation for developing Google Actions applications using Dialogflow.
Anyone can make use of the developer Codelabs crash courses to quickly get started with developing Google Actions.
We can develop more interesting voice interactions using the Dialogflow console.
Further Information
Please get in touch with me for help on getting started with Google Actions. Also refer to my code samples at the URLs below.
Sample Webhooks
https://github.com/chefgs/df_fulfillment_webhooks
Sample Dialogflow agent