Continuing on my quest to see all clunky, hierarchical IVRs die, I have been experimenting with Dialogflow's new Phone Gateway and Knowledge Connectors that were launched in late June at CloudNext. The Knowledge Connector extracts data out of a Frequently Asked Questions (FAQ) document and makes it available for the bot to use in its responses. The Phone Gateway lets you connect any Dialogflow bot to a phone line. Together, these two features make it relatively simple to build a voicebot replacement for an informational IVR, with the added bonus that the bot can be reused on other media too.
Read below for:
- some background on Dialogflow and the IVR use case,
- a detailed walkthrough of how to do it in minutes, and
- some of my learnings and potential next steps.
Concepts and Ingredients
Why is replacing a traditional IVR with a voicebot better? What are these new Dialogflow features exactly? Let's review some of the core principles first.
Menus vs. Intents
Traditional IVR systems always use menus because that was the only way callers could navigate through a bunch of explicit options. In this approach, the number of items in a given menu was limited by two things:
- The number of DTMF keys on the phone
- User patience to listen through a list of options and then make a selection
Voice recognition - where users can speak their options - is supplementing DTMF entry. However, that technology isn't everywhere, and the main challenge remains: how many menu items can a user remember before they forget and have to go back through the menu? This means there is pressure to keep each menu short. To fit all the options needed under this constraint, IVRs must have a multi-layered menu hierarchy.
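To put rough numbers on that pressure, here is a back-of-the-envelope sketch in Python. It assumes every menu offers the same number of choices, which real IVRs rarely do, and the function name is made up for illustration.

```python
def menu_levels_needed(total_options: int, options_per_menu: int) -> int:
    """Return how many menu layers a caller must traverse to reach one of
    total_options destinations when each menu offers options_per_menu choices."""
    if total_options < 1 or options_per_menu < 2:
        raise ValueError("need at least 1 option and 2 choices per menu")
    levels = 0
    reachable = 1
    # Each extra layer multiplies the number of reachable destinations.
    while reachable < total_options:
        reachable *= options_per_menu
        levels += 1
    return levels


if __name__ == "__main__":
    for per_menu in (3, 5, 9):
        print(per_menu, "choices per menu ->", menu_levels_needed(40, per_menu), "layers")
```

With 40 possible destinations and only 5 choices per menu, a caller has to sit through 3 layers of prompts before reaching anything useful.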
Imagine if people had conversations like this. When someone calls a receptionist at a business, the receptionist doesn't spout out a menu. They ask "how can I help you" and then respond with the appropriate answer. The intent-based Natural Language Understanding (NLU) systems that voicebots use interact like a real person would. They take what the user says - their utterance - and match it to an intent, which indicates a response. Conversational bot systems are actually much more dynamic than this, with follow-on intents, response variations, and other options for slot-filling, but the core concept is the same.
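The utterance-to-intent-to-response idea can be shown with a toy matcher. This is only a sketch: it scores simple word overlap against example phrases, which is far cruder than whatever Dialogflow does internally, and the intents, phrases, responses, and threshold are all invented for illustration.

```python
import re

# Hypothetical intents: each has example training phrases and a canned response.
INTENTS = {
    "opening_hours": {
        "phrases": ["what time do you open", "when are you open", "opening hours"],
        "response": "We are open 9 to 5, Monday through Friday.",
    },
    "address": {
        "phrases": ["where are you located", "what is your address"],
        "response": "We are at 123 Example Street.",
    },
}
FALLBACK = "Sorry, I didn't get that. Could you rephrase?"


def tokenize(text: str) -> set:
    """Lowercase the text and return its set of words."""
    return set(re.findall(r"[a-z']+", text.lower()))


def match_intent(utterance: str, threshold: float = 0.3):
    """Return (intent_name, response) for the best word-overlap match,
    or (None, FALLBACK) when nothing scores above the threshold."""
    words = tokenize(utterance)
    best_name, best_score = None, 0.0
    for name, intent in INTENTS.items():
        for phrase in intent["phrases"]:
            phrase_words = tokenize(phrase)
            union = words | phrase_words
            # Jaccard similarity between the utterance and a training phrase.
            score = len(words & phrase_words) / len(union) if union else 0.0
            if score > best_score:
                best_name, best_score = name, score
    if best_name is None or best_score < threshold:
        return None, FALLBACK
    return best_name, INTENTS[best_name]["response"]


if __name__ == "__main__":
    print(match_intent("when do you open"))
    print(match_intent("tell me a joke"))
```

Note that the caller never hears a menu. They just say what they want, and anything unrecognized lands on the fallback response.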
As it competes with Amazon's Alexa, Google has been launching a bunch of new features for Dialogflow, its natural language processing and bot builder tool. Unlike Actions on Google, which only lets you create bots for the Google Assistant, Dialogflow has integrations with a wide variety of platforms including Alexa, Cortana, many popular chat platforms, and now the telephone network.
Informational Interactive Voice Response Systems
Interactive Voice Response (IVR) systems have become the bane of telecoms, but they were once a convenience for users. They help direct users and reduce time dealing with multiple transfers and clueless operators. The primary functions of the IVR come down to:
- Directing callers to the right person
- Offloading agent (the person answering the phone) by providing or collecting information before the agent gets on
Dialogflow can help with both of these, but here I am going to focus on the offloading part. Large businesses like banks have sophisticated systems that let you make self-serve transactions (today as an alternative to doing that on the web or an app). Many smaller organizations still have IVRs that provide basic information - opening hours, addresses, etc. Even more sophisticated systems usually have pieces of their IVR that do the same thing. For this post, I am going to focus on these "Informational" elements that can be provided by an IVR.
Dialogflow Knowledge Connector
Web-based Frequently Asked Questions (FAQs) have become a popular content format on the web, mainly because of the way Google search provides featured snippets. FAQs are usually easy to write and contain a lot of good information. The problem is that FAQs are often not displayed prominently on a website, so they often exist primarily for the search engines. In addition, having them didn't help if someone called in on the phone, until now.
The concept of taking FAQ content and adding it to a bot is straightforward, but the work of populating that data is tedious. Microsoft has a tool called QnA Maker that helps to automatically parse Q&A style documents to populate a bot. Just point it at a webpage or document and it uses NLP technology to suck in the data.
This summer Google released something similar in Dialogflow called Knowledge Connectors.
The other big piece here that Google recently launched is the Phone Gateway. I have talked a few times on this blog about voicebot gateways from vendors like VoxImplant and Signalwire that added a telephony gateway to Dialogflow. Now, much like IBM, Google has added its own Phone Gateway.
Google provides access to numbers directly from within the Dialogflow interface. If all you care about is connecting a phone number to Dialogflow, I don't think anyone could offer a simpler way to do it. Better yet, if you just need a number for testing, Google gives them away for free for 30 days.
If you want a permanent number they sell those too - you'll just need to sign up for their Enterprise plan, which will increase your bot price from totally free to between $0.05/minute and $0.075/minute depending on the plan and number type.
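To get a feel for what those per-minute rates mean, here is a quick estimate in Python. It only multiplies out the two rates quoted above and ignores any other charges that might apply. The call volume and call length are made-up inputs.

```python
def monthly_phone_cost(calls_per_day: float, avg_minutes_per_call: float,
                       rate_per_minute: float, days: int = 30) -> float:
    """Estimate gateway charges in dollars over a period at a per-minute rate."""
    total_minutes = calls_per_day * avg_minutes_per_call * days
    return round(total_minutes * rate_per_minute, 2)


if __name__ == "__main__":
    # Hypothetical traffic: 100 calls a day, 2 minutes each.
    low = monthly_phone_cost(100, 2, 0.05)
    high = monthly_phone_cost(100, 2, 0.075)
    print(f"${low:.2f} to ${high:.2f} over 30 days")
```

For 100 two-minute calls a day, that works out to roughly $300 to $450 over 30 days.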
Presumably their Contact Center AI telephony partners provide something similar to their customers from their own interfaces.
How To Guide
Here is a step-by-step guide on how to:
- create your agent,
- add the phone gateway integration,
- import a number of FAQs with the knowledge connector,
- make the bot more conversational,
- transfer calls to a human.
Dialogflow has many features and I am not going to go into most of them. Also, this is just a quick and dirty demonstration, so remember there is a lot of tweaking and improvement you would want to make in a real bot.
Get an account, create an agent
If you don't have a Dialogflow account already, go to dialogflow.com, sign in with a Google account, and pick your account settings.
If it is a new account, you will be forced to create an agent:
For reasons I will explain later, I decided to make a Star Wars movie hotline.
Edit your default intents
Next you will come to an Intents screen (or click on Intents on the left).
The Default Welcome Intent is the first thing your bot will say. The Default Fallback Intent is what the bot says if it does not understand the input.
Click on Default Welcome Intent. You'll see a bunch of training phrases and then you will see a bunch of Default text responses. I modified mine to respond with some statements prompting the user to ask about Star Wars:
Click on Save and you should see some messages saying Agent Training Started and a completion message soon after. Once that's complete, you can do a quick test by typing in your query in the upper right. It should return with one of your responses. Note how you can enter nearly any greeting and it will respond. If you have a non-standard greeting it does not pick up, just add that to the Training Phrases.
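The same kind of quick test can be scripted against Dialogflow's v2 REST API instead of the console. The sketch below only builds the URL and JSON body for a text detectIntent call. It does not send anything, because a real request also needs an OAuth access token for your Google Cloud project. The project ID and session ID are placeholders.

```python
import json
import uuid
from typing import Optional

API_ROOT = "https://dialogflow.googleapis.com/v2"


def build_detect_intent_request(project_id: str, text: str,
                                session_id: Optional[str] = None,
                                language_code: str = "en-US"):
    """Return (url, json_body) for a text detectIntent call.

    project_id is a placeholder for your own Google Cloud project ID.
    A random session ID is generated when none is supplied.
    """
    session_id = session_id or uuid.uuid4().hex
    url = (f"{API_ROOT}/projects/{project_id}/agent/sessions/"
           f"{session_id}:detectIntent")
    body = {"queryInput": {"text": {"text": text,
                                    "languageCode": language_code}}}
    return url, json.dumps(body)


if __name__ == "__main__":
    url, body = build_detect_intent_request("my-project-id", "hello there")
    print(url)
    print(body)
```

POSTing that body with an `Authorization: Bearer <token>` header should return a `queryResult` whose `fulfillmentText` is one of the welcome responses.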
Set up the Phone Gateway
Before we go any further let's set up our Phone Gateway and make sure we can call into our bot. Click on the sandwich menu icon in the upper left and click on Integrations. There are a lot here, but since the scope of this walk-through is an IVR voicebot, pick the Dialogflow Phone Gateway icon. Then you'll get a screen like this:
This is where you can pick a phone number and other options. The menu only lets me pick English, but I assume other language options will be available over time. Optionally you can enter some area codes that you want your phone number to start with. Pick a phone number, click create, wait for the confirmation and then close out of there.
That's it - now you have a phone number you can use for 30 days. Per the Note in the screen above, if you want a number for longer than that then you need to sign up for the Enterprise Edition.
Give your number a call and you should hear your welcome intent. It should respond to greetings, but not much else, which isn't very useful so let's move to the next phase.
Usually programming a bot involves manually creating a bunch of intents and responses like we saw on the Default Welcome Intent screen. Dialogflow has some Prebuilt Agents you can modify if they happen to fit your use case. This saves some time, but it is possible to skip the intent entry step entirely using Knowledge Connectors. As explained earlier, Knowledge Connectors let you populate the intents and responses automatically from a Frequently Asked Questions (FAQ) document.
Enable Beta Features
Google's Knowledge Connector feature is in Beta, so we will need to enable it. To do this, click on the gear icon next to your bot name in the upper left, click the slider for Enable beta features and APIs, and then click save:
Add Knowledge Base URLs
Now head back to the left menu and pick Knowledge and then Create Knowledge Base.
Then you'll need to pick a name for your knowledge set.
After that you will see a link to create a knowledge document.
From there you will need to fill in the form with a document name, FAQ for the knowledge type and text/html if you are using a URL, followed by the URL itself:
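The same form fields map onto the document resource in Dialogflow's v2beta1 REST API, as I understand it. The sketch below just builds the JSON body you would POST to a knowledge base's `documents` collection. Treat the field names as assumptions to check against the current API reference. The document name and URL are placeholders.

```python
import json


def build_faq_document(display_name: str, content_uri: str,
                       mime_type: str = "text/html") -> str:
    """Build the JSON body for creating an FAQ knowledge document.

    Field names follow the v2beta1 Document resource as I understand it.
    Verify them against the API reference before relying on this.
    """
    if mime_type not in ("text/html", "text/csv"):
        raise ValueError("FAQ documents are expected to be text/html or text/csv")
    document = {
        "displayName": display_name,
        "mimeType": mime_type,
        "knowledgeTypes": ["FAQ"],
        "contentUri": content_uri,
    }
    return json.dumps(document, indent=2)


if __name__ == "__main__":
    # Placeholder name and URL.
    print(build_faq_document("Movie FAQ", "https://example.com/faq"))
```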
Click Create and a few moments later you should see the document in the list.
If you don't see the document, jump down to my issues section below. If you do see the document, you should be able to open it and see the full Q&A. That screen will also let you disable individual Q&A pairs.
I went back and added a bunch more FAQ documents to round out my knowledge set. Make sure to hit save when you're done.
Give your number a call and ask it some questions related to the content in your FAQs and you should get answers. I found I had to turn up the preference for returning Knowledge results in the main Knowledge tab for better results:
Once you start adding other intents this will need to be tested and adjusted more.
Add Small Talk
Without any real data entry, your bot should be in decent shape, but it is hardly conversational outside of the narrow FAQ documents you gave it. Dialogflow has another feature called Small Talk that provides default responses to popular filler discourse, like "thank you".
Go over to Small Talk in the menu and enable it.
We made it pretty far without doing anything complicated, but our bot is missing a few features you would expect it to have:
- The ability to hang-up
- Option to talk to a human
We never set up a way to end the conversation. The user can hang up, but the usual protocol is to say some kind of farewell first before hanging up. This is easy to add and is included in most of the Prebuilt Agents if you started that way.
Go to Intents. Create a new intent and call it something like "Hangup" and then add some farewell training phrases. Then, customize your Text response and make sure to turn on the Set this intent as end of conversation option.
Make sure to do some tests. I found phrases like "I'm finished" matched some of my knowledge base articles. Adjust the Knowledge results preference slider if you need to.
Transfer to operator intent
To make this work cleanly we are going to add a new intent so the user can signal they want to talk to a human. Then we will make a follow-up intent to confirm and transfer the call. Technically this could be done in one step, but since this was meant to be a FAQ bot we don't want users to be transferred accidentally.
Click on the plus sign next to Intents and create an intent called Operator. Add some "speak to a person" training phrases and add a confirmation prompt as a response.
As always, make sure to save, let it train, and then test.
Now go back to the Intents screen, hover over the operator intent you just made and create a follow-up intent:
Select "yes" from the drop down menu - this will auto-populate confirmation intent phrases.
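Follow-up intents work through contexts: the parent intent opens a context, and the follow-up can only match while that context is active. Here is a toy Python simulation of the Operator flow. The phrases, context name, and replies are invented, and real Dialogflow contexts also carry a lifespan that this sketch ignores.

```python
YES_WORDS = {"yes", "yeah", "yep", "sure", "ok", "okay"}
OPERATOR_PHRASES = ("operator", "speak to a person", "talk to a human")
FOLLOWUP_CONTEXT = "operator-followup"  # hypothetical context name


def handle_turn(utterance: str, active_contexts: set):
    """Return (reply, action, new_contexts) for one conversational turn."""
    text = utterance.lower().strip()
    # Follow-up intent: only eligible while the parent's context is active.
    if FOLLOWUP_CONTEXT in active_contexts and text in YES_WORDS:
        return "Transferring you now.", "transfer", set()
    # Parent intent: ask for confirmation and open the follow-up context.
    if any(phrase in text for phrase in OPERATOR_PHRASES):
        return ("Do you want me to transfer you to a person?",
                None, {FOLLOWUP_CONTEXT})
    # Anything else clears the context and goes back to normal handling.
    return "What would you like to know?", None, set()


if __name__ == "__main__":
    contexts = set()
    for said in ("I want to talk to a human", "yes"):
        reply, action, contexts = handle_turn(said, contexts)
        print(f"caller: {said!r} -> bot: {reply!r} (action={action})")
```

A bare "yes" with no active context does nothing special, which is what keeps callers from being transferred by accident.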
Set it to dial out
Lastly, go to the Responses section, click the "+" and select Telephony.
Click on "ADD RESPONSES" and select "Transfer call". Then just put your phone number in the box.
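If you would rather return the transfer from a webhook than configure it in the console, the v2beta1 API has telephony message types for this, as I understand it. The sketch below builds such a response body. The message field names and the E.164 number check are my assumptions and should be verified against the API reference. The phone number is a placeholder.

```python
import json
import re


def build_transfer_response(phone_number: str, speech: str = "") -> str:
    """Build a webhook-style response that optionally speaks, then transfers.

    Field names mirror the v2beta1 telephony message types as I understand
    them. Check the API reference before using this for real.
    """
    # Assumption: the number should be in E.164 form (+, country code, digits).
    if not re.fullmatch(r"\+[1-9]\d{6,14}", phone_number):
        raise ValueError("phone_number should look like +15555550100")
    messages = []
    if speech:
        messages.append({"platform": "TELEPHONY",
                         "telephonySynthesizeSpeech": {"text": speech}})
    messages.append({"platform": "TELEPHONY",
                     "telephonyTransferCall": {"phoneNumber": phone_number}})
    return json.dumps({"fulfillmentMessages": messages}, indent=2)


if __name__ == "__main__":
    print(build_transfer_response("+15555550100", "One moment please."))
```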
Give your voicebot a call and make sure it works.
Tips, Tricks, and Issues
It is important to remember that the Dialogflow Phone Gateway and Knowledge Connector are in Beta. Google's "beta" usually just means "new", but I encountered a bunch of bugs and glitches, so it seems this one really is in Beta.
Don't use Google Voice
Don't use a Google Voice number with the Phone Gateway - it doesn't work.
Note: Calling the gateway from a device using Project Fi, Google Voice, or Google Hangouts is not currently supported.
This was painful for me since I do not have cellular reception in my office and generally use my Google Voice number to dial over VoIP. Ironically I ended up using another bot - Alexa on my Amazon Echo - to place test PSTN calls to the voicebot.
Parsing FAQ documents
The Knowledge Connector was super finicky for me. I originally wanted to build a voicebot on the AI in RTC report FAQ but I could not get it to parse the document. I kept getting "failed to crawl" errors, despite several attempts to restructure the document in different ways.
That FAQ was built in WordPress. Other FAQs I made in WordPress in the same manner parsed just fine, so I am not sure what the problem is.
I recommend just using a CSV file as I'll discuss below.
No interrupting playback
A lot of the answers I got off of the IMDb movie FAQs were very long. Unfortunately I was not able to find a good way to "Cancel" in the middle of playback. On the Google Assistant you can always do an "OK Google" to interrupt the response playback, but I did not see a way to implement that mechanism on the Phone Gateway.
Also, for the really long ones, occasionally the intent recognizer started up again before the previous text-to-speech response even finished. I suspect this is a bug.
Keep FAQ responses short
In most cases users don't want to hear a bot drone on, especially when they can't easily cancel. It is supposed to be a dialog after all, not a lecture. For this reason I recommend keeping the FAQ responses to a few sentences vs. a few paragraphs.
Unfortunately there is no easy way to break up long responses to allow the user to choose to continue or not. Dialogflow does allow you to easily convert a FAQ item to an intent, so you could convert all the FAQ questions to intents and then break each long answer into a series of follow-up intents. That involves a lot more work, though, and assumes your text is already structured in a manner that makes sense to interrupt in the middle and say "would you like to hear more". Alternatively you could directly interface with Dialogflow's APIs and webhooks for more control, but that requires a developer.
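For the developer route, the splitting itself is the easy part. Here is a minimal Python sketch that cuts a long answer into chunks of whole sentences and appends a continuation prompt to every chunk but the last. The sentence splitting is naive (it breaks on ., ! and ? followed by whitespace), and wiring the chunks into follow-up intents or a webhook is left out.

```python
import re


def chunk_answer(text: str, sentences_per_chunk: int = 2) -> list:
    """Split a long answer into short chunks of whole sentences."""
    if sentences_per_chunk < 1:
        raise ValueError("sentences_per_chunk must be at least 1")
    # Naive sentence split: break after ., ! or ? followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    sentences = [s.strip() for s in parts if s.strip()]
    return [" ".join(sentences[i:i + sentences_per_chunk])
            for i in range(0, len(sentences), sentences_per_chunk)]


def spoken_turns(text: str, sentences_per_chunk: int = 2,
                 prompt: str = "Would you like to hear more?") -> list:
    """Return the chunks as bot turns, prompting after all but the last."""
    chunks = chunk_answer(text, sentences_per_chunk)
    return [chunk + " " + prompt for chunk in chunks[:-1]] + chunks[-1:]


if __name__ == "__main__":
    long_answer = "First point. Second point. Third point. Fourth point. Fifth point."
    for turn in spoken_turns(long_answer):
        print(turn)
```

Each returned string would become one bot turn, with the next one played only if the caller says yes.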
No editing FAQs in Dialogflow so use CSVs
Unlike Microsoft's QnA Maker, there is no way to edit or add the FAQ items within the GUI after they are imported. Given all the hassle I had parsing HTML pages, I think it's easier to scrape the page content into a CSV where you can edit the text outside of Dialogflow and reimport. That also gives better control over content versioning.
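Scraping a page into a CSV does not take much code. The sketch below uses only the Python standard library and assumes a hypothetical page layout where each question is an `<h3>` and its answer is the first `<p>` that follows. Your FAQ markup will likely differ. It writes questions in the first column and answers in the second with no header row, which is the layout I understand the Knowledge Connector expects.

```python
import csv
import io
from html.parser import HTMLParser


class FaqParser(HTMLParser):
    """Collect (question, answer) pairs, assuming <h3> questions
    each followed by a <p> answer (a hypothetical page layout)."""

    def __init__(self):
        super().__init__()
        self.pairs = []
        self._tag = None        # the h3 or p element currently being read
        self._question = None   # last question still waiting for an answer
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in ("h3", "p"):
            self._tag = tag
            self._buffer = []

    def handle_data(self, data):
        if self._tag:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag != self._tag:
            return  # closing tag of an inline element such as <b> or <a>
        text = " ".join("".join(self._buffer).split())
        if tag == "h3":
            self._question = text
        elif self._question and text:
            # Only the first paragraph after a question is kept as its answer.
            self.pairs.append((self._question, text))
            self._question = None
        self._tag = None


def faq_html_to_csv(html: str) -> str:
    """Return CSV text with one question,answer row per FAQ item."""
    parser = FaqParser()
    parser.feed(html)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(parser.pairs)
    return out.getvalue()


if __name__ == "__main__":
    sample = "<h3>When was it released?</h3><p>In <b>1977</b>.</p>"
    print(faq_html_to_csv(sample), end="")
```

Once the content is in a CSV you can trim long answers in a spreadsheet or text editor, keep the file under version control, and reimport it whenever it changes.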
Sometimes the interface is slow, be patient
I found in a bunch of cases changes wouldn't stick. It seems sometimes the interface is slow. Make sure to wait for the "Agent training completed" toast message before testing something new.
Other times I would get random 500 errors. They eventually went away on their own.
I tried creating another bot for the AI in RTC report FAQ. Since I still could not import from the URL, it took me about 10 minutes to copy and reformat the contents in a CSV file. From there it took me about 25 minutes to walk through these steps, including testing, to make it work. It is very far from perfect but not too bad as a base to get started in a short period of time.
Give it a try at +1 617-380-7150 and see how it works for you. In the meantime I will think about turning this into a proper voicebot!
Chad Hart is an analyst and consultant with cwh.consulting, a product management, marketing, and strategy advisory helping to advance the communications industry. He recently co-authored a study on AI in RTC - check it out at krankygeek.com/research.