  • Google uses AI to make phone calls interesting again

  • Outside of the occasional complaint about robo-callers, traditional phone calls generally do not get much coverage in the news. Phone calling suddenly got a lot of attention after the Duplex demo Google showed at its I/O conference last week. In case you somehow missed it, Google showed some demos of its AI dialing small businesses on behalf of a user and then using ultra-realistic speech synthesis and speech recognition to book appointments and find business hours - just like a person would. The virtual assistant is essentially indistinguishable from a real person. This demo has generated a lot of coverage, both as validation of how AI is taking over and for the ethical issues it raises. I'll touch on some of that, but what I would really like to cover is the implications of a technology that actively encourages calls over the public network instead of outright replacing them.

    We communicate all the time, but the communication is almost always via an app or some other API - generally not with the dialpad. Sure, technology like WebRTC has made it easy to embed real-time voice and video calls into apps, but for the majority of apps calling is a secondary feature to text-based interaction. For most people these days, calling over the public phone network (i.e. the PSTN) via the dialer is generally a last resort or reserved for urgent situations.

    Google could have continued down the route of making it easier for these businesses to put their interactions online. Instead they went in the opposite direction - they are actually making it easier for users to make calls to businesses.

    The Google Duplex Demo at Google I/O 2018

    Duplex could help 3.5 million US companies

    The scenarios they illustrated legitimately impact a lot of companies. Just before the demo, Google CEO Sundar Pichai claimed up to 60% of small businesses don't have an online booking system (and I am sure they checked their indexes on that one). That's a big number. I'm not sure what reference Google used to define a business as small, but according to SBA-gathered data, there are more than 5.8 million businesses under 500 employees in the US. Those businesses have more than 6.3 million physical locations and employ close to half of the US workforce. Applying the 60% figure to the 5.8 million businesses comes to about 3.5 million businesses that don't have online booking tools.
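
    The 3.5 million estimate can be reproduced with a quick back-of-the-envelope calculation. The sketch below assumes the 60% share applies uniformly to the SBA count of businesses (not locations); the constant and function names are illustrative only.

```python
# Back-of-the-envelope check of the ~3.5 million estimate.
# Assumption: the "up to 60%" share applies uniformly to the SBA count
# of US businesses with fewer than 500 employees.
SMALL_BUSINESSES = 5_800_000      # SBA figure cited in the text
SHARE_WITHOUT_BOOKING = 0.60      # share claimed at Google I/O


def businesses_without_booking(total: int, share: float) -> int:
    """Estimate how many businesses lack an online booking system."""
    return round(total * share)


estimate = businesses_without_booking(SMALL_BUSINESSES, SHARE_WITHOUT_BOOKING)
print(f"{estimate:,}")  # 3,480,000
```

    If the same share were applied to the 6.3 million physical locations instead, the estimate would rise to roughly 3.8 million, so the result is sensitive to what is counted as a "business".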

    One scenario Google mentioned was calling to check non-standard holiday hours - I am sure the number of businesses with commonly asked information that is not available on their website is even larger.


    US firms segmented by number of employees. Small businesses make up most of the population. Source: SBA

    Will this tech generate more phone calls?

    Will Duplex actually put more load on the phone system? Maybe.
    There are 2 high-level categories of calls that Duplex could address:

    1. Informational - a call to get some data that is relevant to many users, like business hours, current specials, etc.
    2. Transactional - a call to book an appointment or to buy something

    Duplex and other tech like it will decrease informational calls. One of the examples Google gave of how this could help small businesses is by reducing the number of calls for basic information. When you Google a business, Google already tries to give you basic information like opening hours when it can. If this data is not on the business's website, Duplex could just give the business a single call and include this data the next time someone searches for that business. Rather than receiving a bunch of phone calls about its hours, the business would just get one (from Google users anyway), so this would reduce calls.
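
    The call-once-and-reuse idea for informational calls is essentially a cache. The sketch below only illustrates that pattern and is not how Google implements it; `BusinessInfoCache`, `place_call`, and the one-day freshness window are all hypothetical.

```python
import time
from typing import Callable, Dict, Optional, Tuple


class BusinessInfoCache:
    """Toy cache: phone a business once, then reuse the answer for later searches."""

    def __init__(self, place_call: Callable[[str], str], max_age_s: float = 86_400.0):
        self._place_call = place_call    # hypothetical function that phones the business
        self._max_age_s = max_age_s      # assumed freshness window (one day)
        self._answers: Dict[str, Tuple[str, float]] = {}
        self.calls_placed = 0

    def hours_for(self, business: str, now: Optional[float] = None) -> str:
        now = time.time() if now is None else now
        cached = self._answers.get(business)
        if cached is not None and now - cached[1] < self._max_age_s:
            return cached[0]             # answered from the cache, no phone call
        answer = self._place_call(business)
        self.calls_placed += 1
        self._answers[business] = (answer, now)
        return answer


# Usage: 100 searches for the same business result in a single call.
cache = BusinessInfoCache(place_call=lambda name: "9am-5pm")
for _ in range(100):
    cache.hours_for("Example Salon", now=0.0)
print(cache.calls_placed)  # 1
```

    The freshness window matters for cases like holiday hours: once the cached answer is older than the window, the next search triggers one new call rather than serving stale data.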

    Transactional calls are different, since by nature they require some interaction between the user and the business's data. In this case, Duplex will likely replace traditional calls - instead of me calling to make a reservation, my Google Assistant is doing it. However, there is a good case to be made that the ease of using Duplex might actually make users more likely to ask their Google Assistant to make the appointment. If I am in a rush to leave in the morning and notice my sink leaking, I might try to make a mental note to call the plumber later, but will probably forget or put it off. If I could ask my assistant to take care of it on the spot, it is way more likely to happen. Similarly, with an impromptu restaurant visit it's not worth spending minutes to call and book a reservation - maybe you just show up and hope for the best. But if checking for availability only took you seconds by asking your assistant, you would be way more likely to call.


    Google mentioned it would initially only support 3 scenarios - restaurant reservations, hair salon scheduling, and holiday hours. However, there are many types of calls one could imagine that are relatively simple and could likely be handled with the same technology - say calling your local town/city maintenance department to report a pothole or even to let a business know you'll be a few minutes late for an appointment. Generally, the prospect of spending time navigating through multiple department secretaries means you don't call unless you know how to reach the right person quickly. If there was virtually no time investment, odds are you would call every time. Of course there are many, many more complex examples - I am sure the prospect of never having to navigate an IVR again might greatly encourage consumers to call more often.

    Is this good for businesses?

    When considering how this could impact general call volumes, it is worth asking if this is something businesses would really want. Certainly reading back business hours to customers is not a great use of time, but a savvy salesperson would likely ask other questions in an attempt to secure or initiate a transaction even if the call started as informational. Many transactional scenarios also involve upselling. Is Duplex going to be able to handle those kinds of questions on behalf of a user? That is very unlikely, leading to an awkward hand-off or a lost upsell opportunity for the business. Perhaps the user calling does not want to be upsold, but the business would certainly want the chance.

    Of course there are also much larger user behavior issues. Once it is announced "this is Google Assistant calling on behalf of..." will the business agent on the other end want to answer? If Assistant-generated call volumes increase, will the business be able to handle it? Perhaps that is a good problem for them to have - or maybe it won't be if the calls don't turn into paying customers.

    Today there is no real way for a business to opt out of getting calls from a random number. Maybe the number won't be random. Maybe Google will give businesses the option of not receiving Duplex calls - but that would likely mean they need to sign up with Google in some capacity.

    This tech could potentially work in the other direction too - with businesses calling their customers. Many businesses, like dentists, have dumb robodialers to call and confirm appointments. Something like Duplex would be a huge upgrade to the user experience and would better automate rescheduling.


    High level diagram of Google's Duplex. Source: Google Duplex: An AI System for Accomplishing Real-World Tasks Over the Phone

    When will other companies get this tech?

    Google introduced a considerable amount of new technology that is not widely used elsewhere to pull this off. Some of what went into Duplex includes:

    • Concatenative text-to-speech - stitching together parts of recorded speech for playback
    • Speech synthesis - generating synthetic speech waveforms in a natural-sounding way, with all the appropriate intonations
    • Variable latency - it needs to be fast enough to respond in real time but smart enough to know when to delay responding to sound natural (like it's thinking)
    • Speech disfluencies - inserting "ah", "um", "well" and other fillers into the synthesized speech to match the way a person usually talks when they aren't reading from a script
    • Speech recognition - their speech-to-text tech already has high accuracy and narrow-band models for PSTN calling, but their post indicates they tweaked this further
    • Natural language understanding (NLU) - the ability to disambiguate words, match speech with expected intents, and keep a good measure of confidence that the system is understanding the user so it can ask more questions or clarify statements if needed. NLU is part of many Google services today.
    • Dialog management - related to the above, the system needs to have a framework for navigating through a conversation with an interaction plan to bring it to a conclusion. DialogFlow does this generically today for Google Assistant apps, but Google stated they needed to have specific models for each call type - like a trained model just for making restaurant reservations
    • Data - training all the machine learning models requires a lot of data. Google has several current and past projects connected to the phone network (remember GOOG-411?) to help them collect this data.

    Some of these technologies can be replicated to some degree using other publicly available voice bot tools, including what is publicly available from Google, but I haven't seen anyone else with a full suite that is this tightly integrated. Google indicated they would start testing this capability with the Google Assistant this summer. Given all the legal, ethical, and security implications, it does not seem likely Duplex will be made available to Google Assistant developers in that same time frame.

    Hello Mr. Voice Bot

    Will Google eventually make this available for broader use? I think that's likely. Will others try to replicate this, perhaps without the same cautions? That seems even more likely and probably not that far off. AI has become smart enough to overcome the dumbness of the phone network. Now robots are a whole new audience for telecom to go after. The implication here is that we had better get used to the idea of robots substituting for people on the phone.

    Chad Hart is an analyst and consultant with, a product management, marketing, and strategy advisory helping to advance the communications industry. He is currently working on an AI in RTC report.