  • AI in RTC Report Highlights: Speech Analytics & Voicebots show the most promise

  • We hear it all the time: Machine Learning (ML) and its use in creating Artificial Intelligence (AI) applications is having a profound impact on many industries. Tsahi Levent-Levi, a blogger and one of my co-organizers of the Kranky Geek event series, and I decided to team up on a deep study of the application of Artificial Intelligence in Real Time Communications (AI in RTC). After a couple of months of research, many dozens of conversations, and analysis of hundreds of products, we put our findings into a 147-page report. See below for more on the study and some of my take-aways.

    About the Study

    From the outset, we decided to focus on four domains:

    • Speech Analytics – converting speech to text (STT) (aka ASR or transcription) and analyzing the waveform and converted text. I had done a lot of work here launching a speech analytics service and was familiar with the new, burgeoning ecosystem here.
    • Voicebots – automated programs that interact with users in a conversational dialog using speech as input and output, like many IVR systems and Amazon’s Alexa. Like almost everyone, I hate IVRs, so I was very interested to see how advancements in this tech could make IVRs suck less. I had also previously investigated the use of voicebots in conference calls to help control the bridge and wanted to explore this and other use cases.
    • Computer Vision (CV) – programs that analyze and understand images and video. I had done a lot of my own experimentation here on webrtcHacks. Other than some overlay features in social media apps and some virtual name tag demos from Cisco and Microsoft, I wanted to understand why video conferencing providers weren’t doing more with CV.
    • RTC Optimization – machine learning methods used to improve VoIP media quality or cost performance. I knew the least about this area but given the recent focus on controlling bandwidth and error correction by RTC stack designers, I expected we would find some interesting research on this topic.

    The only other area our research uncovered outside of these four was the use of machine learning for route optimization in call centers – i.e. determining which agent to pass a given customer to based on the problem, agent expertise, user history, etc. We decided not to investigate it this time around because, while it can be RTC related, it does not involve processing the media stream and is a more general problem for all customer interactions.


    In addition to our own personal experience and on-going work, we conducted significant primary and secondary research. Our main source of information was company interviews. We identified more than 100 target companies that included:

    • RTC companies - telcos, CPaaS providers, cloud-based Unified Communications and call center providers
    • AI companies – speech analytics, voicebot, computer vision, and other ML-technology vendors

    In the end we interviewed about 40 of these companies and did deep reviews of the others. To supplement these interviews, we also conducted a web survey where we had 96 unique companies of all varieties respond.



    These interviews, the subsequent analysis, and writing the 147-page report were a major time commitment. Unfortunately we can’t give everything away for free, but here are some of my take-aways:

    • Speech analytics is where all the action is – the majority of the companies we covered had some speech analytics initiative. Based on my recent professional work, I was not surprised this was such a popular area, but I discovered many new vendors and re-discovered some who have been in this domain for a while.
    • Voicebots are the next big AI in RTC domain - the area has perhaps the most immediate potential for RTC apps, as voicebot technology from the major cloud vendors is being commoditized and is surpassing traditional conversational IVR implementations. The hard part here is integrating conversational AI tech with established telephony environments – I covered that in my last post.
    • Computer vision hasn’t received much attention – outside of social media apps, video RTC companies are just starting to look at it, and CV-tech companies have been focused on other markets, like autonomous cars.
    • Only big cloud vendors do everything - outside of the major cloud vendors – Amazon, Google, Microsoft, and IBM – the core machine learning technology vendors were only focused on one domain.
    • No one is using ML in RTC stacks - with only a couple exceptions – see 2Hz and some Mozilla research – hardly anyone is leveraging ML to improve their low-level VoIP mechanics.
    • Lack of ML expertise is an issue for RTC companies – lack of staff who know and can apply ML was cited as the number 1 inhibitor in our survey.
    • Promise & peril from partnering with big AI cloud vendors – the big cloud vendors are also big AI vendors. They are proving the value of AI technologies and democratizing ML tools, meaning it will only get easier for RTC companies to work with them. At the same time Amazon, Google, and Microsoft also offer their own communications products and services. RTC companies without their own ML expertise are in a difficult situation of relying on technology from companies that are growing increasingly competitive.

    Overall, there are more use cases for AI in RTC than I imagined. At the same time, very little effort has been put into exploring most of these use cases. In addition, most RTC companies are nowhere near ahead of the curve when it comes to ML, so don’t expect any immediate general market shifts. On the other hand, AI technologies are easier than ever to find in open source repositories and to purchase. Even if there isn’t always an easy path ahead, the potential here is extremely exciting. Based on our conversations and analysis, I expect AI in RTC will only become a bigger topic with a growing variety of implementations and use cases.

    More information

    You can see the full table of contents and list of figures, and download a report preview, on the report site. For those with some budget for market research, we are offering a publication launch discount until September 7. Ask me for details or visit the report site to purchase.


    Chad Hart is an analyst and consultant with, a product management, marketing, and strategy advisory helping to advance the communications industry.