With the growing interest in bots populating chat platforms (e.g. Slack, Facebook Messenger) and voice platforms (e.g. Alexa), it is worth remembering how easy, and even tempting, it is to create systems that end up being useless because they deliver inane user experiences that waste everyone’s time. We don’t even need to go as far as Clippy to find examples of just how stupid bots can be, despite the confidence of their makers. Just think of the total amount of time wasted by air travelers using the infuriating bots manning the phone reservation systems of just about every large airline. “Please say mileage account, flight status, reservations or something else.” It takes half an hour to get through a simple booking request, with each turn of dialogue adding a single piece of information: departure airport, departure time, destination airport… Yes, the bot can accomplish the task, but at a large cost in the user’s time and patience. Before you object that times are different and that those bots use inferior technology, take a look at the much-hyped weather cat on Facebook Messenger. Yes, we are still creating stupid bots that make things much less efficient than just opening a website.

Fortunately, the new wave of bots is not all bad. The Alexa platform powering the Amazon Echo delivers a surprisingly satisfying speech-to-speech experience. Although many attribute Alexa’s strengths to improvements in speech recognition, there is a lot more to it. Alexa’s natural language understanding (NLU) is equally important, if not more so. It is not enough to know what words someone said; what matters is the intent behind them. NLU and speech are also crucial to Siri and Google Now, and some form of these technologies is now available to any developer who knows the right APIs. Unfortunately, these building blocks will leave any bot stranded on the same one-shot island as Siri, Google Now and Alexa: you say something, and either you get what you want or you don’t. That’s what current language understanding gets you, but it’s far from how we understand language. Imagine calling customer service on the phone and hoping that the very first sentence you use to describe your problem takes care of all your communication needs. Fortunately, that’s not how people communicate. We understand things in context, and very often over a back-and-forth that requires much more than understanding one sentence at a time. The natural language understanding technology out there today is really a one-isolated-utterance understanding technology. Whatever context Siri and the others can use is rudimentary at best. It’s like isolated natural language islands duct-taped together with code, which still leaves us learning how machines understand things, instead of the other way around.
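
To make the contrast concrete, here is a minimal, purely illustrative Python sketch. Everything in it is hypothetical: the function names are made up and the keyword matching merely stands in for a real NLU service. The point is the difference between interpreting every utterance in isolation and interpreting a follow-up in the context of the previous turn.

```python
# Purely illustrative sketch, not any vendor's API.
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DialogueState:
    """Context carried forward between turns."""
    intent: Optional[str] = None
    slots: dict = field(default_factory=dict)


def one_shot_nlu(utterance: str) -> dict:
    """One-shot NLU: each utterance is interpreted in isolation."""
    if "weather" in utterance.lower():
        return {"intent": "get_weather"}
    return {"intent": "unknown"}


def contextual_nlu(utterance: str, state: DialogueState) -> DialogueState:
    """Multi-turn understanding: a follow-up only makes sense
    in light of what was said before."""
    result = one_shot_nlu(utterance)
    if result["intent"] != "unknown":
        state.intent = result["intent"]
    elif state.intent == "get_weather":
        state.slots["date"] = utterance  # interpret relative to the prior intent
    return state


state = DialogueState()
state = contextual_nlu("What's the weather in Seattle?", state)
state = contextual_nlu("What about tomorrow?", state)
print(state)  # the follow-up is understood as a refinement, not as "unknown"
```

In isolation, “What about tomorrow?” is meaningless; with even this tiny bit of carried-over state, it becomes a perfectly clear request.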

Truly natural and effective communication with machines requires a level of understanding beyond single utterances. That’s where KITT.AI’s conversation modeling and understanding engine comes in. It’s a multi-turn conversational engine that understands not just what you say in a one-sentence request, but what you really want to convey over the course of a conversation. This summer we will release ChatFlow, a platform that lets developers easily create effective, smart, truly conversational bots, delivering user experiences that will change the way we think of and communicate with bots. Personally, I’d be willing to give every airline a really good deal to leverage our platform, just so that I never have to speak with one of their current bots again!
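
To give a flavor of what multi-turn understanding buys you in practice, here is a toy sketch of a flight-booking exchange. This is explicitly not the ChatFlow API; the slot names, regexes and prompts are all invented for illustration. The idea it demonstrates is that a conversational bot can accept several pieces of information in a single utterance and only ask for what is still missing, instead of marching through one fixed question per field.

```python
# Toy multi-turn slot-filling loop; not the ChatFlow API, just an illustration.
import re

REQUIRED_SLOTS = ["origin", "destination", "date"]


def extract_slots(utterance: str) -> dict:
    """Toy extraction: airports as three-letter codes, dates as YYYY-MM-DD."""
    slots = {}
    codes = re.findall(r"\b[A-Z]{3}\b", utterance)
    if len(codes) >= 1:
        slots["origin"] = codes[0]
    if len(codes) >= 2:
        slots["destination"] = codes[1]
    date = re.search(r"\d{4}-\d{2}-\d{2}", utterance)
    if date:
        slots["date"] = date.group()
    return slots


def next_prompt(filled: dict) -> str:
    """Ask only for what is still missing; confirm once everything is known."""
    missing = [s for s in REQUIRED_SLOTS if s not in filled]
    if not missing:
        return f"Booking {filled['origin']} -> {filled['destination']} on {filled['date']}."
    return f"What is your {missing[0]}?"


filled = {}
for utterance in ["I need a flight from SEA to JFK", "2016-07-15"]:
    filled.update(extract_slots(utterance))
    print(next_prompt(filled))
# The first turn captures origin and destination at once, so the bot
# only asks for the date, rather than one rigid question per field.
```

Two turns instead of the airline bot’s half hour: that is the kind of experience multi-turn conversation modeling is meant to make routine.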