Universal Natural Language User Interfaces and Seamless Workflow Automation
Co-written with Mike Dimitroff, CTO of RigD.io
Chatbots as UI — what we have so far
Do you need a bot to translate for you? One that is “fluent in over six million forms of communication” could certainly come in handy sometimes. Handy enough to put up with its… shortcomings.
But if you need to “interpret the entire Imperial network”, and the only interface you have is a wall socket, a bot is pretty much your only choice.
C-3PO and R2D2 are not really all that different: both understand human intents and translate them to other entities that don’t, like cantankerous aliens or Imperial computers, thus allowing humans to focus on fighting evil, rather than waste time studying obscure dialects and computer protocols.
Back in our own space/time, what do you do when you forget (or never knew) a certain command (say, how to remove an image in Docker)? You take to your favorite search engine and type a query. Then you scroll through the results until you find one that seems like it would do the job, copy, paste, and hope for the best. You learned something and applied it in practice, right? Certainly, but was that the fastest, most efficient way to get the desired results?
Here is what you did in order to communicate to Docker your intent to delete an image:
- communicated to the search engine your intent to learn the command
- used your own intelligence for a quick “1 vs. all” classification (discarded advice on how to get rid of longshoremen photographs and other instructions useless in this context, then zeroed in on the right article)
- having thus learned how to communicate with Docker, turned around and did just that
Can you see anything wrong with this model? No? How about all the extra typing, the waste of time, the loss of privacy (now the search engine and 75,000 of its closest affiliates know that you’re using Docker and are probably a good target for related products and services), the waste of bandwidth and CPU cycles, the waste of human intelligence? In bot-speak, Docker can discern maybe 30 or so intents, each with several slots. However, in order to figure out how to generate input in the exact format Docker requires, you engage a search engine capable of handling practically any intent, in the hope that it will help you locate an article that will train you to translate your natural language version of those intents into Docker’s highly structured format. Wouldn’t it be nice if, instead of you training yourself to deliver input in the exact way a piece of software wants, someone trained the software to expect input in the way you want to deliver it? Like your own highly specialized search-engine-cum-translator that figures out exactly what you want to say and tells the software exactly what it wants to hear.
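To make the “intents and slots” idea concrete, here is a minimal sketch of such a translator. It is only an illustration: the intent table, the regular expressions, and the `translate` function are our own assumptions, and a real NLU layer would use a trained model rather than hand-written patterns. The only real thing in it is the Docker CLI itself (`docker rmi` and `docker images`).

```python
import re
import shlex

# Hypothetical intent table: each intent has a pattern with named slots
# and a template for the Docker CLI command it maps to.
INTENTS = [
    {
        "name": "remove_image",
        "pattern": re.compile(
            r"\b(?:remove|delete|get rid of)\b.*\bimage\s+(?P<image>[\w./:-]+)",
            re.IGNORECASE,
        ),
        "command": ["docker", "rmi", "{image}"],
    },
    {
        "name": "list_images",
        "pattern": re.compile(r"\b(?:list|show)\b.*\bimages\b", re.IGNORECASE),
        "command": ["docker", "images"],
    },
]


def translate(utterance):
    """Return (intent_name, argv) for the first matching intent, else None."""
    for intent in INTENTS:
        match = intent["pattern"].search(utterance)
        if match:
            slots = match.groupdict()
            # Fill the slot values into the command template.
            argv = [part.format(**slots) for part in intent["command"]]
            return intent["name"], argv
    return None


if __name__ == "__main__":
    result = translate("please delete the image nginx:1.25")
    if result:
        name, argv = result
        print(name, "->", shlex.join(argv))
```

The sketch only builds the command line and does not run it, so a bot could show the command to the user for confirmation before doing anything destructive.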
Chatbot UI as an integration and automation enabler
Bot interfaces are certainly nothing new. Sales, information, customer care, and all kinds of other bots have stepped in to free people from the tedious task of being the human language interface module of some computer. They perform their jobs with various degrees of proficiency, from clunky CLI-like constructs to virtually indistinguishable from humans, and the users, for the most part, seem to like the arrangement.
But there is one feature that most interface bots share: each of them is trained for a single task. As superb as Lufthansa’s Mildred is at her job, I can’t ask her to help me with airport parking, nor is there a way for me as a user to extend her functionality with a real-time parking app. Custom-designed and trained as an interface to a specific system, bots are seldom multi-skilled or extensible. If the owner of a certain platform has not done a very good job designing a bot interface for it, or doesn’t have one at all, the users don’t have many options.
Big, general-purpose AI platforms like Alexa, Cortana, Google Assistant, etc. do offer extensibility and integration with numerous third-party services and APIs, but integration developers have very limited control over dialog options and virtually no say in training and NLU configuration. And again, for someone to develop an integration, they usually have to own either the bot or the back end; mere users can only hope and pray.
There are companies that offer “bots as a service”: easy-to-use bot-creation platforms. While BAAS dramatically lowers the barrier to entry for having a bot interface, its main use is the deployment of single-task bots over specific, typically proprietary backends, much like their more sophisticated developed-from-scratch cousins.
What nobody seems to be doing at this point is creating sophisticated natural language bot interfaces for well-known public services and APIs whose authors haven’t gotten around to bot-enabling their work or have not done a very good job of it. And even where they have, such platform-bot couplings exist in a silo, without any ability to collaborate with similar integrations. For example, wouldn’t it be nice if there was a bot that could retrieve all information about a PagerDuty incident, search the CloudWatch logs for specific information, share that information with the rest of the team, and, once a fix is agreed upon, initiate a build, deploy the container images to ECS, and kick off a Kubernetes rollout of the new release, all through a dialog right there on the team’s chat channel?
Beyond mere convenience and efficiency, such a bot platform would enable a degree of seamless integration between the abstracted services. Once someone writes an integration with a backend service (AWS, Docker, Kubernetes, Heroku, Twilio), anyone can create activities that integrate this service with any other. What’s better, it wouldn’t take any coding, only talking to the bot.
Data retrieved from one service can be used as input for another, establishing information pipelines. Once the platform’s AI observes the way human users interact with various services, it can learn how to build such pipelines itself, and offer that skill to all its users.
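As a rough sketch of what such an information pipeline could look like, the example below chains three stand-in “service adapters”, each of which reads what earlier steps produced and adds its own output. Everything here is hypothetical: the function names, the in-memory lookup tables, and the incident ID are placeholders for illustration, not calls to any real service API.

```python
def fetch_incident(ctx):
    # Stand-in for an incident service: looks up which component is affected.
    incidents = {"INC-1": "checkout"}
    return {**ctx, "component": incidents[ctx["incident_id"]]}


def search_logs(ctx):
    # Stand-in for a log service: uses the component found by the previous step.
    logs = {"checkout": ["timeout talking to payments"]}
    return {**ctx, "log_lines": logs.get(ctx["component"], [])}


def post_summary(ctx):
    # Stand-in for a chat channel: builds the message the bot would post.
    summary = (
        f"{ctx['incident_id']} ({ctx['component']}): "
        f"{len(ctx['log_lines'])} matching log line(s)"
    )
    return {**ctx, "message": summary}


def run_pipeline(steps, ctx):
    """Feed the output of each step into the next one."""
    for step in steps:
        ctx = step(ctx)
    return ctx


if __name__ == "__main__":
    result = run_pipeline(
        [fetch_incident, search_logs, post_summary],
        {"incident_id": "INC-1"},
    )
    print(result["message"])
```

Because every step has the same shape (context in, context out), a platform could in principle assemble or reorder such steps itself once it has seen users run them in sequence.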
On the channel side of the stack, while everybody likes to talk about “openness” and “federations”, cross-platform communication is like controlled nuclear fusion: possible, but nowhere near practical. One certainly doesn’t need AI to bridge that gap, and a number of solutions exist, like the excellent Franz. But the problem we’re trying to solve is not mere chatting, it’s collaboration between humans, AI, and the services abstracted by the Service Adapter, a task that requires taking full advantage of each channel’s API. The bot interface can create a nearly identical user experience across different channel platforms. It can enable users to interact efficiently with other users and the AI elements, creating a virtual unified collaboration environment that spans all supported services and channel platforms.
Conclusion and perspectives
What can a bot-UI platform do for its users?
For casual users: helps skip the web search part in the “decide-search-type-run” cycle. The way to convey the user’s decision to the target system doesn’t have to be exact, like “create github repo named myRepo with description need a repo before you can commit code”; it can be vague: “make a git repo”, or even exploratory: “how do I make a repo?” There is certainly nothing wrong with looking stuff up, but the big search engines are general, so it takes some extra typing to describe the user’s need, and also some extra reading to select the right answer and comprehend it. With NLU interfaces, as soon as we manage to convey which one of the limited number of intents we want, our work is done.
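One way to picture how a vague request like “make a git repo” can still work: once the intent is recognized, the bot checks which slots are missing and asks for them. The sketch below assumes a hypothetical slot schema and a hypothetical `next_step` helper; it is not how any particular platform implements this.

```python
# Hypothetical slot schema: which slots each intent needs before it can run.
REQUIRED_SLOTS = {
    "create_repo": ["name"],
}


def next_step(intent, slots):
    """Decide whether to run the intent or ask the user for a missing slot."""
    for slot in REQUIRED_SLOTS.get(intent, []):
        if not slots.get(slot):
            return {
                "action": "ask",
                "slot": slot,
                "prompt": f"What should the {slot} be?",
            }
    return {"action": "run", "intent": intent, "slots": slots}


if __name__ == "__main__":
    # Vague request: the intent is known, but no repo name was given.
    print(next_step("create_repo", {}))
    # Exact request: everything needed is already there.
    print(next_step("create_repo", {"name": "myRepo"}))
```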
But that’s not all. New users can benefit from the experience absorbed by the platform from interacting with their more seasoned colleagues in the form of suggested slot values (especially hard-to-type ones) and automated sequences of commands — both human-created and machine-suggested.
As more users interact with the bot, it accumulates more training data in the form of both positive and negative examples of intent and named entity values, making it easier and easier to recognize the intents of even completely inexperienced users. Compare that to a classical UI, which is always equally hard when one starts out and only gradually trains its users to interact with it by punishing them for the mistakes they make.
For advanced users: helps memorize complex commands, bring them up easily, edit slot values, and generally “never type anything twice”. Allows the creation of batch sequences and scripts, at the user’s request or as suggested by the bot itself. Seamless Automation is a phenomenon only possible in this environment: one doesn’t need to write scripts or even save sequences. If a collection of commands is repeated enough times by enough users, the platform will notice that and will offer it as a shortcut, complete with arguments if appropriate, or an elicitation dialog if needed.
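A very simple way to approximate “repeated enough times by enough users” is to count short command sequences across each user’s history. The sketch below does exactly that; the function name, the thresholds, and the sample history are our own assumptions, and a real platform would presumably use something more robust than fixed-length sequence counting.

```python
from collections import Counter, defaultdict


def suggest_shortcuts(history, length=2, min_count=3, min_users=2):
    """Find command sequences repeated often enough, by enough users.

    history: list of (user, command) pairs in chronological order.
    """
    # Group commands per user so sequences never span two users.
    per_user = defaultdict(list)
    for user, command in history:
        per_user[user].append(command)

    counts = Counter()
    users = defaultdict(set)
    for user, commands in per_user.items():
        # Slide a window of `length` commands over each user's history.
        for i in range(len(commands) - length + 1):
            seq = tuple(commands[i:i + length])
            counts[seq] += 1
            users[seq].add(user)

    return [
        seq
        for seq, n in counts.most_common()
        if n >= min_count and len(users[seq]) >= min_users
    ]


if __name__ == "__main__":
    history = [
        ("ann", "build"), ("ann", "deploy"),
        ("bob", "build"), ("bob", "deploy"),
        ("ann", "build"), ("ann", "deploy"),
        ("bob", "status"),
    ]
    print(suggest_shortcuts(history))
```

Requiring both a minimum count and a minimum number of distinct users keeps one person’s habit from being offered to the whole team as a shortcut.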
The platform also frees senior engineers from having to train new users by leveraging the training data it has collected and the automation sequences and slot values that have accumulated since the team started using it — information that traditionally is shared at best through documentation, and at worst in 1:1 hand-holding training sessions.
For teams: helps avoid context switching during discussions. Often the easiest way to check something is to try it, but “trying” is an “offline” activity — it would take a human (or two) out of the discussion for way too long. Now you can tell the bot to do it, directly from the same chat medium you use for discussion. Imagine no more “let’s take this offline” or “let me get back to you with that”, and the subsequent “when would be a good time to talk” and “did you have a chance to look at that thing you said you’d look at?”
As an added bonus, from a usability point of view, each UI method works best with a specific set of peripherals: a good keyboard for a CLI, a large, high-resolution screen for a GUI. A bot UI is the least demanding of them all, being perfectly content with a mobile phone. Good news for those of us who hate being tied to a computer.
As we move toward the post-app world, bots and natural language interfaces will become increasingly important. Fronting digital systems with sophisticated, friendly, and highly efficient bot-UI platforms not only makes technological and economic sense, but also offers societal benefits by flattening the learning curve of today’s hottest new services and facilitating automated training and on-boarding. It allows an entirely new category of users to transition to the field of Information Technology, filling many of the chronically under-filled positions, making both workers’ and employers’ dreams come true. 🤖
Can’t wait for the bot? Try it out here.
Universal Natural Language User Interfaces and Seamless Workflow Automation was originally published in Chatbots Magazine on Medium, where people are continuing the conversation by highlighting and responding to this story.