The Conversational Interface is the New Paradigm

In 1962 Thomas Kuhn published The Structure of Scientific Revolutions. In it he posited that science moves forward through brief, dramatic revolutions in its paradigms of thought, each followed by a longer period of assimilating and exploring the change: a step function, if you will, from revolution to revolution. One could say that the brief history of software follows a similar pattern, from the era of the desktop app to the era of the web page to the era of the mobile app and now to the latest paradigm shift, which seems to be happening as we speak: the conversation.

As developers it behooves us to keep up with these dramatic changes, even if the only appeal is the "look, it's new and shiny" impulse some of us have. Certainly, in the short term the hype around conversation bots, assistants, or whatever they end up being called will outrun what is actually possible. Eventually, though, this new paradigm, like all of those before it, will take a long time to work its way into the many aspects of computing.

What follows is an example that is not even a toy app, and we will carry it no further. The goal is to expose you to some of the differences that are already apparent in this next revolution. It is still early, and it is unclear who will win (Siri or Alexa or Facebook Messenger or some unreleased thing from Google or ...) and what the ultimate ecosystem will look like. It does seem clear, though, that whoever wins won't be able to do it all. No one company can write all of the desktop apps or all of the web pages or even all of the mobile apps, and conversational apps will be the same.

We will end up with one or more providers who deliver the interface to users, either via a message line like Slack or WhatsApp or via voice like Siri and Alexa (voice input ultimately gets turned into lines of text too). These providers will most likely sit at the center of an ecosystem that handles NLP (Natural Language Processing), semantic analysis, and other core tasks such as location and calendar integration. So, what does this leave? All of the niche domains served by the many businesses and organizations in the world! It's a huge opportunity. Things like the following:

  1. Ask your local grocery store bot if they have an item currently in stock. e.g. @CurbMarket Do you have any local strawberries today?

  2. Tell a clothing merchant to notify you next time they have a big sale. e.g. @OakHallClothier Tell me when you have your next sale

  3. Use a service to estimate when auto maintenance is due. e.g. @autobot i have a 2011 Toyota Highlander with 48000 miles. Tell me when my next oil change is due.

Botkit

There are many tools for bots today, with new ones arriving, some fading, and others "on the horizon". They range from "bits and pieces" for particulars like dialogs (IBM Dialog) and NLP (IBM AlchemyAPI) all the way to large SDKs for voice and digital assistants (Alexa, Siri, and Google). This non-comprehensive list points to a couple of facts about the current chatbot space: it's early, and a lot of investment is going into it. While all of these warrant investigating if you are interested in this space, the easiest entry point currently is a project called Botkit. It's an open source JavaScript library built by the folks at howdy.ai with some assistance from the folks at Slack. It runs as a Node server that can connect via a socket to Slack's Real Time Messaging (RTM) API, or it can handle webhooks from Slack, Facebook, and Twilio. Botkit provides a simple framework to handle the basics of creating a chat application.

Starting with Slack's Real Time Messaging API

Slack is in some ways the simplest and arguably the most useful of the current platforms. Many teams use Slack daily with some basic integrations. Many of these integrations are bots that appear as users inside Slack and have an online presence in a channel just as a person does.

 

It is very easy to connect a bot once you have a token from Slack:

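var Botkit = require('botkit');

// The Slack bot token is read from the "token" environment variable.
if(!process.env.token) {
  console.log("Must set slack token in env.");
  process.exit(1);
}

// The controller is the entry point for registering handlers.
var controller = Botkit.slackbot({
  debug: false
});

// Spawn a bot with the token and open the real time connection to Slack.
controller.spawn({
  token: process.env.token
}).startRTM(function(err) {
  if(err) {
    throw new Error(err);
  }
});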

The controller above is the core driver that creates the direct connection to Slack via a socket. Once the bot is connected, it can listen for many types of events, such as direct_message or mention, or even more basic things like rtm_open and user_channel_join. Often, though, we just want the bot to hear certain things and react to them:

controller.hears(['hello','hi'], ['mention'], function(bot, msg) {  
  bot.reply(msg, "yello");
});

The above does just that. It registers to hear hello or hi when the bot is mentioned and then it fires the callback which in this case just replies with a yello. In essence, we just performed the hello world of building and integrating a bot with Slack.
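One detail worth knowing: as far as I can tell, Botkit treats each string passed to controller.hears as a case-insensitive regular expression, so a short pattern like hi can also fire on words that merely contain it. The sketch below imitates that matching with plain JavaScript so it runs without Botkit or Slack. The matchesAny helper and the greetings list are illustrative names, not part of Botkit, and the matching rule itself is my assumption about the library's behavior.

```javascript
// Assumption: hears() tests each string pattern as a case-insensitive
// regular expression against the message text. This helper imitates that.
function matchesAny(patterns, text) {
  return patterns.some(function (pattern) {
    return new RegExp(pattern, 'i').test(text);
  });
}

// A bare 'hi' also matches inside other words ("nothing" contains "hi").
console.log(matchesAny(['hello', 'hi'], 'nothing to see')); // true

// Word boundaries restrict the match to whole words.
var greetings = ['\\bhello\\b', '\\bhi\\b'];
console.log(matchesAny(greetings, 'nothing to see')); // false
console.log(matchesAny(greetings, 'Hi there'));       // true
```

If that assumption holds, passing the greetings list to controller.hears in place of ['hello','hi'] should keep the bot from answering yello to unrelated messages.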

A Conversation

While hello world is nice, a modestly complex interaction such as a step-by-step conversation really isn't that much more difficult:

controller.hears(['what', 'you'], ['mention'], function(bot,msg) {  
  bot.startConversation(msg, function(err, convo) {
    convo.say('I help you track vehicle maintenance.');
    convo.say('You tell me about your vehicle and how much you drive.');
    convo.say('then I\'ll keep track of things and notify you when it\'s time for maintenance.' );
    convo.ask('Would you like to know more?', [
        {
          pattern: bot.utterances.yes,
          callback: function(res, convo) {
            convo.say("just tell me to 'add' so I can ask you a couple of questions");
            convo.next();
          }
        },
        {
          pattern: bot.utterances.no,
          callback: function(res, convo) {
            convo.say("awww");
            convo.next();
          }
        },
        {
          default: true,
          callback: function(res, convo) {
            convo.repeat();
            convo.next();
          }
        }
      ]);
  });
});

Once again you register a top level handler with controller.hears. It listens for what or you in a message that mentions the bot by name. When either is heard, the callback fires. In this instance it is bot.startConversation that is most interesting, because it starts a stateful flow with that particular user. Typically, this is the kind of construct you would use to gather the information your app needs from the user. It is analogous in some ways to an HTML form, yet it is more like a dynamic workflow.

The above example does little more than give an overview of what this particular bot might actually do. It's like a help message for the user. First, it gives a brief overview with the convo.say calls, then it asks a question. The ask can handle yes and no. If it gets neither, it falls through to the default and asks again, and again, until it gets a yes or no and can continue. Truly, not very smart, but still a start and a base on which many smarts can be built.

A Multi Step Conversation
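
A multi-step flow might chain several questions, each asked from the callback of the previous answer. What follows is only a sketch under stated assumptions: the 'add' trigger is taken from the bot's reply above, while the questions, the vehicle object, and the function names askVehicle, askMileage, and registerAddFlow are all hypothetical. It uses the same convo.ask, convo.say, convo.repeat, and convo.next calls as the previous example.

```javascript
// Hypothetical first step: ask about the vehicle, then move to the mileage.
function askVehicle(convo, vehicle) {
  convo.ask('What vehicle do you drive?', function (res, convo) {
    vehicle.description = res.text;
    askMileage(convo, vehicle); // queue the next question
    convo.next();
  });
}

// Hypothetical second step: accept only a number, otherwise ask again.
function askMileage(convo, vehicle) {
  convo.ask('What is the current mileage?', function (res, convo) {
    var miles = parseInt(res.text.replace(/,/g, ''), 10);
    if (isNaN(miles)) {
      convo.say('Sorry, I need a number.');
      convo.repeat(); // re-ask, as the default handler does above
    } else {
      vehicle.miles = miles;
      convo.say('Got it: ' + vehicle.description + ' at ' + miles + ' miles.');
    }
    convo.next();
  });
}

// Wiring only: pass in the controller created with Botkit.slackbot().
function registerAddFlow(controller) {
  controller.hears(['add'], ['mention'], function (bot, msg) {
    bot.startConversation(msg, function (err, convo) {
      if (err) { return console.error(err); }
      askVehicle(convo, {});
    });
  });
}
```

Keeping each step in its own function means the flow can branch later, for example by asking how many miles are driven per week, without nesting the callbacks any deeper.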

 

A Foundation to Build Upon

This example of creating a bot that has a presence and can react to textual messages is the foundation of this next revolution. While the examples above are simplistic, they do provide some structure and a view into the basic lines of text that underlie voice and chat applications. These are the starting points for much more sophisticated applications. Botkit itself supports plugging in middleware that can pre- and post-process messages. It would be normal to extend an application with deep language analysis or some kind of machine learning in the recognize-and-trigger portions of the above. Throw in some user context such as location and schedules, and even the limited knowledge that a digital assistant might have about an individual, and the possibilities become plentiful indeed.
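
To make the middleware idea a little more concrete, here is a minimal sketch of a receive middleware that tags each incoming message before any handler sees it. The tagIntent function, the intent property, and the keyword rules are all hypothetical stand-ins for real language analysis. The wiring shown in the trailing comment assumes Botkit's controller.middleware.receive.use hook.

```javascript
// Hypothetical receive middleware: tags each incoming message with a crude
// keyword-based intent. A real app might call an NLP service here instead.
function tagIntent(bot, message, next) {
  if (typeof message.text === 'string') {
    if (/\boil change\b/i.test(message.text)) {
      message.intent = 'oil_change';
    } else if (/\bsale\b/i.test(message.text)) {
      message.intent = 'sale_alert';
    } else {
      message.intent = 'unknown';
    }
  }
  next(); // always hand the message on to the next middleware
}

// Wiring, assuming `controller` comes from Botkit.slackbot() as shown earlier:
//   controller.middleware.receive.use(tagIntent);
```

Handlers registered afterwards could then branch on the tag instead of on raw text, which is where the deeper language analysis would eventually plug in.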

  1. Botkit
  2. Hubot, an alternative from GitHub
  3. Slack API
  4. Twilio messaging
  5. Facebook Messenger
  6. Alexa
  7. Code Example on GitHub