Making of a Smart Business Chatbot: Part 3

Chatbots are a great way to interact with your customers in real-time and gain insights into your users. In this third part of the series on building smart business chatbots, we’ll use a JanusGraph-backed knowledge base to give our chatbot from part 1 and part 2 some utility.

We’ve reached the third part of our Building Smart Business Chatbots series, and now we’re going to use JanusGraph to give our bot the knowledge to go with the chat. We’ll use Watson Conversation to let our users search for articles that match their interests and to respond in conversational form.

Let’s get started...

Graphing It Up

It always helps to have some data before we start coding things up, so let’s start by inputting some articles from the Compose blog into a new JanusGraph database. Follow the first few steps in our article on Markov Chains to spin up a JanusGraph Instance on Compose and get Gremlin up-and-running.

Once you have those going, we’ll create a new database for our articles:

gremlin> :> def graph = ConfiguredGraphFactory.create("composeblog")  
==>standardjanusgraph[astyanax:[10.189.87.4, 10.189.87.3, 10.189.87.2]]
gremlin> :> graph.tx().commit()  

Next, we’ll grab a few articles with various tags and topics from a few different authors. We’ll model them by using vertices for our authors, tags, and articles, and edges to represent the relationships between those vertices.

Let’s go ahead and start building out our graph. We’ll fill in 16 articles from the blog across 3 different authors. First, let’s add the authors:

gremlin> :> graph.tx().commit()  
==>null
gremlin> :> def john = graph.addVertex(T.label, "person", "name", "John O'Connor")  
==>v[4112]
gremlin> :> def abdullah = graph.addVertex(T.label, "person", "name", "Abdullah Alger")  
==>v[8208]
gremlin> :> def dj = graph.addVertex(T.label, "person", "name", "DJ Walker-Morgan")  
==>v[4208]
gremlin> :> graph.tx().commit()  
==>null

Next, we’ll add some tags. We’ll use a sampling of articles from the blog as a starting point and pull the tags directly from them.

Let’s go through each of these articles and extract the relevant tags:

gremlin> :> def mongodb = graph.addVertex(T.label, "tag", "name", "mongodb")  
==>v[8304]
gremlin> :> def janusgraph = graph.addVertex(T.label, "tag", "name", "janusgraph")  
==>v[4232]
gremlin> :> def nodeRed = graph.addVertex(T.label, "tag", "name", "node-red")  
==>v[4304]
gremlin> :> def nodejs = graph.addVertex(T.label, "tag", "name", "nodejs")  
==>v[8328]
gremlin> :> def rabbitmq = graph.addVertex(T.label, "tag", "name", "rabbitmq")  
==>v[4152]
gremlin> :> def elasticsearch = graph.addVertex(T.label, "tag", "name", "elasticsearch")  
==>v[4184]
gremlin> :> def postgres = graph.addVertex(T.label, "tag", "name", "postgres")  
==>v[8280]
gremlin> :> graph.tx().commit()  
==>null

Now that we have our tags, we can input our articles along with their relationship between tags and authors.

gremlin> :> graph.addVertex(T.label, "article", "name", "Taking a Look at Robomongo and Studio 3T with Compose for MongoDB", "url", "https://www.compose.com/articles/taking-a-look-at-robomongo-and-studio-3t-with-compose-for-mongodb/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Avoid Storing Data Inside 'Admin' When Using MongoDB", "url", "https://www.compose.com/articles/avoid-storing-data-inside-admin-when-using-mongodb/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Storing Network Addresses using PostgreSQL", "url", "https://www.compose.com/articles/storing-network-addresses-using-postgresql/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Mastering PostgreSQL Tools: Full-Text Search and Phrase Search", "url", "https://www.compose.com/articles/mastering-postgresql-tools-full-text-search-and-phrase-search/")

gremlin> :> graph.addVertex(T.label, "article", "name", "How to Script Painless-ly in Elasticsearch", "url", "https://www.compose.com/articles/how-to-script-painless-ly-in-elasticsearch/")

gremlin> :> graph.addVertex(T.label, "article", "name", "MQTT and STOMP for Compose RabbitMQ", "url", "https://www.compose.com/articles/mqtt-and-stomp-for-compose-rabbitmq")

gremlin> :> graph.addVertex(T.label, "article", "name", "Elasticsearch 5.4.2 comes to Compose", "url", "https://www.compose.com/articles/elasticsearch-5-4-2-comes-to-compose")

gremlin> :> graph.addVertex(T.label, "article", "name", "Compose PostgreSQL powers up to 9.6", "url", "https://www.compose.com/articles/compose-postgresql-powers-up-to-9-6/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Introduction to Graph Databases", "url", "https://www.compose.com/articles/introduction-to-graph-databases/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Easier Java connections to MongoDB at Compose", "url", "https://www.compose.com/articles/easier-java-connections-to-mongodb-at-compose-2/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Graph 101: Magical Markov Chains", "url", "https://www.compose.com/articles/graph-101-magical-markov-chains/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Building Secure Instant API's with RESTHeart and Compose", "url", "https://www.compose.com/articles/building-secure-instant-apis-with-restheart-and-compose/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Compose Tips: Dates and Dating in MongoDB", "url", "https://www.compose.com/articles/understanding-dates-in-compose-mongodb/")

gremlin> :> graph.addVertex(T.label, "article", "name", "5-minute Signup Forms with Node-RED and Compose", "url", "https://www.compose.com/articles/5-minute-signup-with-node-red-and-compose/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Mongo Metrics: Calculating the Mode", "url", "https://www.compose.com/articles/mongo-metrics-calculating-the-mode/")

gremlin> :> graph.addVertex(T.label, "article", "name", "Building Secure Distributed Javascript Microservices with RabbitMQ and SenecaJS", "url", "https://www.compose.com/articles/building-secure-distributed-javascript-microservices-with-rabbitmq-and-senecajs/")

gremlin> :> graph.tx().commit()

Finally, let's add edges between our articles, authors, and tags so our graph is complete. At this point, you are entering quite a bit of data and, if you're using JanusGraph on Compose, your session might have timed out. Rather than using variable names to add edges to vertices like we did in the previous article, you can look the vertices up directly through the traversal object g (which you get from graph.traversal()):

gremlin> :> def john = g.V(4112).next();  
==>v[4112]

The number inside g.V() is the ID of the vertex. If you're not sure of the ID of the vertex you're looking for, you can use the valueMap() method to figure it out:

gremlin> :> g.V().has(T.label, "person").valueMap(true);  
==>{name=[Abdullah Alger], id=8208, label=person}
==>{name=[John O'Connor], id=4112, label=person}
==>{name=[DJ Walker-Morgan], id=4208, label=person}

First, we'll add edges between each of our articles and authors. Since we may have disconnected by now, we'll use the id of each article to add the edges. We can find that by using the following command:

gremlin> :> g.V().has(T.label, "article").valueMap(true);  
==>{label=article, id=45168, name=[Building Secure Distributed Javascript Microservices with RabbitMQ and SenecaJS], url=[https://www.compose.com/articles/building-secure-distributed-javascript-microservices-with-rabbitmq-and-senecajs/]}
==>{label=article, id=12496, name=[MQTT and STOMP for Compose RabbitMQ], url=[https://www.compose.com/articles/mqtt-and-stomp-for-compose-rabbitmq]}
...

Use the command above to find the IDs of your articles, and remember to replace the ID in each g.V() command below with the ID of the article you want to add an edge to:

gremlin> :> def john = g.V(4112).next();  
==>v[4112]
gremlin> :> g.V(36976).next().addEdge("author", john)  
==>e[he6-sj4-5jp-368][36976-author->4112]
gremlin> :> g.V(41072).next().addEdge("author", john)  
==>e[hse-vow-5jp-368][41072-author->4112]
gremlin> :> g.V(4120).next().addEdge("author", john)  
==>e[1z7-36g-5jp-368][4120-author->4112]
gremlin> :> g.V(16472).next().addEdge("author", john)  
==>e[7ij-cpk-5jp-368][16472-author->4112]
gremlin> :> g.V(20568).next().addEdge("author", john)  
==>e[7wr-fvc-5jp-368][20568-author->4112]
gremlin> :> def abdullah = g.V(8208).next()  
==>v[8208]
gremlin> :> g.V(12496).next().addEdge("author", abdullah)  
==>e[4re-9n4-5jp-6c0][12496-author->8208]
gremlin> :> g.V(12400).next().addEdge("author", abdullah)  
==>e[i6m-9kg-5jp-6c0][12400-author->8208]
gremlin> :> g.V(24688).next().addEdge("author", abdullah)  
==>e[iku-j1s-5jp-6c0][24688-author->8208]
gremlin> :> g.V(16496).next().addEdge("author", abdullah)  
==>e[iz2-cq8-5jp-6c0][16496-author->8208]
gremlin> :> g.V(20592).next().addEdge("author", abdullah)  
==>e[jda-fw0-5jp-6c0][20592-author->8208]
gremlin> :> g.V(8400).next().addEdge("author", abdullah)  
==>e[55m-6hc-5jp-6c0][8400-author->8208]
gremlin> :> def dj = g.V(4208).next();  
==>v[4208]
gremlin> :> g.V(12496).next().addEdge("author", dj)  
==>e[5ju-9n4-5jp-38w][12496-author->4208]
gremlin> :> g.V(32880).next().addEdge("author", dj)  
==>e[jri-pdc-5jp-38w][32880-author->4208]
gremlin> :> g.V(28784).next().addEdge("author", dj)  
==>e[k5q-m7k-5jp-38w][28784-author->4208]
gremlin> :> g.V(12376).next().addEdge("author", dj)  
==>e[8az-9js-5jp-38w][12376-author->4208]
gremlin> :> g.V(12424).next().addEdge("author", dj)  
==>e[4cx-9l4-5jp-38w][12424-author->4208]
gremlin> :> graph.tx().commit()  
==>null

Now, we'll add an edge for each of our topics.

gremlin> :> g.V(45168).next().addEdge("topic", rabbitmq)
==>e[odxce-yuo-28lx-37c][45168-topic->4152]
gremlin> :> g.V(45168).next().addEdge("topic", nodejs)  
==>e[odxqm-yuo-28lx-6fc][45168-topic->8328]
gremlin> :> g.V(12496).next().addEdge("topic", rabbitmq)
==>e[odxcq-9n4-28lx-37c][12496-topic->4152]
gremlin> :> g.V(12400).next().addEdge("topic", mongodb)  
==>e[ody4u-9kg-28lx-6eo][12400-topic->8304]
gremlin> :> g.V(36976).next().addEdge("topic", janusgraph)
==>e[odyj2-sj4-28lx-39k][36976-topic->4232]
gremlin> :> g.V(32880).next().addEdge("topic", mongodb)  
==>e[odyxa-pdc-28lx-6eo][32880-topic->8304]
gremlin> :> g.V(41072).next().addEdge("topic", mongodb)  
==>e[odzbi-vow-28lx-6eo][41072-topic->8304]
gremlin> :> g.V(24688).next().addEdge("topic", elasticsearch)
==>e[odzpq-j1s-28lx-388][24688-topic->4184]
gremlin> :> g.V(4120).next().addEdge("topic", mongodb)  
==>e[odxc3-36g-28lx-6eo][4120-topic->8304]
gremlin> :> g.V(28784).next().addEdge("topic", janusgraph)
==>e[oe03y-m7k-28lx-39k][28784-topic->4232]
gremlin> :> g.V(16496).next().addEdge("topic", postgres)  
==>e[oe0i6-cq8-28lx-6e0][16496-topic->8280]
gremlin> :> g.V(12376).next().addEdge("topic", postgres)  
==>e[odxcb-9js-28lx-6e0][12376-topic->8280]
gremlin> :> g.V(16472).next().addEdge("topic", mongodb)  
==>e[odxqj-cpk-28lx-6eo][16472-topic->8304]
gremlin> :> g.V(20568).next().addEdge("topic", nodeRed)
==>e[ody4r-fvc-28lx-3bk][20568-topic->4304]
gremlin> :> g.V(20568).next().addEdge("topic", mongodb)  
==>e[odyiz-fvc-28lx-6eo][20568-topic->8304]
gremlin> :> g.V(12424).next().addEdge("topic", elasticsearch)
==>e[odxch-9l4-28lx-388][12424-topic->4184]
gremlin> :> g.V(20592).next().addEdge("topic", postgres)  
==>e[oe0we-fw0-28lx-6e0][20592-topic->8280]
gremlin> :> g.V(8400).next().addEdge("topic", mongodb)  
==>e[odxqy-6hc-28lx-6eo][8400-topic->8304]
gremlin> :> graph.tx().commit()  
==>null

If you're paying close attention, you'll notice that I actually doubled-up on some of those topics. One of the most useful things about graph databases is the ability to model relationships as you discover them, rather than having to plan out these relationships ahead of time (as you would with a relational database). We're able to connect multiple topics to the same article simply by adding another edge to the article node.
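To make the edge direction concrete, here's a minimal in-memory sketch in plain JavaScript. It is not JanusGraph code, just an illustration of what the topic edges look like and how walking them backwards finds articles. The edges array and the two helper functions are hypothetical names; the vertex IDs are the ones from the console output above.

```javascript
// Illustration only: a tiny in-memory stand-in for some of the "topic" edges.
// Each edge points from an article vertex to a tag vertex, mirroring
// g.V(articleId).next().addEdge("topic", tagVertex).
const edges = [
  { from: 20568, label: "topic", to: 4304 }, // article 20568 -> node-red
  { from: 20568, label: "topic", to: 8304 }, // same article, second topic: mongodb
  { from: 45168, label: "topic", to: 4152 }, // article 45168 -> rabbitmq
  { from: 45168, label: "topic", to: 8328 }, // same article, second topic: nodejs
];

// Rough equivalent of g.V(tagId).in("topic"): follow edges backwards.
function articlesWithTopic(allEdges, tagId) {
  return allEdges
    .filter((e) => e.label === "topic" && e.to === tagId)
    .map((e) => e.from);
}

// Rough equivalent of g.V(articleId).out("topic"): follow edges forwards.
function topicsOfArticle(allEdges, articleId) {
  return allEdges
    .filter((e) => e.label === "topic" && e.from === articleId)
    .map((e) => e.to);
}
```

Adding a second topic to an article is just one more entry in the list; nothing about the article or the existing edges has to change.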

Now that we have our graph put together, let's run a quick test by querying JanusGraph for all of the articles written by Abdullah:

gremlin> :> g.V(abdullah).in("author").values("name")  
==>Avoid Storing Data Inside 'Admin' When Using MongoDB
==>Taking a Look at Robomongo and Studio 3T with Compose for MongoDB
==>MQTT and STOMP for Compose RabbitMQ
==>Storing Network Addresses using PostgreSQL
==>Mastering PostgreSQL Tools: Full-Text Search and Phrase Search
==>How to Script Painless-ly in Elasticsearch

And for fun, let's see all of the articles with a topic of mongodb:

gremlin> :> def mongo = g.V().has("name", "mongodb").next()  
==>v[8304]
gremlin> :> g.V(mongo).in("topic").values("name")  
==>Compose Tips: Dates and Dating in MongoDB
==>Avoid Storing Data Inside 'Admin' When Using MongoDB
==>Taking a Look at Robomongo and Studio 3T with Compose for MongoDB
==>Building Secure Instant API's with RESTHeart and Compose
==>5-minute Signup Forms with Node-RED and Compose
==>Easier Java connections to MongoDB at Compose
==>Mongo Metrics: Calculating the Mode

That looks about right - we can now ask JanusGraph to find all of the articles written on a particular topic or by a particular author. Now, let's see how we can bring these together by connecting JanusGraph up with our Node-RED application.

Connecting to JanusGraph from Node-RED

We've been building our chatbot with Node-RED hosted on Bluemix, and now it's time to connect our JanusGraph instance to it. The JanusGraph HTTP API can be used to execute Gremlin queries over HTTP, so we'll try this out by using the HTTP Request node in Node-RED.

JanusGraph exposes a single HTTP POST endpoint to execute Gremlin queries. The endpoint expects a JSON-formatted document with a single key (gremlin) that has the value of your Gremlin query:

{
"gremlin": "YOUR_GREMLIN_QUERY_HERE"
}

This API is stateless, which means that, unlike using Gremlin from the command line, we won't be able to use variables across commands. We'll also need to open the graph each time we want to use it (remember, the graph and g we used previously won't be available to us).

Connecting to the API is a two-step process: first, we'll need a session token we can use to authenticate our web calls. These tokens have a timeout of 60 minutes, so we'll need to refresh the tokens periodically. Once we have the token, we'll be able to send requests to JanusGraph with the token in the header of our call.

Generating a Session Token

First, we'll need to generate the session token. Let's start by just using a simple inject node to test our session token web call. Drag an inject node, an http request node, and a debug node onto the canvas. Double-click the http request node and give it a name of JG Auth. Wire them all up so they look like the following:

Then, double-click the JG Auth node to configure it with a method of GET and a URL using the connection string from the Gremlin using Token Authentication section of the Compose dashboard:

Wire them up, click deploy, and click on the button next to the inject node. You should see something like this in the debug panel:

{"token": "<your_token_here>"}

That's the session token you can now use to make requests to your JanusGraph instance.
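As a small sketch of what to do with that response, the helper below turns it into the headers a JanusGraph request needs. The function name headersFromAuthResponse is made up for this example, and it assumes the response body has the {"token": "..."} shape shown above and that the token is sent as "Token <value>" in the Authorization header, as in the request examples in this article.

```javascript
// Sketch: turn the JG Auth response into request headers.
// Assumes the body looks like {"token": "..."} as shown in the debug panel.
function headersFromAuthResponse(payload) {
  // The http request node hands back a string unless it is configured
  // to return a parsed JSON object, so accept both forms.
  const body = typeof payload === "string" ? JSON.parse(payload) : payload;
  if (!body || typeof body.token !== "string" || body.token.length === 0) {
    throw new Error("No token found in JG Auth response");
  }
  return {
    "Authorization": "Token " + body.token,
    "Content-Type": "application/json",
  };
}
```

Inside a function node wired to the output of JG Auth, this could be used as msg.headers = headersFromAuthResponse(msg.payload); before passing the message on.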

Now, let's send a request using that token. Drag another inject node, http request, and debug node onto the canvas, and this time drag a function node onto the canvas as well. Double click each of them to name them, giving the http request node a name of JG Request and the function node a name of JG Query. Then, wire them up like the following:

Double-click the JG Query function node so we can add the token to the msg.headers object and the query to our msg.payload object. We'll also configure our msg.url and msg.method here so we don't have to open the JG Request node, and we'll hard-code the token for now:

// Authenticate with the session token and send the body as JSON.
msg.headers = {
    "Authorization": "Token <your_token_here>",
    "Content-Type": "application/json"
};

// The API is stateless, so every request opens the graph and rebuilds g.
msg.payload = {
    "gremlin": [
        "def graph = ConfiguredGraphFactory.open(\"composeblog\")",
        "def g = graph.traversal()",
        "def abdullah = g.V(8208).next()",
        "g.V(abdullah).in(\"author\").values(\"name\")"
    ].join(";")
};

msg.url = "https://portal2321-22.vigilant-janusgraph-45.jwo.composedb.com:17752";
msg.method = "POST";

return msg;

Notice that we had to basically string together all of the variables that we were using before. That's because the HTTP API for JanusGraph is stateless and does not remember the variables sent to it. We'll have to do this every time we want to make a call.
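Since every call needs the same two opening statements, one option is to wrap them in a small helper. This is only a sketch: buildGremlinPayload is a hypothetical name, and it simply reproduces the statement-joining approach used in the function node above.

```javascript
// Sketch: prepend the statements every stateless request needs,
// then join everything into the single "gremlin" string the API expects.
function buildGremlinPayload(graphName, statements) {
  const prelude = [
    'def graph = ConfiguredGraphFactory.open("' + graphName + '")',
    "def g = graph.traversal()",
  ];
  return { gremlin: prelude.concat(statements).join(";") };
}

// Example: the same query for Abdullah's articles as above.
const abdullahPayload = buildGremlinPayload("composeblog", [
  "def abdullah = g.V(8208).next()",
  'g.V(abdullah).in("author").values("name")',
]);
```

With a helper like this, each function node only has to list the statements that are specific to its query.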

Wire them together and click on the inject node to see your JanusGraph query come to life:

{
  "requestId": "5bac1a3d-9e37-4658-9001-496fc86e0f62",
  "status": {
    "message": "",
    "code": 200,
    "attributes": {}
  },
  "result": {
    "data": [
      "Avoid Storing Data Inside 'Admin' When Using MongoDB",
      "Taking a Look at Robomongo and Studio 3T with Compose for MongoDB",
      "MQTT and STOMP for Compose RabbitMQ",
      "Storing Network Addresses using PostgreSQL",
      "Mastering PostgreSQL Tools: Full-Text Search and Phrase Search",
      "How to Script Painless-ly in Elasticsearch"
    ],
    "meta": {}
  }
}
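The part of that response we care about is result.data. Here's a hedged sketch of a helper that pulls it out and fails loudly otherwise. The name extractResults is hypothetical, and treating a status code of 200 as the only success case is an assumption based on the sample response above; the shape of an error response is also assumed.

```javascript
// Sketch: pull the result rows out of a JanusGraph HTTP response.
// Assumption: status.code === 200 means success, as in the sample above.
function extractResults(response) {
  // Accept either a raw JSON string or an already parsed object.
  const body = typeof response === "string" ? JSON.parse(response) : response;
  const status = (body && body.status) || {};
  if (status.code !== 200) {
    throw new Error(
      "Gremlin request failed with code " + status.code + ": " + (status.message || "")
    );
  }
  // Fall back to an empty list if the server returned no data array.
  return (body.result && body.result.data) || [];
}
```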

Eventually, we'll want to implement these into a flow that can be used to send JanusGraph calls while automatically refreshing the token when it is no longer valid, but for now this should be sufficient.
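As a rough idea of what that refresh logic could look like, here's a sketch of a token cache that asks for a new token shortly before the 60-minute timeout. Everything here is illustrative: createTokenCache, fetchToken, and the 5-minute safety margin are assumptions, and fetchToken is a synchronous stand-in for the JG Auth call, which in a real flow would go through the http request node.

```javascript
// Sketch: cache the session token and refresh it before it expires.
const TOKEN_TTL_MS = 60 * 60 * 1000;    // 60-minute timeout, per the text above
const SAFETY_MARGIN_MS = 5 * 60 * 1000; // assumption: refresh 5 minutes early

// fetchToken: hypothetical function returning a fresh token string.
// now: function returning the current time in milliseconds (e.g. Date.now).
function createTokenCache(fetchToken, now) {
  let token = null;
  let fetchedAt = 0;
  return function getToken() {
    const age = now() - fetchedAt;
    if (token === null || age >= TOKEN_TTL_MS - SAFETY_MARGIN_MS) {
      token = fetchToken();
      fetchedAt = now();
    }
    return token;
  };
}
```

Passing the clock in as a parameter keeps the logic easy to test; in normal use you would pass Date.now.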

Making Conversation

Our last step in this process is to use Watson Conversation to build a dialog of conversational commands that our users can send to traverse our graph of articles. The dialog should allow the user to ask questions of the bot and get data back in a usable form. For example, a question like "How many authors are there?" would yield a response of "3".

We'll start by creating a couple of new intents. The first, called find, will allow us to query for articles either via topic or via author. The second intent, called list, will list the authors and categories so the user knows what to look for.

Click on the Create New button on the intents tab and type #find as the name. Then, add a few phrases that can be used to trigger that intent. You can choose any you'd like:

Do the same thing again, but name the intent list and include phrases that could be used to display a listing of items.

Next, we'll add a few entities to help us qualify our find intent. When a user asks us to find articles, they can ask for articles BY a specific author or ABOUT a specific topic. We'll code these up as an entity called queryType and add some values which will add context to the conversation. Click on the entities tab and click Create new, then add some values and synonyms:

Now, we'll add some dialog so our conversation responds properly to our users. Click on the Dialog tab and click Add Node to add a new dialog entry point. Name it Look For Topic and, under the if bot recognizes tab, type #find. We'll also need to qualify this search, so click the + button next to the #find condition to add a second condition. The second condition contains the @queryType:about entity, which will add information to the response that goes back to Node-RED. In the Then Respond With field, add a response of Here's what I found about.

Let's go ahead and test our dialog. Go into Slack and type the following:

show me articles about mongodb

You should get a response like the following:

Here's what I found about

Responding with Real Data

Now that we have the dialog up and running, it's time to use that conversation to query JanusGraph. First, let's take a look under the hood at what Watson Conversation is sending to Node-RED. The response generated by our first conversation phrase looks like the following:

{
   "intents":[
      {
         "intent":"Find",
         "confidence":0.9929267883300781
      }
   ],
   "entities":[
      {
         "entity":"querytype",
         "location":[
            17,
            22
         ],
         "value":"about",
         "confidence":1
      }
   ],
   "input":{
      "text":"show me articles about mongodb"
   },
   "output":{
      "text":[
         "Here's what I found about"
      ],
      "nodes_visited":[
         "Look For Topic"
      ],
      "log_messages":[

      ]
   }
}

The response has several structures we can use to tailor our final text response, but we'll keep it simple for now. In Node-RED, add a new function node connected to the output of the conversation node and give it a name of select intent. We'll also increase the number of outputs to 2.

In this function, we'll check the intent field and, if the find intent was triggered, we'll extract a topic from the user's input. For now, we'll make the assumption that the topic is simply the last word in the user's input. We'll also save off the output text from Conversation so we can use it later.

Copy the following code into the function node:

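// Save the dialog text from Conversation so we can use it later.
msg.dialog = msg.payload.output.text[0];

// Guard against responses where no intent was recognized.
var intents = msg.payload.intents || [];

if (intents.length > 0 && intents[0].intent === "Find") {
    // Assume the topic is the last word of the user's input.
    var input = msg.payload.input.text.trim().split(/\s+/);
    msg.topic = input[input.length - 1];
    return [msg, null];  // first output: query JanusGraph
}

return [null, msg];      // second output: everything else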
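The last-word assumption breaks down as soon as a topic has more than one word. One possible refinement, sketched below, uses the location of the querytype entity in the Conversation response and treats everything after it as the topic. The helper name extractTopic is hypothetical, and the sketch assumes location holds [start, end) character offsets, which is what the sample response suggests ([17, 22] covers "about").

```javascript
// Sketch: take the text after the querytype entity as the topic,
// falling back to the last word when no such entity is present.
function extractTopic(payload) {
  const text = payload.input.text;
  const entity = (payload.entities || []).find(
    (e) => e.entity === "querytype" && Array.isArray(e.location)
  );
  if (entity) {
    // Assumption: location[1] is the offset just past the entity text.
    const rest = text.slice(entity.location[1]).trim();
    if (rest.length > 0) {
      return rest;
    }
  }
  const words = text.trim().split(/\s+/);
  return words[words.length - 1];
}
```

In the select intent function, msg.topic = extractTopic(msg.payload); would replace the split-and-take-last-word lines.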

Now, copy the JG Query and JG Request nodes we used earlier and paste them near the output of our select intent function. Then, wire them up to the first output.

Next, we'll want to update the query to use the topic we just saved in the select intent node. Double-click the JG Query node and replace the code in it with the following:

msg.headers = {
    "Authorization": "Token <your_auth_token>",
    "Content-Type": "application/json"
};

// Look up the tag vertex by name, then follow its incoming edges to the articles.
msg.payload = {
    "gremlin": [
        "def graph = ConfiguredGraphFactory.open(\"composeblog\")",
        "def g = graph.traversal()",
        "def mongoArticles = g.V().has(T.label, \"tag\").has(\"name\", \""+msg.topic+"\").in()"
    ].join("\n")
};

msg.url = "<your_janusgraph_url>";
msg.method = "POST";

return msg;
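Concatenating user input straight into a Gremlin script means that a stray quote in the user's message can break, or change, the query. Gremlin Server's HTTP API accepts a bindings object alongside the gremlin string for exactly this purpose, so one alternative is sketched below. Whether the Compose endpoint passes bindings through is an assumption you'd want to verify against your own deployment. The function name buildTopicQuery and the binding name topicName are made up for this example, and lower-casing the topic is an assumption based on the lowercase tag names we stored earlier.

```javascript
// Sketch: send the topic as a binding instead of splicing it into the script.
// Assumption: the endpoint honors the "bindings" field of the request body.
function buildTopicQuery(topic) {
  return {
    gremlin: [
      'def graph = ConfiguredGraphFactory.open("composeblog")',
      "def g = graph.traversal()",
      // topicName is resolved from the bindings object below.
      'g.V().has(T.label, "tag").has("name", topicName).in("topic")',
    ].join("\n"),
    // Tags were stored in lowercase (e.g. "mongodb"), so normalize the input.
    bindings: { topicName: String(topic).toLowerCase() },
  };
}
```

In the JG Query function node, this would be used as msg.payload = buildTopicQuery(msg.topic);.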

Finally, make sure your JG Request node is returning parsed JSON. Double-click it and select a parsed JSON object from the Return drop-down. We should now be able to send a message to our Slackbot and turn that directly into a JanusGraph query.

When we execute the query and inspect the response, we get something like the following:

{
  "requestId": "9a0ed1f5-0da5-4ac2-b47a-795cb85ef9fb",
  "status": {
    "message": "",
    "code": 200,
    "attributes": {}
  },
  "result": {
    "data": [
      {
        "id": 4120,
        "label": "article",
        "type": "vertex",
        "properties": {
          "name": [{
            "id": "16r-36g-1l1",
            "value": "Compose Tips: Dates and Dating in MongoDB"
          }],
          "url": [{
            "id": "1kz-36g-4qt",
            "value": "https://www.compose.com/articles/understanding-dates-in-compose-mongodb/"
          }]
        }
      },
      ....

Using this output, we can take the results of the query and format them into a response. Drag a function node onto the canvas and name it Format Slack Message. Wire it to the output of the JG Request node.

In this function, we're going to loop through all of the articles and format them as Slack links so the user can simply click on the articles directly in Slack. Paste the following code into your formatting function:

// Save the incoming payload before we overwrite msg.payload below.
msg.conversationResponse = msg.payload;
msg.headers = {
    "content-type": "application/x-www-form-urlencoded"
};

// Build one Slack link per article. Slack's link syntax is <url|label>.
var articles = msg.payload.result.data;
var articleMap = [];
for (var i = 0; i < articles.length; i++) {
    var name = articles[i].properties.name[0].value;
    var url = articles[i].properties.url[0].value;
    articleMap.push("<" + url + "|" + name + ">");
}

var returnMessage = msg.dialog + " " + msg.topic + "\n" + articleMap.join("\n");
msg.payload = {
    "token": "<your_slack_token_here>",
    "channel": msg._payload.event.channel,
    "text": returnMessage
};

return msg;
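If a topic has no matching articles, the message will contain the dialog text and nothing else. Here's a sketch of the same formatting pulled out into a function that also handles the empty case. The name formatSlackText and the wording of the "nothing found" reply are assumptions; the link format follows Slack's <url|label> syntax, with the URL first.

```javascript
// Sketch: format article vertices from JanusGraph as Slack links.
// vertices: the result.data array from the JanusGraph response.
function formatSlackText(dialog, topic, vertices) {
  if (!Array.isArray(vertices) || vertices.length === 0) {
    // Hypothetical fallback wording for topics with no articles.
    return "I couldn't find any articles about " + topic;
  }
  const links = vertices.map((v) => {
    const name = v.properties.name[0].value;
    const url = v.properties.url[0].value;
    return "<" + url + "|" + name + ">"; // Slack link syntax: <url|label>
  });
  return dialog + " " + topic + "\n" + links.join("\n");
}
```

In the Format Slack Message node, the text field would then become formatSlackText(msg.dialog, msg.topic, msg.payload.result.data), evaluated before msg.payload is overwritten.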

Finally, let's grab the http request node that posts messages to Slack (we'll name this Slack Request), drag it over to the output of our Format Slack Message function, and wire them up.

We've now come full circle and, if everything is working properly, you should be able to get responses like this:

Wrapping it Up

While there are a few more things we can do to add polish to our chatbot, such as automatically renewing our JanusGraph session token, this series serves as a good jumping-off point for anyone interested in building a chatbot for their business.


If you have any feedback about this or any other Compose article, drop the Compose Articles team a line at articles@compose.com. We're happy to hear from you.

attribution Scott Webb

John O'Connor
John O'Connor is a code junky, educator, and amateur dad that loves letting the smoke out of gadgets, turning caffeine into code, and writing about it all.
