Conversation Kit

A directed graph model for conversational UIs

Created as a spare time project by P. Daniel Tyreus - @tyreus

Minimal dependencies
Well documented
Clear, concise API designed to be extended and customized
Redux-style state management for predictability and easier testing

Introduction

Conversation Kit aims to provide a flexible structure for processing conversations
between a human user and a chat bot or voice agent. This project
takes the approach of modeling a conversation as a directed graph.
The nodes of the graph (or vertices if you prefer) are the conversation snippets
spoken by the bot. The edges of the graph connect one node to another and represent
the flow of the conversation based on the interpreted intent of the user.
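To make the model concrete, here is a minimal sketch of a conversation stored as a directed graph. The names GraphSketch, Node, Edge, and next are hypothetical and are not part of Conversation Kit. Staying on the current node when no edge matches is also an assumption made for this illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: none of these types are part of Conversation Kit.
final class GraphSketch {

    // A directed edge, taken when the user's intent matches intentId.
    static final class Edge {
        final String intentId;
        final int endNodeId;

        Edge(String intentId, int endNodeId) {
            this.intentId = intentId;
            this.endNodeId = endNodeId;
        }
    }

    // A node: a snippet spoken by the bot plus its outbound edges, in order.
    static final class Node {
        final int id;
        final String snippet;
        final List<Edge> outbound = new ArrayList<>();

        Node(int id, String snippet) {
            this.id = id;
            this.snippet = snippet;
        }
    }

    final Map<Integer, Node> nodes = new HashMap<>();

    void addNode(int id, String snippet) {
        nodes.put(id, new Node(id, snippet));
    }

    void addEdge(int from, String intentId, int to) {
        nodes.get(from).outbound.add(new Edge(intentId, to));
    }

    // Follow the first outbound edge whose intent matches; stay put if none does.
    int next(int currentId, String intentId) {
        for (Edge edge : nodes.get(currentId).outbound) {
            if (edge.intentId.equals(intentId)) {
                return edge.endNodeId;
            }
        }
        return currentId;
    }

    public static void main(String[] args) {
        GraphSketch graph = new GraphSketch();
        graph.addNode(1, "How are you feeling today?");
        graph.addNode(2, "Glad to hear it.");
        graph.addNode(3, "Sorry to hear that.");
        graph.addEdge(1, "FEELING_GOOD", 2);
        graph.addEdge(1, "FEELING_BAD", 3);
        System.out.println(graph.nodes.get(graph.next(1, "FEELING_BAD")).snippet);
    }
}
```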

Below is an example of a specialized form of a directed graph conversation known
as a dialog tree. In this case each
node spoken by the bot requires a response from the user. Each edge directs the
conversation to the next node based on the response chosen.

Dialog Tree

Conversations can quickly get more complicated, with loops and multi-level flow. Below
is an example of the conversation graph for a chatbot to log allergy symptoms.

Conversation Graph

Installation

The artifacts are available on Maven Central:

<dependency>
  <groupId>com.conversationkit</groupId>
  <artifactId>conversation-kit</artifactId>
  <version>2.0.1</version>
</dependency>

Directed Conversations

Conversation Kit takes a more generalized approach to modeling conversations. A
DirectedConversationEngine starts with an initial state that specifies a start
node. The engine accepts a message from a user and delegates to a natural language
understanding (NLU) system to determine the user’s intent. The DirectedConversationEngine
then looks at all the outbound edges from the start node and picks
the first one that matches the intent and returns true from its validate() method. The engine
then proceeds to the target node for the matching edge and waits for the next user input.

//handle an incoming message from a user
MessageHandlingResult result = engine.handleIncomingMessage("hello").get();
//get the node that the conversation has progressed to
ConversationNode currentNode = index.getNodeById(engine.getState().getCurrentNodeId());

You would then check to see what action the current node is set to perform or
what messages it should send. This will depend on the node implementation, but
a simple example to get a list of messages for the bot to respond with might look like:

for (JsonValue message : currentNode.getMetadata().get("message").asArray()) {
    String m = message.asString();
    //send m
}

JSON Conversation Graphs

This project uses JSON Graph Format to store the
graph representation of conversations. Conversation Kit provides some classes
for reading JSON in this format and creating the internal graph representation.
You can use any format you prefer by writing your own JsonGraphBuilder implementation.

Conversation State

The conversation state is a data store designed to persist a user’s progress
through a conversation, help customize the messages sent by the bot to the
user, and to save data from user responses during the conversation. In many
cases the implementation will be backed by a database or other permanent
storage.

Conversation Kit ships with an abstract IConversationState implementation that shows
how the state can easily be stored using a HashMap.
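As a rough illustration of the idea, the sketch below keeps the current node ID and any collected answers in a single HashMap. The class MapBackedState and its methods are hypothetical. They do not reproduce the IConversationState API.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch of a map-backed conversation state; not the library's API.
final class MapBackedState {
    private static final String CURRENT_NODE_KEY = "currentNodeId";
    private final Map<String, Object> data = new HashMap<>();

    MapBackedState(int startNodeId) {
        data.put(CURRENT_NODE_KEY, startNodeId);
    }

    // The user's progress through the conversation.
    int getCurrentNodeId() {
        return (Integer) data.get(CURRENT_NODE_KEY);
    }

    void setCurrentNodeId(int nodeId) {
        data.put(CURRENT_NODE_KEY, nodeId);
    }

    // Data saved from user responses, e.g. a name used to customize later messages.
    void put(String key, Object value) {
        data.put(key, value);
    }

    Object get(String key) {
        return data.get(key);
    }

    public static void main(String[] args) {
        MapBackedState state = new MapBackedState(1);
        state.put("name", "Ada");
        state.setCurrentNodeId(2);
        System.out.println("Node " + state.getCurrentNodeId() + ", name " + state.get("name"));
    }
}
```

A database-backed version could keep the same accessors and swap the map for reads and writes against permanent storage.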

Redux

In writing and using version 1 of this framework, I realized that state management
quickly became non-trivial in larger applications. While updating Conversation Kit
for version 2, I became inspired by a number of other projects
I was working on that were using Redux or a derivative for
state management. Redux is traditionally used in front end code to build graphical user
interfaces (GUIs). At its core it features a predictable, centralized container for application
state. While a conversational user interface is somewhat different from a GUI, it
can still benefit tremendously from a Redux-style store.

Redux in Java

Redux is primarily a JavaScript library and I could not find an implementation I liked
in Java. I suspect this is because Redux relies heavily on functional programming
concepts which were not as widely supported in Java when Redux was becoming popular. Java 8
has nice support for functional programming. Redux has a fairly small API, so I
wrote my own implementation
for this project. I may pull that out into a separate project at some point.

Typed State

The implementation I wrote is fairly consistent with the JavaScript version. The main
difference is that I wanted to use a typed state instead of a JavaScript object (or Java HashMap) for the
external API. This just means that the Store constructor must take an additional
argument of a Function that
accepts a HashMap and returns the typed state.
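The following is a minimal sketch of that idea, not the Store class shipped with this project. TypedStoreSketch, CounterState, and the string-based actions are all hypothetical. The point is only that the store keeps a raw HashMap internally and hands callers a typed view built by the mapping Function.

```java
import java.util.HashMap;
import java.util.function.BiFunction;
import java.util.function.Function;

// Hypothetical sketch of a Redux-style store with a typed view; not the library's Store.
final class TypedStoreSketch<S> {

    // Typed wrapper over the raw map (assumed shape, for illustration only).
    static final class CounterState {
        private final HashMap<String, Object> raw;

        CounterState(HashMap<String, Object> raw) {
            this.raw = raw;
        }

        int getCount() {
            return (Integer) raw.getOrDefault("count", 0);
        }
    }

    private HashMap<String, Object> raw;
    private final BiFunction<String, HashMap<String, Object>, HashMap<String, Object>> reducer;
    private final Function<HashMap<String, Object>, S> toTypedState;

    TypedStoreSketch(BiFunction<String, HashMap<String, Object>, HashMap<String, Object>> reducer,
                     HashMap<String, Object> initial,
                     Function<HashMap<String, Object>, S> toTypedState) {
        this.reducer = reducer;
        this.raw = initial;
        this.toTypedState = toTypedState;
    }

    // The reducer returns the next map; the store simply swaps it in.
    void dispatch(String action) {
        raw = reducer.apply(action, raw);
    }

    // Callers only ever see the typed state.
    S getState() {
        return toTypedState.apply(raw);
    }

    // Copies the map before changing it, so the previous state is left untouched.
    static HashMap<String, Object> counterReducer(String action, HashMap<String, Object> state) {
        if (!"INCREMENT".equals(action)) {
            return state;
        }
        int count = (Integer) state.getOrDefault("count", 0);
        HashMap<String, Object> next = new HashMap<>(state);
        next.put("count", count + 1);
        return next;
    }

    public static void main(String[] args) {
        TypedStoreSketch<CounterState> store = new TypedStoreSketch<>(
                TypedStoreSketch::counterReducer, new HashMap<String, Object>(), CounterState::new);
        store.dispatch("INCREMENT");
        System.out.println(store.getState().getCount());
    }
}
```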

Application State

By default the Redux implementation only handles the conversation state. There is no
requirement for the rest of your application to interact with it. However, since it
is useful to have a centralized state, you can easily add additional state slices
and reducer functions to the store. See
ConversationGraphTest
for a complete example on how to construct a typed state with multiple reducers.

Nodes

A conversation node is a vertex on the directed conversation graph containing
content for the bot to present to the user. Each node has
zero or more outbound edges and zero or more inbound edges. The conversation
traverses the graph between nodes by looking at the user’s intent and choosing
the first matching edge at each vertex.

Each node contains a conversation snippet that represents a small bit of dialog in a conversation.
In the case of a chat bot, this might represent a block of text sent as one
message. The content is stored in the metadata field of the node as JSON. The
structure will be highly implementation dependent and is designed to be completely
flexible. For a voice assistant, the metadata might store the link to an audio file.
For a Facebook Messenger bot, one node might hold the JSON representing a button and
another some text.

Creating a node from the JSON is the responsibility of a JsonNodeBuilder.

@FunctionalInterface
public interface JsonNodeBuilder<N extends IConversationNode> {
    public N nodeFromJson(Integer id, String type, JsonObject metadata) throws IOException;
}

In the case of a Dialog Tree, a
DialogTreeNodeBuilder takes JSON that looks something like
the following and creates a DialogTreeNode from it.

{
    "id": "1",
    "type": "DialogTree",
    "label": "1",
    "metadata": {
        "message": ["Hello I'm a test bot.", "How are you feeling today?"]
    }
}

DialogTreeNode

A dialog tree is a type of branching conversation often seen in adventure
video games. The user is given a choice of what to say and makes subsequent
choices until the conversation ends. The responses to the user are scripted
based on the choices made. A Dialog Tree would be a good choice to model a
conversation when your UI does not allow free-form responses, like a
questionnaire.

A DialogTreeNode is a restricted implementation of
IConversationNode that
holds a text string to represent the displayed conversation snippet and
retrieves a list of allowed responses from the outbound edges. There is a
working example of how to model, build, and use a Dialog Tree in the
DialogTreeTest.
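For readers who want the shape of the idea before opening the test, here is a small self-contained sketch. DialogTreeSketch and TreeNode are hypothetical names and do not mirror the DialogTreeNode API. Each outbound edge pairs one allowed answer with the node it leads to, and an answer that is not allowed is assumed to leave the user on the same node.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of a dialog tree; not the library's DialogTreeNode.
final class DialogTreeSketch {

    static final class TreeNode {
        final String text;
        private final List<String> answers = new ArrayList<>();
        private final List<TreeNode> targets = new ArrayList<>();

        TreeNode(String text) {
            this.text = text;
        }

        // Each outbound edge pairs one allowed answer with the node it leads to.
        TreeNode addEdge(String answer, TreeNode target) {
            answers.add(answer);
            targets.add(target);
            return this;
        }

        // The allowed responses are read off the outbound edges.
        List<String> allowedResponses() {
            return new ArrayList<>(answers);
        }

        // Follow the edge for the chosen answer; stay on this node if it is not allowed.
        TreeNode choose(String answer) {
            int i = answers.indexOf(answer);
            return i >= 0 ? targets.get(i) : this;
        }
    }

    public static void main(String[] args) {
        TreeNode root = new TreeNode("How are you feeling today?")
                .addEdge("Good", new TreeNode("Glad to hear it."))
                .addEdge("Not great", new TreeNode("Sorry to hear that."));
        System.out.println(root.allowedResponses());
        System.out.println(root.choose("Good").text);
    }
}
```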

ConversationNode

A ConversationNode is a more general implementation of IConversationNode. Most
likely you will want to use or extend this for your node implementation. See
DirectedConversationEngineTest.

Edges

A conversation edge is a directed connection between two nodes on the
conversation graph. Each edge has exactly one start node and one end node,
but a node frequently has multiple outbound edges. The conversation
implementation will look at each outbound edge from a node in sequence to decide which
edge to use to continue traversing the conversation graph.

public interface IConversationEdge<I, S extends IConversationState> {
    public Integer getEndNodeId();
    public String getIntentId();
    public boolean validate(I intent, S state);
    public List<Object> getSideEffects(I intent, S state);
}

After the DirectedConversationEngine determines a user’s intent from a message,
it evaluates the outbound edges for the current node to move the conversation to
the next node. The engine iterates over each outbound edge to find the first to:

Match the intent’s ID with getIntentId()
Return true from validate(I intent, S state)

Once a match is found, the engine dispatches the side effect actions to the
internal Redux store from getSideEffects(I intent, S state).
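The selection rule described above can be sketched as follows. This is not the engine's actual code: EdgeSelectionSketch, its Edge class, and the use of plain strings for intents and side effects are assumptions made to keep the example self-contained.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of first-match edge selection; not the engine's actual code.
final class EdgeSelectionSketch {

    static final class Edge {
        final String intentId;
        final int endNodeId;
        final Predicate<Map<String, Object>> validate;
        final List<String> sideEffects;

        Edge(String intentId, int endNodeId,
             Predicate<Map<String, Object>> validate, List<String> sideEffects) {
            this.intentId = intentId;
            this.endNodeId = endNodeId;
            this.validate = validate;
            this.sideEffects = sideEffects;
        }
    }

    // Edges are tested in order; the first that matches the intent ID and validates wins.
    static Optional<Edge> firstMatch(List<Edge> outbound, String intentId,
                                     Map<String, Object> state) {
        return outbound.stream()
                .filter(edge -> edge.intentId.equals(intentId))
                .filter(edge -> edge.validate.test(state))
                .findFirst();
    }

    public static void main(String[] args) {
        Map<String, Object> state = Collections.singletonMap("logged_in", true);
        List<Edge> outbound = Arrays.asList(
                new Edge("YES", 2, s -> false, Collections.emptyList()),
                new Edge("YES", 3, s -> Boolean.TRUE.equals(s.get("logged_in")),
                        Collections.singletonList("ANSWER_SAVED")));
        firstMatch(outbound, "YES", state).ifPresent(edge ->
                System.out.println("Go to node " + edge.endNodeId
                        + ", dispatch " + edge.sideEffects));
    }
}
```

Returning an Optional leaves the "no edge matched" case to the caller, which is one reasonable choice for a sketch like this.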

Validation Function

Use the validate function for cases where there are multiple edges with the same
intent or the intent requires preconditions in the state. For example, consider an
agent that takes food orders. There could be an intent to order a burger (e.g. ORDER_BURGER). The node could
have two outbound edges with the intent ORDER_BURGER, one to handle ordering a cheeseburger
and the other to handle ordering a hamburger. In this case, both statements

“I would like a cheeseburger.”
“Let me have a regular burger.”

would match the ORDER_BURGER intent. But the validate function on each edge could
look at the intent slots to only return true for the type of burger it is looking for.

Another way of achieving the same result in this case would be to have two different
intents ORDER_HAMBURGER and ORDER_CHEESEBURGER. But slot filling is a pretty
handy feature of most NLU engines and the validate function lets you apply logic to
the conversation flow based on slot values.

A second use case for validate is to make sure that a precondition in the state
is met before proceeding along an edge. For example, if there are no burgers ready yet,
the state could have a key burgers_ready:false. In this case all ORDER_BURGER edges
may want to return false for validate to direct the user to order something different.
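Both uses of validate can be combined in one small sketch. Everything here is hypothetical: the Intent class, the slot name burger_type, and the validateFor helper are assumptions for illustration, and only the burgers_ready key comes from the example above.

```java
import java.util.Collections;
import java.util.Map;
import java.util.function.BiPredicate;

// Hypothetical sketch of edge validation; the slot name "burger_type" is an assumption.
final class BurgerValidationSketch {

    static final class Intent {
        final String id;
        final Map<String, String> slots;

        Intent(String id, Map<String, String> slots) {
            this.id = id;
            this.slots = slots;
        }
    }

    // Builds a validate function for an ORDER_BURGER edge that handles one burger type.
    static BiPredicate<Intent, Map<String, Object>> validateFor(String burgerType) {
        return (intent, state) ->
                // Precondition in the state: burgers must be ready.
                Boolean.TRUE.equals(state.get("burgers_ready"))
                        // Slot check: only accept the burger type this edge handles.
                        && burgerType.equals(intent.slots.get("burger_type"));
    }

    public static void main(String[] args) {
        Intent intent = new Intent("ORDER_BURGER",
                Collections.singletonMap("burger_type", "cheeseburger"));
        Map<String, Object> state = Collections.singletonMap("burgers_ready", true);
        System.out.println(validateFor("cheeseburger").test(intent, state));
        System.out.println(validateFor("hamburger").test(intent, state));
    }
}
```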

Side Effects

Side effects are another concept borrowed from Redux. Side effects represent any
actions that should be taken by the application as a result of matching an intent
and moving the conversation to the next node. Side effects are dispatched to the
Redux store and should be an instance of Action. A common side effect is to update
the state with the results of the previous intent. From the above example, the edge
matching ORDER_BURGER might dispatch a CHEESEBURGER_ORDER_RECEIVED action. The reducer
could then update the state for the user from current_order:['fries','coke'] to
current_order:['fries','coke','cheeseburger'].
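A reducer for that example might look roughly like the sketch below. The class OrderReducerSketch is hypothetical, and the action is represented by its type string alone rather than by the library's Action class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch of a reducer for the burger example; not the library's API.
final class OrderReducerSketch {

    // Pure reducer: returns a new order list and never mutates the current one.
    static List<String> reduce(List<String> currentOrder, String actionType) {
        if ("CHEESEBURGER_ORDER_RECEIVED".equals(actionType)) {
            List<String> next = new ArrayList<>(currentOrder);
            next.add("cheeseburger");
            return Collections.unmodifiableList(next);
        }
        // Unknown actions leave the state unchanged.
        return currentOrder;
    }

    public static void main(String[] args) {
        List<String> before = Arrays.asList("fries", "coke");
        List<String> after = reduce(before, "CHEESEBURGER_ORDER_RECEIVED");
        System.out.println(before + " -> " + after);
    }
}
```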

Any side effect that is an instance of Action is processed synchronously. In other
words the state will be updated before the engine proceeds to the next conversation
node. Conversation Kit also supports asynchronous actions using the CompletableFutureMiddleware.
If the instance of the side effect is a Future, the action will be handled by the
middleware and processed asynchronously. This
is useful for longer running tasks that don’t necessarily need to finish to let the
conversation proceed.
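The difference between the two kinds of side effect can be sketched with a toy dispatcher. This is not the CompletableFutureMiddleware itself. SideEffectDispatchSketch and its fields are hypothetical, and the sketch assumes the asynchronous side effect is a CompletableFuture whose result is the action to apply.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;

// Hypothetical dispatcher; not the library's CompletableFutureMiddleware.
final class SideEffectDispatchSketch {
    // Actions that have been applied, in the order they took effect.
    final List<String> applied = new CopyOnWriteArrayList<>();
    // Handles for asynchronous side effects that are still being tracked.
    final List<CompletableFuture<Void>> pending = new ArrayList<>();

    // Plain actions are applied before dispatch returns; futures are applied on completion.
    void dispatch(Object sideEffect) {
        if (sideEffect instanceof CompletableFuture) {
            CompletableFuture<?> future = (CompletableFuture<?>) sideEffect;
            pending.add(future.thenAccept(result -> applied.add(String.valueOf(result))));
        } else {
            applied.add(String.valueOf(sideEffect));
        }
    }

    public static void main(String[] args) {
        SideEffectDispatchSketch dispatcher = new SideEffectDispatchSketch();
        dispatcher.dispatch("ORDER_RECEIVED");
        dispatcher.dispatch(CompletableFuture.supplyAsync(() -> "RECEIPT_EMAILED"));
        // Wait here only so the demo can print the final list.
        CompletableFuture.allOf(dispatcher.pending.toArray(new CompletableFuture[0])).join();
        System.out.println(dispatcher.applied);
    }
}
```

In a real conversation the caller would not block on the pending futures. The conversation moves on and the state catches up when the task finishes.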

Natural Language Understanding / Intent Processing

The edges on the conversation graph are matched to the interpreted purpose (i.e. intent) of
the user’s last message. Determining intent from an utterance is in the domain of Natural
Language Understanding (NLU). Conversation Kit does not provide sophisticated NLU capabilities. Instead
it is designed to integrate with any external NLU service. To use any third party
NLU service, extend IntentDetector and pass an instance of
the class to the DirectedConversationEngine.

NLU Services

There are several well-known vendors who offer Natural Language Understanding as a service.

Microsoft Cognitive Services Language Understanding (LUIS)
Amazon Web Services Lex
Google Dialogflow
Facebook’s wit.ai

All of the commercial NLU systems use some type of advanced deep learning technology to
provide the language understanding. For testing and prototyping purposes,
Conversation Kit includes a RegexIntentDetector. This is a very primitive NLU that
just relies on RegEx matching to determine intent. The RegexIntentDetector is not
intended for production use.
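To show what regex-based intent detection amounts to, here is a minimal self-contained sketch. It is not the RegexIntentDetector shipped with this project. RegexIntentSketch, register, and detect are hypothetical names, and the patterns are made up for the example.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical sketch of regex-based intent detection; not the library's RegexIntentDetector.
final class RegexIntentSketch {
    // LinkedHashMap keeps insertion order, so patterns registered earlier win.
    private final Map<String, Pattern> patterns = new LinkedHashMap<>();

    RegexIntentSketch register(String intentId, String regex) {
        patterns.put(intentId, Pattern.compile(regex, Pattern.CASE_INSENSITIVE));
        return this;
    }

    // Returns the first intent whose pattern occurs anywhere in the message.
    Optional<String> detect(String message) {
        for (Map.Entry<String, Pattern> entry : patterns.entrySet()) {
            if (entry.getValue().matcher(message).find()) {
                return Optional.of(entry.getKey());
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        RegexIntentSketch detector = new RegexIntentSketch()
                .register("GREETING", "\\b(hi|hello)\\b")
                .register("ORDER_BURGER", "burger");
        System.out.println(detector.detect("I would like a cheeseburger."));
        System.out.println(detector.detect("What time is it?"));
    }
}
```

A detector like this has no notion of slots or confidence, which is part of why it only suits testing and prototyping.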

For production use, conversation-kit has modules for Lex and Dialogflow.

Putting It All Together

For an example of a conversation graph with nodes, edges, and side effects all loaded from a JSON file, see
ConversationGraphTest.

Directed Conversation Test

If you have questions or suggestions, you can contact me
on Twitter.