In this chapter we'll look at testing bots and see how a clean separation of concerns can increase their testability.
Let's say we're building a chatbot which allows us to play music.
const { musicService } = require('./services');

const bot = async (req, res) => {
  const artistResult = /play (.*)/i.exec(req.text);
  if (artistResult) {
    const artist = artistResult[1];
    await musicService.playArtist(artist);
    res.send(`Playing ${artist}`);
    return;
  }
  const randomResult = /play random/i.test(req.text);
  if (randomResult) {
    await musicService.playRandom();
    res.send(`Playing random music`);
    return;
  }
}
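How req and res reach bot depends entirely on the chatbot framework in play. Purely to illustrate the shape bot expects, here's a hypothetical console harness (it's a sketch, not part of any framework discussed in this chapter):

// Hypothetical harness -- just to show the req/res shape bot expects.
const readline = require('readline');

const rl = readline.createInterface({ input: process.stdin, output: process.stdout });

rl.on('line', line => {
  const req = { text: line };                      // bot reads req.text
  const res = { send: text => console.log(text) }; // bot replies via res.send
  bot(req, res).catch(console.error);
});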
Now let's test it. We need to create a mocked version of res so that we can easily examine the results of send, and create stubs for our musicService with Sinon.JS to ensure our tests don't actually call the service:
require('mocha');
const { expect } = require('chai');
const { musicService } = require('./services');
const { createSandbox } = require('sinon');

describe("bot", () => {
  const tests = [
    ['play artist', ['Playing artist']],
    ['play random', ['Playing random music']],
  ];
  const sandbox = createSandbox();
  beforeEach(() => {
    sandbox.stub(musicService, 'playArtist').resolves();
    sandbox.stub(musicService, 'playRandom').resolves();
  });
  afterEach(() => {
    sandbox.restore();
  });
  for (const [text, expectedResponses] of tests) {
    it(`should respond "${expectedResponses}" when user says "${text}"`, done => {
      const actualResponses = [];
      // mock res to add each response to our array
      const res = {
        send: text => actualResponses.push(text)
      };
      bot({ text }, res).then(() => {
        expect(actualResponses).deep.equals(expectedResponses);
        done();
      });
    });
  }
});
The results look good!
bot
  ✓ should respond "Playing artist" when user says "play artist"
  ✓ should respond "Playing random music" when user says "play random"

2 passing (11ms)
But these successful test results hide a problem: the test for playing a specific artist also matches when someone says "play random".
We could (should?) fix this by looking at the call counts for playArtist and playRandom. While this would work for our relatively simplistic bot, which contains two possible outcomes, each mapping to a single service call, it doesn't scale to anything remotely real world.
It's common for multiple calls to be made, and refactorings to occur. The former means lots of code to ensure the right functions are being called, and the latter requires cascading updates to all of the various utterances and logic paths we're testing. It won't take long before our test code becomes unwieldy.
Our situation is a direct result of the imperative style used to code bot -- at some point a decision is made to take an action, the action is taken, and that's that. The function representing our bot is both determining what action to perform and performing the action. It's hard to test, which is often an indication our code could be improved.
There's a different approach, which is to split this into two steps:
- One function which returns a result indicating which action to take
- One function that performs that action
Our test code can separately test the first function, and the action that it should take. And our running code would simply execute both steps.
Let's create a class representing "an action to take":
class ActionReference {
  constructor(name, ...args) {
    this.name = name;
    this.args = args;
  }
}
Now we can recode bot, starting with the decision-making piece:
const botLogic = async (req) => {
  const artistResult = /play (.*)/i.exec(req.text);
  if (artistResult) {
    return new ActionReference('playArtist', artistResult[1]);
  }
  const randomResult = /play random/i.test(req.text);
  if (randomResult) {
    return new ActionReference('playRandom');
  }
  // etc.
}
You may wonder why we created an ActionReference class when botLogic could simply have returned an anonymous object literal, as demonstrated below:
if (artistResult)
  return {
    name: 'playArtist',
    args: [artistResult[1]]
  };
Spoiler: later on we will create functions that return other values, and it's helpful to have a way to positively distinguish an ActionReference (by testing instanceof ActionReference) from non-ActionReference objects that might happen to include the same properties.
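As a quick sketch of that distinction:

// A plain object with the same shape is still not an ActionReference:
const literal = { name: 'playArtist', args: ['artist'] };
const reference = new ActionReference('playArtist', 'artist');

console.log(literal instanceof ActionReference);   // false
console.log(reference instanceof ActionReference); // true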
Note that botLogic only takes req as a parameter. In our refactor, botLogic only makes decisions about which action to take; it doesn't actually execute the action.
We'll recode our tests accordingly:
describe("botLogic", () => {
const tests = [
['play artist', 'playArtist', 'artist'],
['play random', 'playRandom'],
]
for (const [text, name, ...args] of tests) {
it(`should call ${name}(${args.join(', ')}) when user says "${text}"`, done => {
botLogic({ text }).then(result => {
expect(result).deep.equals(new ActionReference(name, ...args));
done();
});
});
}
})
Our new test results look like this:
botLogic
  ✓ should call playArtist(artist) when user says "play artist"
  1) should call playRandom() when user says "play random"

1 passing (11ms)
1 failing

1) should call playRandom() when user says "play random"
   AssertionError: expected { name: 'playArtist', args: [ 'random' ] } to deeply equal { name: 'playRandom', args: [] }
   + expected - actual

   {
   -  "args": [
   -    "random"
   -  ],
   -  "name": "playArtist"
   +  "args": [],
   +  "name": "playRandom"
   }
Now our test correctly spots our bug. Someone should really fix that thing.
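For completeness, one possible fix -- deliberately not applied here, since the lingering bug will keep proving useful -- is to check the more specific utterance first. A sketch (botLogicFixed is a hypothetical name):

// A possible fix (not applied in this chapter): match "play random" before
// the general "play <artist>" pattern so the artist branch can't swallow it.
const botLogicFixed = async req => {
  if (/play random/i.test(req.text))
    return new ActionReference('playRandom');
  const artistResult = /play (.*)/i.exec(req.text);
  if (artistResult)
    return new ActionReference('playArtist', artistResult[1]);
};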
All we need now is the final step - take the action. Here's one approach:
const botAction = async (res, action) => {
  if (!(action instanceof ActionReference))
    return;
  switch (action.name) {
    case 'playArtist':
      await musicService.playArtist(action.args[0]);
      res.send(`Playing ${action.args[0]}`);
      return;
    case 'playRandom':
      await musicService.playRandom();
      res.send(`Playing random music`);
      return;
    default:
      throw new Error(`Unknown name ${action.name}`);
  }
}
Note that botAction takes res as an argument, but not req -- that's because all the arguments it needs are in action.args. botAction is not analyzing what to do, as botLogic handles that responsibility; botAction is solely responsible for performing the action indicated by the ActionReference.
Now we can test each action individually. We might still need to resort to verifying the output, but that's fine, as we have solved the problem of "guess which code produced this output?" The code below does still assume a single service call for each action, but this could easily be updated as the calls grow and change.
describe("botAction", () => {
const tests = [
[['playArtist', 'artist'], ['Playing artist']],
[['playRandom'], ['Playing random music']],
];
for (const [[name, ...args], expectedResponses] of tests) {
it(`should respond "${expectedResponses}" on ${name}(${args.join(', ')})`, done => {
const actualResponses = [];
const stub = sinon.stub(musicService, name).resolves();
const res = {
send: text => actualResponses.push(text)
};
botAction(res, new ActionReference(name, args)).then(done => {
expect(actualResponses).deep.equals(expectedResponses);
expect(stub.calledWith(...args)).to.be.true();
done();
});
}
}
});
These tests look like this:
botAction
  ✓ should respond "Playing artist" on playArtist(artist)
  ✓ should respond "Playing random music" on playRandom()

2 passing (11ms)
This shows (correctly) that our issue was with the logic determining which action to take, and not with the implementation of the action itself.
As promised, we have split bot into two functions -- one which returns a result indicating which action to take, independent of how that action is coded, and one which takes that action, independent of the decision-making process which selected it.
We come full circle by combining the two:
const bot = (req, res) => botLogic(req).then(result => botAction(res, result));
This new bot is functionally equivalent to the original bot, but its concerns have been cleanly separated. Of course, there's nothing stopping us from integration testing bot using our original tests. The fact that the integration tests pass when the unit tests fail is a great example of the value of unit testing.
Note: some chatbot systems may not use cleanly separated req and res arguments -- that's too bad, because it makes it harder to enforce this desirable separation of concerns.
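If you're stuck with such a system, a small adapter can usually restore the separation. Here's a sketch that assumes a hypothetical framework handing us a single ctx object with a message.text field and a reply function (both names are assumptions, not a real API):

// Hypothetical adapter -- ctx.message.text and ctx.reply are assumed,
// not a real framework API.
const adaptContext = ctx => ({
  req: { text: ctx.message.text },
  res: { send: text => ctx.reply(text) },
});

// Usage:
// const { req, res } = adaptContext(ctx);
// await bot(req, res);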
This is progress, but we've introduced some new potential problems:
- Putting every action into a giant switch statement is a little awkward.
- botLogic can create ActionReferences whose names do not exist -- we'll only find out when we call botAction. Comprehensive tests will help, but defense in depth tells us not to rely on our tests.
It would be nice if botLogic could only create ActionReferences with specific names. We can do that by using an abstraction that creates ActionReferences for us, checking against a predefined list of functions. This also fixes the switch problem.
class ActionReferences {
  constructor(getActions) {
    this.getActions = getActions;
    this.reference = {};
    // getActions is only called here to enumerate the action names;
    // the actions themselves aren't invoked until doAction runs.
    for (const name of Object.keys(getActions()))
      this.reference[name] = (...args) => new ActionReference(name, ...args);
  }

  doAction(...args) {
    return (result) => {
      if (!(result instanceof ActionReference))
        return;
      const action = this.getActions(...args)[result.name];
      if (!action)
        throw new Error(`Unknown action ${result.name}`);
      return action(...result.args);
    };
  }
}
Now, instead of creating a botAction function by hand, we create a (very readable) list of actions as functions:
const actions = new ActionReferences(res => ({
  async playArtist(artist) {
    await musicService.playArtist(artist);
    res.send(`Playing ${artist}`);
  },
  async playRandom() {
    await musicService.playRandom();
    res.send(`Playing random music`);
  },
}));
Now when botLogic wants to create an ActionReference it does so using actions.reference, e.g.
return actions.reference.playArtist(artistResult[1]);
What's nice about this is that it resembles a normal function call, but instead of executing playArtist, it returns a reference to it.
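To make that concrete, here's a small sketch of what the reference actually holds:

// actions.reference.playArtist never touches musicService; it only wraps
// the action name and arguments in an ActionReference.
const ref = actions.reference.playArtist('artist');
console.log(ref instanceof ActionReference); // true
console.log(ref.name);                       // 'playArtist'
console.log(ref.args);                       // [ 'artist' ]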
Any attempt to reference a nonexistent action will immediately throw an error:
// throws a TypeError: actions.reference has no "goodDog" property,
// so actions.reference.goodDog is undefined and cannot be called
return actions.reference.goodDog();
Note that with suitable type definitions, TypeScript could also catch incorrect action names and argument types at compile time.
Our revised botLogic looks like this:
const botLogic = async req => {
  const artistResult = /play (.*)/i.exec(req.text);
  if (artistResult)
    return actions.reference.playArtist(artistResult[1]);
  const randomResult = /play random/i.test(req.text);
  if (randomResult)
    return actions.reference.playRandom();
}
Our revised bot looks like this:
const bot = (req, res) => botLogic(req).then(result => actions.doAction(res)(result));
Which we can simplify to:
const bot = (req, res) => botLogic(req).then(actions.doAction(res));
Our test for botLogic stays exactly the same, because botLogic is still just returning ActionReferences; it's just creating them a little differently.
We could reuse our botAction test by replacing calls to botAction(res, ...) with actions.doAction(res)(...), and/or we could individually test each action by calling e.g. actions.getActions(res).playArtist('artist').
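As a sketch of the latter approach, an individual action test (assuming the same mocha/chai/sinon setup as before) might look like this:

describe("actions", () => {
  const sandbox = createSandbox();
  afterEach(() => sandbox.restore());

  it("playArtist calls the service and sends a confirmation", async () => {
    const stub = sandbox.stub(musicService, 'playArtist').resolves();
    const actualResponses = [];
    const res = { send: text => actualResponses.push(text) };

    await actions.getActions(res).playArtist('artist');

    expect(actualResponses).deep.equals(['Playing artist']);
    expect(stub.calledWith('artist')).to.be.true;
  });
});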
In this chapter we saw how it's possible to increase the testability of bots via a clean separation of concerns. We created helper classes called ActionReference and ActionReferences, which also make our code more readable.
In the next chapter we'll see other advantages to structuring your application to return a result instead of taking an action directly.