personal web log written by izabeera and dryobates

python testing bdd morelia

Behaviour Driven Development tools in Python

by dryobates

Behaviour Driven Development is a great enhancement of Test Driven Development. It helps to look at a project from the user's perspective, something we often miss as programmers. In my opinion, not all of the tools we have are equally helpful in this.

In the community there are generally two approaches to BDD tools: one type is RSpec-like tools, the other is Cucumber-like tools.

Cucumber-like tools keep the feature description in a separate file that reads like a user story, and functions/methods are marked somehow to match those steps. An example from Ruby's Cucumber [1]:

.. code-block:: cucumber

    Feature: Serve coffee
        Coffee should not be served until paid for
        Coffee should not be served until the button has been pressed
        If there is no coffee left then money should be refunded

      Scenario: Buy last coffee
        Given there are 1 coffees left in the machine
        And I have deposited $1
        When I press the coffee button
        Then I should be served a coffee

In RSpec-like tools the feature description is part of the code. There's no separate plain-text file; instead, the feature description is mixed with the code. An example from Ruby's RSpec [2]:

.. code-block:: ruby

    RSpec.describe Order do
      it "sums the prices of its line items" do
        order = Order.new
        order.add_entry(LineItem.new(:item => Item.new(
          :price => Money.new(1.11, :USD)
        )))
        order.add_entry(LineItem.new(:item => Item.new(
          :price => Money.new(2.22, :USD),
          :quantity => 2
        )))
        expect(order.total).to eq(Money.new(5.55, :USD))
      end
    end

RSpec-like tools are OK when you don't have to interact with non-programmers. But if your product owner is a non-programmer, then it is more difficult to use RSpec as a communication tool, as intended by BDD's authors [3].

On the other hand, with Cucumber-like tools the product owner can easily learn how to write slightly formalized user stories and send you one file for each feature.

Then the only thing you have to do is map steps to methods. Importantly, you don't have to map them verbatim. You can use broader mappings (one method for many steps), as from your perspective different steps may do the same thing.
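To illustrate the idea of broad mappings, here is a toy sketch (not any real library's code) of a regex-based step registry, where one implementation function backs several step phrasings:

```python
import re

# Toy sketch (not any real library's code): a regex-based step registry
# in which one implementation function backs several step phrasings.
STEPS = []

def step(pattern):
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator

# "I click the save button" and "I press the save button" are different
# sentences from the product owner's point of view, but map to one function.
@step(r"I (?:click|press) the (\w+) button")
def press_button(name):
    return "pressed %s" % name

def run_step(text):
    for regex, func in STEPS:
        match = regex.match(text)
        if match:
            return func(*match.groups())
    raise LookupError("no step matches: %r" % text)

print(run_step("I click the save button"))  # pressed save
print(run_step("I press the save button"))  # pressed save
```

All the Cucumber-like tools below let you do something similar with the regular expressions they accept as step patterns.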

When you use RSpec on your own you can fall into another trap. Because the code is so close, you'll start thinking like a developer and try to write steps so that they are a little easier to implement. Then you start doing architecture driven development or something else - not behaviour driven development. That's OK as long as you still do what the product owner really wants.

With a separate text file you focus on working like the role from your feature file:

.. code-block:: cucumber

   Feature: some feature
        In order to <goal>
        As a <role>
        I want <action>

You know that format, don't you? "As a <role>" is the key part. We have to start thinking like the <role>. I don't say that you can't do this with RSpec-like tools, nor that you won't fall into the trap with Cucumber-like tools, but the latter keep my mind more focused on being either a scenario writer or a programmer at any given moment.

OK. So you know that I prefer Cucumber-like tools. The language I use most often is Python. What tools does it offer?

I've found this:

  • Behaviour
  • Freshen
  • Lettuce
  • Behave
  • Morelia


Behaviour

Behaviour offered RSpec-like integration of feature and code, but it looks to be dead. This makes me more confident that the RSpec direction is not the one we should follow.

Lettuce, Behave and Freshen

Freshen, Lettuce and its younger cousin Behave are almost identical from the outside. Like all Cucumber-like tools, they allow you to create a separate file for the feature. To mark a function as a step, they all use decorators.

Example from Freshen's github [4]:

.. code-block:: console

    pip install freshen

.. code-block:: python

    from freshen import *
    from freshen.checks import *

    import calculator

    @Before
    def before(sc):
        scc.calc = calculator.Calculator()
        scc.result = None

    @Given("I have entered (\d+) into the calculator")
    def enter(num):
        scc.calc.push(int(num))

    @When("I press (\w+)")
    def press(button):
        op = getattr(scc.calc, button)
        scc.result = op()

    @Then("the result should be (.*) on the screen")
    def check_result(value):
        assert_equal(str(scc.result), value)

Example from Lettuce's homepage [5]:

.. code-block:: console

    pip install lettuce

.. code-block:: python

    from lettuce import *
    @step('I have the string "(.*)"')
    def have_the_string(step, string):
        world.string = string

    @step('I put it in upper case')
    def i_put_it_in_upper_case(step):
        world.string = world.string.upper()

    @step('I see the string is "(.*)"')
    def see_the_string_is(step, expected):
        assert world.string == expected, \
            "Got %s" % world.string

Example from Behave's documentation [6]:

.. code-block:: console

    pip install behave

.. code-block:: python

    from behave import *

    @given('we have behave installed')
    def step_impl(context):
        pass

    @when('we implement a test')
    def step_impl(context):
        assert True is not False

    @then('behave will test it for us!')
    def step_impl(context):
        assert context.failed is False

As you can see, the youngest brother no longer uses those ugly global variables (world in Lettuce; glc, ftc, scc in Freshen), but it's still quite similar. There are several problems with these libraries.

Integration with frameworks

All those libraries need quite a lot of code to integrate into environments a little more complex than pure Python files. For example, to integrate with Django, Freshen needs the django-sane-testing library [7]. Lettuce has its own application for integrating with Django, which adds a new command (harvest) to Django and the ability to run it [8]. Behave provides recipes for integrating with Django [9]. Freshen is a plugin for nose [10], so if you don't use nose as your test runner, you can't use it.

All of these tools force you to change your testing habits, as they have nothing in common with unittest or doctest. I have no idea what it would be like to integrate them with a framework like Twisted, which, being asynchronous, has special requirements :/

Step declaration

Another problem with Freshen, Lettuce and Behave is the way they require steps to be declared. Again, with a simple setup it works great: create a steps file and put the steps into it. Problems arise when you have a more complex setup. Let's use the Django example again.

Suppose you have two applications. In both applications you write feature files and the steps for them. What will happen when you have a similar step description in both feature files, but need to run a different function for it in each application? Behave would scream at you, saying that you have to change one of the steps! With Freshen and Lettuce it is much worse, as they won't tell you that you have a conflict. They would simply use the first (or last) step found.
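The hazard can be shown with a toy sketch (not any of these libraries' actual code): a step registry keyed by pattern, where a duplicate definition from a second app simply replaces the earlier one with no warning.

```python
# Toy sketch (not any library's actual code) of the silent-conflict hazard:
# a step registry keyed by pattern, with no check for duplicates.
STEPS = {}

def step(pattern):
    def decorator(func):
        STEPS[pattern] = func  # no conflict check -- the last definition wins
        return func
    return decorator

# app_one/steps.py
@step('I log in')
def log_in_app_one():
    return "app_one login"

# app_two/steps.py -- same sentence, different intended behaviour
@step('I log in')
def log_in_app_two():
    return "app_two login"

# Scenarios from *both* applications now silently run app_two's version.
print(STEPS['I log in']())  # app_two login
```

With a list instead of a dict the first definition would win, but the failure mode is the same: nothing tells you that two apps disagree about one sentence.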

I understand the motivation of Freshen's, Lettuce's and Behave's authors. They want to provide some kind of step sharing in order to comply with the DRY rule. In the case of Freshen and Lettuce, that forces me to watch all feature files and steps in my whole project in order not to make a mistake. That is impossible in big projects.

Behave has learned a lesson from its older brothers and is aware of that problem. But it forces me to change steps that were written by the product owner. OK, I could append numbers to the steps in the feature file and explain to the product owner that it's a technical requirement, but I don't think that's the right way.

Integration with CI

This is probably a less important issue, but you have to adjust your CI to read both the output from your regular unit test runner (built-in unittest, nose, py.test, trial or whatever you use) and the custom output from Lettuce or Behave. They both support JUnit XML files, which simplifies the task. But you still can't run your unit tests and BDD tests together.


Morelia

Morelia took a different approach to the problem. The key motivation was not to invent another testing backend, but to use unittest, which is familiar to every Python programmer (and based on the xUnit framework commonly found in many other languages).

Example from Morelia's documentation [11]:

.. code-block:: console

    pip install Morelia

.. code-block:: python

    import os
    import unittest

    from morelia import run

    class CalculatorTestCase(unittest.TestCase):

        def test_addition(self):
            """ Addition feature """
            filename = os.path.join(os.path.dirname(__file__), 'calculator.feature')
            run(filename, self, verbose=True, show_all_missing=True)

        def step_I_have_powered_calculator_on(self):
            r'I have powered calculator on'
            self.stack = []

        def step_I_enter_a_number_into_the_calculator(self, number):
            r'I enter "(\d+)" into the calculator'  # match by regexp
            self.stack.append(int(number))

        def step_I_press_add(self):  # matched by method name
            self.result = sum(self.stack)

        def step_the_result_should_be_on_the_screen(self, number):
            r'the result should be "{number}" on the screen'  # match by format-like string
            self.assertEqual(int(number), self.result)

Morelia uses standard TestCases and lets your preferred test runner do the job. There's no problem with framework integration, as you can use each framework's recommended way of running tests. There's also no problem with sharing steps. If you know how to stay DRY in an object-oriented program (and we all do, because we do it day by day), then the problem is solved. Just call other methods/functions from your steps, or use a mixin/inheritance/composition... Huh. We have a lot of tools in OO to stay DRY :)
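For example, step sharing can be done with a plain mixin. The class and method names below are hypothetical, and the sketch uses plain unittest so it stands on its own, but it is the same OO mechanism a Morelia TestCase can use:

```python
import unittest

# Hypothetical sketch: shared steps live in a mixin, and any TestCase
# (including a Morelia-driven one) can inherit them.
class CalculatorStepsMixin(object):

    def step_I_have_powered_calculator_on(self):
        self.stack = []

    def step_I_enter_a_number(self, number):
        self.stack.append(int(number))

class AdditionTestCase(CalculatorStepsMixin, unittest.TestCase):

    def test_addition(self):
        self.step_I_have_powered_calculator_on()
        self.step_I_enter_a_number("2")
        self.step_I_enter_a_number("3")
        self.assertEqual(sum(self.stack), 5)

class SubtractionTestCase(CalculatorStepsMixin, unittest.TestCase):

    def test_subtraction(self):
        self.step_I_have_powered_calculator_on()
        self.step_I_enter_a_number("5")
        self.step_I_enter_a_number("2")
        self.assertEqual(self.stack[0] - self.stack[1], 3)
```

Two feature files in two applications can reuse the same steps without any global registry, so there is nothing to conflict.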

Of course, there's no problem with CI integration if you can already integrate your current unit tests.

The first time I saw Morelia I knew that its architecture was the key to solving the problem of sharing code in BDD. I'm probably biased towards Morelia, as after I saw how brilliant the idea behind it is, I started hacking on it. It still misses some bells and whistles that the other libraries mentioned here have, but those can be added in the future. Freshen's, Lettuce's and Behave's architecture blocks them from solving the step-sharing problem in an easy way.

Maybe you know some other Behaviour Driven Development tool for Python that doesn't have the limitations mentioned here? I'd be happy to test it.

[3] BDD as communication tool
[7] Integrating Django with Freshen
[8] Integrating Django with Lettuce
[9] Integrating Django with Behave
Jakub Stolarski. Software engineer. I have worked professionally as a programmer since 2005. Speeding up software development with Test Driven Development, task automation and optimization for performance are the things that have focused my mind from my early career up to now. If you ask me for my religion: Python, Vim and FreeBSD are my trinity ;) Email: