What is the difference between a specification and a good specification?

Filed under: BDD, Cucumber, Executable specification, Requirements, — Tags: Behaviour-Driven Development, Gherkin, Good specifications — Thomas Sundberg — 2017-01-30

What makes one specification a bad specification and another specification a good specification? What is the fundamental difference between two specifications?

Software is special

One important difference between physical things and programs is that physical things are complicated to test. Physical things usually need special tools to make it possible to test them automatically. Otherwise they are tested manually.

Manual testing is expensive and a lot of work is done to automate testing. Perhaps even more in the physical world than in the world of computers.

Software is easy to test. It is also cheap to test. Running a test suite doesn't cost a lot of money.

The biggest problem is that so few developers write testable code. Writing testable code is hard, and it is even harder if you start from the wrong end and add the tests last. Writing testable code is easier if you work test first. Often really easy. Unfortunately, it is still a skill many developers have yet to learn.

A good specification, however, will support writing testable code and will act as an acceptance test.

Easy to understand

A good specification is a specification that isn't ambiguous. It is also inclusive. Anyone who can understand the problem should be able to read the specification and say "Yeah, that's about right".

Being inclusive also implies that the reader should not have to learn a special skill, such as programming, to be able to understand the specification.

The language must be as natural as possible. The room for interpretation must be so small that it really doesn't exist. Concrete examples are nice because if they are valid, you can't argue with them. A good specification contains good examples that illustrate what should work when we are done implementing something.

Automatable

A good specification can be executed. If the execution passes, then you can assume that the specified and wanted behaviour is implemented and works.

A programming language can be used to specify the behaviour of a program in such a way that it is executable. But a programming language is not inclusive. Just because you can read natural language doesn't mean that you can read the code for a computer program.

Reading and understanding a programming language is a skill most people don't have. It is hard to read and understand a program where the core intent may be hidden behind a lot of details.

This means that we would like to have a format that anyone can read and understand. Most people are able to understand short examples. If we can specify the wanted behaviour using examples that are easy to understand, then we will be able to include many more people than if we used a programming language. We want something that is more inclusive while still being executable.

Gherkin

A formal language called Gherkin fulfills both requirements of a good specification:

Inclusive - everyone can read and understand examples
Executable - it is easy to connect examples written in Gherkin to code in a programming language and execute them

Good specifications can be expressed using Gherkin.

Gherkin examples

An example of Gherkin may look like this:

Feature: Refund item

  Scenario: Jeff returns a faulty microwave
    Given Jeff has bought a microwave for $100
    And he has a receipt
    When he returns the microwave
    Then Jeff should be refunded $100

It follows a strict formal format and is possible to execute. But it is also human readable. I won't describe it, you can read it for yourself.

This example has a few nice properties.

It exemplifies a rule: Jeff is allowed to return the microwave. The rule as such isn't expressed, but there is an example of a situation where a return is allowed.
It doesn't give away any implementation details, such as whether this is a web application or not. You can't tell from the example.

This example alone is not sufficient for creating a complete system for returning goods. But it exemplifies one thing that should work.

Another example of Gherkin may look like this:

Feature: Search feature for users
  This feature is very important because it will allow users to filter products

  Scenario: When a user searches, without spelling mistake, for a product name present in inventory. All the products with similar name should be displayed

    Given User is on the main page of www.shop.com
    When User searches for phones
    Then should the search page should be updated with the lists of phones

Is this a good or bad example?

I think it is a bad example. It is bad because

The scenario headline is long and rambling. It doesn't even fit my screen. I have to scroll to read it.
It is generic, it talks about User. It is probably a user that uses the application, but it is much nicer to talk about a person. Someone you can picture in front of you. Or create an image of and post on the team wall.
It tells me about the technology. I can understand that this is a web application. That information is uninteresting if I want to understand how the system is supposed to work. It is easy to get lost among technical details about a web application when the interesting question is "Is this example valid?".
It has, maybe, incidental details. Is it important whether the spelling is correct or not? What happens if the search term is misspelled? Should no result be returned? Do we need to implement a spellchecker to know when the search word is spelled correctly? I don't think the spelling matters. And yet it is mentioned.
I wouldn't choose the Feature description that states "This feature is very important because it will allow users to filter products". The feature as such has been created, so it is obvious to me that it is important. It wouldn't exist if it wasn't important. This is perhaps not the main reason why I think this is a bad example, it just adds to my opinion.
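
As a sketch, the search example could be rewritten along these lines. Maria, the product names, and the decision that a microwave must not show up in the result are my assumptions for illustration. They are not taken from a real system, and a real team would pick its own examples.

```gherkin
Feature: Product search

  Scenario: Maria searches for phones
    Given the shop sells the phones "Pixel" and "Lumia"
    And the shop sells the microwave "Heatwave"
    When Maria searches for "phones"
    Then she should find "Pixel" and "Lumia"
    But she should not find "Heatwave"
```

The headline is short, there is a named person instead of a generic User, nothing reveals whether this is a web application, and spelling isn't mentioned because it doesn't matter for this example.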

Good Gherkin and bad Gherkin

Gherkin is a nice, formal language that can be used to create good specifications. But as with any tool, it is possible to use it badly. It is possible to create really bad specifications using Gherkin. The difference between good and bad Gherkin is in the details.

Both examples above follow the same format. They are both relatively short, one is four lines long and the other is three. Clearly you can't use the length to determine if an example is good or bad. That is at least the case with short examples. Long examples, longer than say 5 - 7 lines of Gherkin, are usually not good because they are unfocused and contain many details about the execution rather than the desired behaviour.

One difference is that the first example is more concrete than the last example. The last example is a bit generic and generic is bad in this domain. Examples should be very concrete. Examples that are too generic make bad specifications.

An example of bad Gherkin can be when you specify a web application and the examples you create talk about details seen on the screen rather than the desired behaviour. Screen details are important, but talking about a specific button or link does not describe the behaviour the user should experience. We don't see that in the web example above. But it would be easy to add and it would make the example worse.

The user is interested in carrying out a specific task, not navigating a web application. The example must therefore talk about the goal of the user and the business rules that can be applied. An example may be a user who wants to return an item they bought. There are rules regarding item returns that the application must support. The customer must be able to present a receipt. The item must not have been purchased too long ago. These are examples of rules that are important and will remain the same whether the application is a web application or a manual process in a store. Talking about screen details will obfuscate the important behaviour and shift the focus to implementation details.
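
The rule about the age of a purchase could, for instance, be exemplified like this. The time span of one year and the refusal of the refund are assumptions I made up for the sketch. The actual limit is a business decision.

```gherkin
Feature: Refund item

  Scenario: Jeff returns a microwave he bought too long ago
    Given Jeff bought a microwave for $100 a year ago
    And he has a receipt
    When he returns the microwave
    Then Jeff should not be refunded
```

Nothing in this example says whether Jeff is standing at a counter or filling in a web form. The rule is the same in both cases.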

Trigger discussions

Examples may also trigger interesting questions such as: "How should the system behave when a customer wants to return an item but doesn't have a receipt?" Is it possible? It might be possible under some circumstances.

Good examples trigger interesting discussions about the behaviour the system should support. It is easy to lose sight of the behaviour when navigation details are discussed.
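
Writing the question down as an example forces an answer. The sketch below assumes one possible answer, that no refund is given. That outcome is my assumption, not a rule stated anywhere. The real answer has to come from the discussion with the people who know the business.

```gherkin
Feature: Refund item

  Scenario: Jeff returns a microwave without a receipt
    Given Jeff has bought a microwave for $100
    And he has lost his receipt
    When he returns the microwave
    Then Jeff should not be refunded
```

If somebody objects to the last line, the example has done its job. It triggered the discussion.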

Remember, software development is about learning. One way of learning is discussing a problem. Examples may be the best way to bring down the discussion from a very high altitude where it is easy to talk about generic behaviour and therefore be ambiguous.

Conclusion

A good specification is

Easy to understand
Executable
Acting as an acceptance test

Expressing examples using Gherkin doesn't make the examples good specifications. It is possible to be too generic when using Gherkin and therefore miss the opportunity to create a good, executable, specification.

Acknowledgements

I would like to thank Malin Ekholm and Alex Bolboaca for feedback and proofreading.

Resources

Gherkin - a language used to document behaviour
Cucumber Anti-Patterns
My other posts about Behaviour-Driven Development
Thomas Sundberg - author
