In this article, I will give a glimpse of how Behavior Driven Development (BDD) changed the way I approach products as a Data Scientist. I will also explain what this methodology is about and why developers should care more about it.
1. Why Behavior Driven Development?
1.a Quick overview of Project Management concepts
The Waterfall model has long been used, and is still used today, to run new projects. This project management paradigm follows a linear approach: the project is split into specific steps that are completed sequentially.
In Waterfall, each step is completed before the next one begins:
- System: Creating the product requirement document.
- Analysis: Building models, schemas and business rules.
- Program Design: Building the project architecture.
- Coding: Developing the product in a technical way.
- Testing: Checking that the product contains no defects.
- Operations: Maintenance of the system.
This methodology does not allow going back to a previous step. It is frequently criticized for its lack of adaptability, since it does not readily accommodate deviations from or modifications to the initial plan. For more information, you can read this chapter from Scaling Software Agility: Best Practices for Large Enterprises, Dean Leffingwell, 2007.
To make project management more flexible and shorten the feedback loop, we can reorder certain steps of the process so that they form a cycle instead of a linear progression. Behavior Driven Development (Introduction to BDD, Daniel Terhorst-North, 2006) comes from this idea.
1.b Reducing the feedback loop: from Waterfall to Behavior Driven Development
Behavior Driven Development is the result of successive improvements in project management.
First, the Testing step was moved before the Coding step, which led to “Test-First Programming” and helped developers avoid introducing bugs during the Coding step. Then Design was added to the loop, which led to “Test Driven Development”, a well-known software paradigm that helps developers avoid over-engineering their designs.
These changes were tech-centered and only concerned steps in which developers alone were involved. The next revolution added the Analysis step to the loop, leading to BDD. This allows project stakeholders to ensure that the code being written matches the product requirements. By focusing on “What?” rather than “How?”, we steer the conversation toward the value we bring to the end user of our product instead of discussing technical aspects. Indeed, most of the time, not everyone understands the technical aspects well, and they are likely to change. It is therefore preferable to address them later in the process.
2. What is the heart of Behavior Driven Development?
2.a Key concepts of Behavior Driven Development
BDD revolves around three key concepts: collaboration, clear scenarios, and the Given When Then structure.
Collaboration is essential for successful project completion. Allowing product owners, testers, developers and end users to communicate clearly encourages the team to focus on the desired outcome rather than on technical implementation details. Thus, we create an environment where it is more natural to ask “What do we want?” than “How are we going to add this new feature to the code?”.
Scenarios should be written in a clear and concise manner. They focus on the desired outcomes rather than on how the system should achieve them. The language used should be easily understandable and describe the expected behavior of the system without detailing how it should be accomplished. Remember: the emphasis is on asking “What?” instead of “How?”.
- The Given When Then Structure
This structure provides a framework for the scenario. The "Given" portion of the scenario provides context or prerequisites necessary for the scenario to take place. The "When" section describes the action that will be undertaken by the product user. Finally, the "Then" part outlines the expected outcome after the user has completed their action.
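To make these three parts concrete, here is a minimal sketch in Python, based on the kind of banking transfer discussed later in this article. The `Account` class and the `transfer` function are hypothetical names that I introduce only for illustration; they do not come from any library. The comments show where each part of the structure would sit in an ordinary test.

```python
from dataclasses import dataclass


@dataclass
class Account:
    """Hypothetical account holding a balance, for illustration only."""
    balance: int


def transfer(source: Account, target: Account, amount: int) -> None:
    """Move `amount` from `source` to `target`, refusing overdrafts."""
    if amount > source.balance:
        raise ValueError("insufficient funds")
    source.balance -= amount
    target.balance += amount


def test_transfer_moves_money() -> None:
    # Given: the context the scenario starts from
    alice, bob = Account(balance=100), Account(balance=20)
    # When: the action taken by the user
    transfer(alice, bob, 30)
    # Then: the expected outcome
    assert alice.balance == 70
    assert bob.balance == 50


test_transfer_moves_money()
```

Note that the test says nothing about how balances are stored or updated: it only states what the user should observe.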
2.b Behavior Driven Development: usage inside a project
Understanding BDD concepts is essential if you want to practice it; however, one may still wonder how to apply them within a project.
The diagram above illustrates the workflow of a project that follows the Behavior Driven Development method. Each time a new product requirement is detected among users of the system, it goes through this workflow.
Product definition & Collaboration
For example, when developing a banking application that lets users transfer money between accounts, a new feature may be requested whereby users are able to make money transfers from their accounts using bank details. Following this, the stakeholders of the project collaborate to produce acceptance criteria in BDD form, that is, scenarios.
It is important to note that scenarios only provide an example of how the application can be used by the end user. They do not describe the technical steps necessary to implement a new feature. Again, focus on “What?” rather than “How?”. The purpose of this session involving Product Owners, Developers, and Testers is to identify how the team can add value to the end user by addressing one of their requirements.
From scenarios to tests
After writing scenarios, developers use them to write tests. Let’s take the example of an application built in Python. The idea here is to use a language such as Gherkin to write scenarios inside .feature files, and a library such as behave to read these files and use pattern matching to transform scenarios into tests.
With the above example (Figure 4), let’s say that we have a database containing the amount of money held by users along with their account information. An adaptation of the scenario above into a feature file could be:
Python code would then read this file, parse it, and apply pattern matching to turn it into a test. For more information about this step, please refer to the behave documentation.
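To give an idea of what this pattern matching involves, here is a minimal sketch that deliberately does not use behave. It only mimics the principle with the standard library: each step sentence is matched against a regular expression, and the captured groups are passed to a Python function. The scenario text, the `step` decorator and the `run_scenario` function are my own hypothetical names and content, written for illustration; they are not the feature file of this project nor behave's actual implementation.

```python
import re

# Hypothetical scenario, written in a Gherkin-like syntax for illustration.
FEATURE = """
Scenario: Transfer money to another account
  Given Alice has 100 euros on her account
  And Bob has 20 euros on his account
  When Alice transfers 30 euros to Bob
  Then Alice should have 70 euros
  And Bob should have 50 euros
"""

STEPS = []  # list of (compiled pattern, step function) pairs


def step(pattern):
    """Register a function as the implementation of matching sentences."""
    def decorator(func):
        STEPS.append((re.compile(pattern), func))
        return func
    return decorator


@step(r"(\w+) has (\d+) euros on (?:her|his) account")
def set_balance(context, name, amount):
    context[name] = int(amount)


@step(r"(\w+) transfers (\d+) euros to (\w+)")
def do_transfer(context, sender, amount, receiver):
    context[sender] -= int(amount)
    context[receiver] += int(amount)


@step(r"(\w+) should have (\d+) euros")
def check_balance(context, name, amount):
    assert context[name] == int(amount), f"{name} has {context[name]} euros"


def run_scenario(text):
    """Run each Given/When/Then/And line through the registered steps."""
    context = {}
    for line in text.strip().splitlines():
        keyword, _, sentence = line.strip().partition(" ")
        if keyword not in {"Given", "When", "Then", "And"}:
            continue  # skip the "Scenario:" title line
        for pattern, func in STEPS:
            match = pattern.fullmatch(sentence)
            if match:
                func(context, *match.groups())
                break
        else:
            raise LookupError(f"no step matches: {sentence!r}")
    return context


final_balances = run_scenario(FEATURE)
```

With behave, the same idea is expressed with its `@given`, `@when` and `@then` decorators, and the library takes care of reading the .feature files and running the matching steps.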
Finally comes the validation step of the workflow. If the team does not validate the feature, it triggers a new session of acceptance criteria definition to better frame the new feature. If the team validates it, the feature is pushed into production and documented.
3. Behavior Driven Development in data projects
As a data engineer and data scientist, I have observed that Behavior Driven Development is more straightforward to implement in web or front-end application projects than in data or AI projects. Indeed, when it comes to defining a user story, describing the behavior of a user on a front-end app seems more direct than describing the behavior of a data ingestion pipeline or an AI. Still, I found that asking three specific questions makes it easier to apply this methodology to data-driven projects.
3.a What product do I deliver?
Let’s take the example of a decision-support dashboard that displays analytics and statistics to the salespeople of a company for marketing purposes. What the front team will build is the dashboard visuals and functionalities such as exporting data, filtering, and creating reports. What the data team will deliver, however, is a data mart that supplies the front team with data and structures it for efficient querying.
3.b Who do I deliver my product to?
For the front team, it is clear that the end users, that is to say the salespeople, are the ones who will use the developed product. The data team, on the other hand, will deliver its product, the data mart, to the front team. Data Analysts, who might have a deeper understanding of the data, could also assess the quality of the product. Therefore, the stakeholders involved in the BDD rituals are different for the two teams.
3.c Who is making decisions that will impact my product and will write my scenarios?
While the front team will receive specifications from the product owners and end users, for instance to change the color of a certain graph or add a filter, the data team will receive them from the front team and data analysts, regarding the data model or query performance.
Therefore, while the front team's scenarios will be more domain driven, the data team's might be more tech driven. One example of a scenario flow could be:
- The end user identifies the need for a new filter: As a user, I want to be able to filter my charts on a specific country.
- The front team and data analysts interpret this as a data need, which translates to: As a front developer, I have a country column in my KPI table.
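The data team's scenario above could be turned into a test along these lines. This is only a sketch under assumptions: I use an in-memory SQLite database as a stand-in for the data mart, and the table name `kpi`, its columns and the helper `column_names` are hypothetical names chosen for illustration.

```python
import sqlite3


def column_names(connection, table):
    """Return the column names of `table` using SQLite's table_info pragma."""
    rows = connection.execute(f"PRAGMA table_info({table})").fetchall()
    return [row[1] for row in rows]  # the second field of each row is the name


def test_kpi_table_has_country_column():
    # Given a data mart containing the KPI table (in-memory stand-in)
    connection = sqlite3.connect(":memory:")
    connection.execute(
        "CREATE TABLE kpi (country TEXT, month TEXT, revenue REAL)"
    )
    # When the front developer inspects the table schema
    columns = column_names(connection, "kpi")
    # Then a country column is available for filtering
    assert "country" in columns
    connection.close()


test_kpi_table_has_country_column()
```

The scenario still describes what the front team expects from the data mart, not how the pipeline produces the column.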
For me, BDD is a good methodological framework to apply to a project if you want to get closer to the domain you are working on. While I was at first convinced that it was a front-driven methodology, I now feel that it clearly extends to other fields of development.
To go further
In this article, I did not mention any specific method for writing tests. As there are plenty of ways to test code, I will let you explore this vast world, but I recommend this article from our blog in which we dig into Property Based Testing.
For data engineers like me who did not know how to write proper tests for code using libraries such as PySpark, I also recommend this quick read.
Sicara is a company firmly rooted in methodology and technology. If you want to learn more about us, please feel free to contact us here.