Lean software development, using lead and cycle time to drive business change
In the previous article I stated that Lean manufacturing methods cannot be applied literally to software engineering. If they cannot be applied literally, then let’s explore how they can be applied laterally.
Goals are still the same
Your business goals are still the same as in manufacturing. Your business cares about delivering quality, innovative solutions to the customer as fast as possible. Speed is typically measured in lead time and cycle time. Lead time is a customer-centric metric, while cycle time is an internal process metric. Lead time measures how long a customer has to wait to get something, and cycle time measures how long a discrete unit of work takes to complete.
When I am talking about lead time, I am not talking about the speed of a single department getting work from point A to B. I am talking about getting from customer commitment (point A) to customers receiving their functionality and using it (point Z). In this section we will explore these two metrics.
Cycle time
Let’s go back to car manufacturing, where I mentioned Takt time. Cycle time is different from Takt time. Takt time tells you how much time you have to complete a discrete unit of work given the amount of demand. Cycle time tells you how much time you actually used to perform a unit of work. Cycle time can be measured at every level. What does that mean?
Let’s say you have entered a coffee shop and have finally reached the till to order your drink. Cycle time starts the moment the barista smiles (hopefully) and asks for your coffee order, and it ends when they shout your order or name. The time taken to make your drink is the cycle time. You can also measure cycle time at lower levels: how long did it take to take your order, how long did it take to make your coffee, and what was the overall cycle time?
Coming back to the world of software engineering, feature requests have an overall cycle time, i.e. the days or hours needed to complete them. This is measured from when someone starts to work on the feature request to when the feature request is deployed. The individual tasks needed to complete a feature request also have a cycle time. Task cycle time is important because it ultimately dictates the overall development cycle time of the feature request.
Lead time
This is the most confusing one when it comes to knowledge work. Lead time is how long work has taken from the moment it was committed to the queue. In other words, it is the overall time that work has spent in the system since it was committed.
Let’s go back to the coffee shop example again. When you enter the coffee shop you commit yourself to the queue. The overall time from walking in through the door to the moment you walk out with a nice flat white is the lead time.
You might be thinking: well, there will be a lot of waiting around before I get to the counter and my cycle time starts! You are right. The cool thing is that in “standard work” you can predict how long you will wait in the queue, because there is an average cycle time. Let’s say it takes 1 minute, give or take, to make a cup of joe. If there are 4 people ahead of you (that is, 4 people are work in progress), you know that you will get your coffee in about (4 + 1) × 1 minute, where the +1 is for you. So on average you will wait about 5 minutes.
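The coffee queue arithmetic can be sketched in a few lines of Python. This is only an illustration: the function name is made up, and it assumes a single barista serving one order at a time at a steady average pace.

```python
def expected_lead_time(people_ahead: int, avg_cycle_minutes: float) -> float:
    """Estimate the minutes until you get your order in a steady queue.

    Assumes one barista who serves one order at a time (an assumption,
    not something every coffee shop satisfies).
    """
    # Everyone ahead of you, plus your own order, each takes the average cycle time.
    return (people_ahead + 1) * avg_cycle_minutes


# 4 people ahead, 1 minute per coffee
print(expected_lead_time(people_ahead=4, avg_cycle_minutes=1.0))  # 5.0
```

The estimate is only as good as the average. It works here because making a coffee is standard work with little variation.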
Going back to software engineering. Work normally gets committed in the “Sprint”, so that is when the lead time clock actually starts. If you are using Kanban and work does not get swapped around all the time (a stable queue), then lead time is measured from the moment the work is committed to your Kanban board.
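To make the two clocks concrete, here is a minimal sketch that derives both metrics from three timestamps on a single feature request. The function name and the dates are invented for illustration, and deployment is used as a stand-in for "the customer is using it", which is a simplification.

```python
from datetime import datetime


def lead_and_cycle_time(committed, started, deployed):
    """Return (lead_time, cycle_time) as timedeltas for one feature request."""
    lead_time = deployed - committed  # customer's clock: commitment to delivery
    cycle_time = deployed - started   # team's clock: work started to deployed
    return lead_time, cycle_time


# Hypothetical timestamps for one feature request
committed = datetime(2024, 3, 4, 9, 0)   # committed to the Sprint or Kanban board
started = datetime(2024, 3, 6, 9, 0)     # someone picks it up
deployed = datetime(2024, 3, 8, 17, 0)   # shipped

lead, cycle = lead_and_cycle_time(committed, started, deployed)
print("lead time:", lead)
print("cycle time:", cycle)
```

The gap between the two values is the time the request sat in the queue before anyone touched it.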
The issue is that in software engineering cycle time is volatile, which makes work forecasting less predictable. Figure 1 shows that ~25% of feature requests were delivered within 2.5 days, ~50% within 7 days and ~75% within 10 days, while the last ~25% took up to 15 days. So it seems that craft work can’t be standardised into an average unit of time the way standard work can (figure 2), and this is why Little’s Law can’t be used out of the box.
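One way to look at this kind of spread is to report quartiles rather than a single average. The sketch below uses Python's standard `statistics` module. The cycle times are made-up numbers, not the data behind figure 1.

```python
import statistics

# Hypothetical cycle times in days for completed feature requests (made-up data)
cycle_times_days = [1, 2, 2.5, 3, 5, 7, 7, 8, 10, 10, 12, 15]

# quantiles(..., n=4) returns the three cut points between the four quarters
q1, median, q3 = statistics.quantiles(cycle_times_days, n=4)
mean = statistics.mean(cycle_times_days)

print(f"25% done within {q1} days, 50% within {median}, 75% within {q3}")
print(f"mean {mean:.1f} days, slowest {max(cycle_times_days)} days")
```

When the quartiles sit far apart, a forecast built on the mean alone will be wrong for many individual requests.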
Craft production team
Three factors make up your individual work in a team:
- Wait Time: This is when you are waiting for knowledge that you don’t have, for decisions that you can’t make, or for someone else to complete some work before you can start yours.
- Disruption Time: This is when you have to expedite some work, rework some work, deal with corporate interruptions or cope with the mental health impact.
- Task Time: Finally, this is the actual work that you are doing, purely sitting down and getting things done.
Imagine you are working alone on your own start-up. You will have very little wait and disruption time. You are on your own, so you can make all of the decisions. Also, if you are lucky enough to work in a quiet environment, you should experience very few or no disruptions. You get things done fast, your users are impressed with your company, and new features come out all the time. In this case you are in the “your individual work” scenario.
This changes the moment you hire your first employee. The moment you do that, you create an organisation, which means you have created a system. In a system, work no longer gets done by a single individual; it gets done by many individuals. You as the founder are unlikely to feel much impact from hiring this new person (apart from the knowledge transfer burden), but if you are not careful your new employee will have to wait for your decisions, knowledge and task allocation. Their wait time will grow as they wait for you, and they will probably be disrupted by you. You will wonder why they are not as productive as you. It might be because they don’t have enough autonomy to make decisions: they don’t know your values, so they don’t know what decisions to make on your behalf. Or maybe they are not getting enough clarity about the desired outcomes. Most people are not founders, they are employees, and sometimes they don’t see the “decision making framework” that founders use.
Siloed craft production teams
Now imagine that your company grew too quickly and ended up not following any agile process; instead it has departments of people per discipline. So web devs are in one department, API devs are in another, you get the point. Each department will have its own backlog, which means each has its own lead time. On top of that, all individuals will experience disruptions (team meetings, urgent requests, you know the drill) and there will be many handovers from one department to another. Work will also end up travelling backwards due to misunderstandings. Some people would call this a waterfall organisation, and it looks something like this:
If a customer has requested “Hot Feature A”, they will have to wait a long time for this work to travel through this type of organisation. The actual task time for “Hot Feature A” might be 12 hours of work in total, but given all of the wait time and disruptions it might take up to 1 month before it gets shipped. There is a big difference between a 1 month lead time and 12 hours of task time. Your customer will not care about the 12 hours of task time; they will only care that your lead time was 1 month. Overall, in this type of organisation lead time for most work will be very high, fewer projects will be shipped, projects will very rarely go out on time, and individuals will feel frustrated as there will be a lot of firefighting.
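The gap between task time and lead time can be put into a single number. The sketch below is a rough illustration: the function name is hypothetical, and "1 month" is assumed to mean 30 calendar days of 24 hours, which the example above does not specify.

```python
def task_time_share(task_hours: float, lead_time_hours: float) -> float:
    """Fraction of the lead time that was spent on actual task work."""
    return task_hours / lead_time_hours


# Assumption: "1 month" is taken as 30 calendar days of 24 hours each.
lead_time_hours = 30 * 24
share = task_time_share(task_hours=12, lead_time_hours=lead_time_hours)

print(f"{share:.1%} of the lead time was task time")
print(f"{1 - share:.1%} was wait and disruption time")
```

Under these assumptions less than 2% of the month was spent working on the feature. The rest was waiting and disruption.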
Value focused craft production teams
Now imagine that your company’s founder understood the importance of lead and cycle time and worked to remove as much wait, disruption and task time as possible from the overall delivery process.
They have decided to sit people together for a finite amount of time to deliver certain features and projects. They have done this to remove handovers, reduce the amount of project management required, and eliminate competing agendas, waiting for decisions and knowledge, and organisational dependencies. These people work as a team on one story at a time (as much as possible), and their main job is to push that one story through the system as fast as possible.
The story that took 1 month to deliver will, in this new system, be delivered in 12 hours or even less. This is because you have removed all of the waiting around and the disruptions (team leads and product owners act as defenders). And because this team is focused together, they can expose the unknowns faster, tame complexity, share their experience and share the burden of the work, so they can deliver the work faster. This structure would look something like this:
Conclusion and implications
Now we know that lead and cycle time consist of wait, disruption and task time, and that all of these factors get exacerbated by departmental teamwork.
Lead and cycle time give us something that agile methods on their own don’t: the ability to run experiments and measure actual productivity results. It is possible to form a hypothesis, let’s say that “writing unit tests will reduce lead time but will increase cycle time”, and actually test this statement.
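A hypothesis like the unit test one could be checked with a simple before and after comparison. The sketch below compares medians, which is a deliberately crude measure. The function name and all of the numbers are invented for illustration and are not real measurements.

```python
from statistics import median


def median_change(before, after):
    """Change in the median (after minus before); a negative value means faster."""
    return median(after) - median(before)


# Hypothetical measurements in days, invented for illustration only
lead_before, lead_after = [12, 15, 9, 20, 14], [10, 11, 8, 13, 12]
cycle_before, cycle_after = [4, 5, 3, 6, 4], [5, 6, 4, 6, 5]

print("lead time change (days):", median_change(lead_before, lead_after))
print("cycle time change (days):", median_change(cycle_before, cycle_after))
```

In this made-up data the median lead time drops while the median cycle time rises, which is the pattern the hypothesis predicts. With real data you would want enough samples to tell a genuine change from normal variation.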
By using lead and cycle time it is even possible to logically test the impact of a proposed change and avert bad management decisions. If a change cannot be logically tested, then at least you will be able to measure the impact when the change goes wrong. Lead and cycle time give managers a decision-making framework for process changes.