Tuesday 14 May 2019

The hidden features of Feature teams

I while back I used the metaphor of booking a meeting to describe some of the problems with planning and priorities in development teams which are looking after discrete areas of a large system, which are typically called platforms.

An alternative approach is to use feature teams, which allow us to comprise a team of everyone needed to deliver a feature. You can immediately see, we don't have as much co-ordination overhead because a single team can deliver a feature end to end. We just have to task a team to deliver something and they can, making it easier to prioritise.

There are several flavours of feature teams but it really boils down to 2 types - permanent or prioritised. Permanent is exactly that, the feature team are permanent and have a stable number of permanent people. Prioritised is a team formed to solve a particular problem which are then disbanded after they have delivered the feature.

From my experience, it is when we look at feature teams over a longer term so we start to see some problems and they are pretty difficult to solve.

In a prioritised team, the immediate problem is with how people work together. It is true we often see a bounce when we form teams as we are invigorated with a new challenge. We also know this does not last forever and we know teams go through a storming phase when they are new. It takes time to get used to working with new people before we can form ways of working and finally reach our collective potential.

Depending on the length of engagement of a feature we should be sensitive to this and what the people in that team are going through. They should be fully bought into joining and working in a feature team and ideally volunteered rather than were placed. This might be difficult to achieve in some organisations and we should expect some people will not be a good fit for prioritised feature teams.

It is a safe assumption that features cut across multiple systems so these people would be touching multiple systems in order to deliver the feature end to end. You can read of the many accounts of how this can be handled and it really comes down to disciplined engineers doing things 'the right way' for their context.

I know I have seen many of these in my career but I have seen more engineers who would also start to change systems outside of the feature requirements because they did not like it or are acting altruistically and leaving it 'a better place' (Boy Scout Principle). Getting the balance is incredibly hard which is why I think the success of changing ownership from discreet systems to features is  more to do with people than process.

We have to remember that software engineering is incredibly subjective so there is no one way to solve problems and coming to a consensus across 10s of engineers is hard. It's even harder with 100s of engineers, no matter how well meaning they are.

So we can mitigate some of these problems with a permanent feature team, which we can help through the forming stage and stabilise only once. They can form their own arrangements and start building things pretty effectively, getting that balance of engineering practices through evolving agreements.

There are many upsides to feature teams from a development perspective as they allow people to have exposure to multiple systems and architectures and work on differing problems which can often become stale in platforms. The counter is that without ownership you need some pretty well developed disciplines in your engineering teams that will prevent systems rotting from being constantly adapted to meet the requirements of features.

You would also need to be much better at moving people around since features typically need different systems to be delivered. So the permanent team probably is not so permanent unless you expect the team to learn the systems as they go, which might be going against the improvements in delivery you expect from less coordination.

It's operationally where I see many cracks and I call this the '2am problem'. Put simply, who do we call at 2am if there is a problem? The prioritised feature team may not be around anymore and although that problem is solved with a permanent feature team it may not be clear which team needs to be involved since you would need deep system knowledge in order to link the fault with the feature that requires it.

If we contrast with a platform team for a second. If a system had a problem, there is a clear owner for that system and that is who we call. In a feature team world, multiple features may have altered a system and when a fault occurs it might be quite difficult to know which team has the knowledge to resolve the issue.

This actually looks worse over time. With a permanent feature team delivering multiple features touching many systems they are now having to support those features as an ongoing concern if we are to preserve the very sensible mentality that the teams that develop the systems should also monitor and maintain them. Think also what this looks like to new members of the team where they have to learn how multiple features work end to end, which could well be more difficult than learning how a discreet system works and how that fits in with other systems.

You should also expect that any sort of centralisation will also be impacted as you need to allow feature teams to find their own path. If you start to force centralisation on feature teams then you again suffer from coordination problems which will slow them down again. We should expect and maybe even embrace duplication of effort as the same problems are solved several times over. To be fair this is something we also struggle with in platform teams too but I would expect it to be more pronounced in feature teams since they work across multiple systems.

A coping mechanism would be a centralised support function that we hand over support to. This feels like we are crashing out of devops thinking entirely which we have seen to improve stability and time to recovery of systems since they emphasise ownership and responsibility. It also hides the true cost as we burn time in handovers, documentation and operational procedures which extend out of the development cycle - essentially, you pay for it in the long term (OPEX vs CAPEX)

This feels like I am not a fan of feature teams, which is not true. They solve many of the coordination problems we see with platform teams but they introduce other problems. This is what forms the majority of my learning with scaled systems - you cannot win. With every change, you introduce some problems which you need to be aware of. Ultimately we have to decide which problems we would prefer to solve.