As an engineer, I’ve worked for a while at high-tech giants such as Microsoft and Amazon. I have also worked a lot with SMBs.
The first thing I noticed when I joined those big structures is how much slower big companies are than small ones when it comes to shipping software.
After 5 years of observation, I think I can explain it with several reasons. First, the business often runs at a larger scale, which forces the technical bar to be higher: going through an operational readiness review at Amazon Web Services or a Gold Build Image Readiness at Microsoft Windows is definitely a tough process.
Second, the technical requirements often run at a larger scale too. Windows has 100+ million users, and AWS runs huge internet websites.
Which means the risk for the business and the customers is potentially a lot higher. Bigger risk means bigger mitigation mechanisms.
But that’s not completely true! It can’t be explained by scale alone. If scale and the risk for customers explained everything, then how come:
- WhatsApp had only 32 engineers for 900 million users
- Instagram had 13 employees for 30 million users
- Git was implemented by one man in 10 days (yeah, OK, fine, he is a genius, he doesn’t count)
- We could also talk about Docker, Minecraft, Buffer, etc.
There are examples of small teams that deliver high impact at high business and technical scale.
Why are big companies fat and slow, if it is not only because of scale?
The Mythical Man-Month is a famous book that explains why nine women can’t make a baby in one month.
“Brooks’ law is a claim about software project management according to which ‘adding manpower to a late software project makes it later’. It was coined by Fred Brooks in his 1975 book The Mythical Man-Month.”
Adding more people means more communication and more processes. The number of possible communication channels grows as n(n - 1)/2, with n = the number of engineers, so it grows much faster than the team itself.
At some point, you will reach a size where your velocity will not only plateau, but also start degrading.
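To make the communication overhead concrete, here is a minimal sketch that counts the possible pairwise channels in a team. It assumes the worst case where every engineer may need to talk to every other engineer, which gives n(n - 1)/2 channels. The function name `communication_channels` and the sample team sizes are mine, for illustration only.

```python
def communication_channels(n: int) -> int:
    """Count the distinct pairs among n engineers: n * (n - 1) / 2.

    Assumption: every engineer may need to talk to every other one.
    """
    if n < 0:
        raise ValueError("team size must be non-negative")
    return n * (n - 1) // 2


# Hypothetical team sizes, chosen only to show how fast the count grows.
for size in (5, 10, 30, 100, 500):
    channels = communication_channels(size)
    print(f"{size:>3} engineers -> {channels:>6} possible channels")
```

Under that assumption, a team of 5 has 10 possible channels, a team of 30 has 435, and a team of 500 has 124,750. Real organizations prune most of those channels with hierarchy and process, which is exactly the overhead I am talking about.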
Not only does communication increase drastically, but your system also becomes more complex, because it follows your organization’s structure. This is Conway’s law:
“Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”
To keep ownership clear, you will end up with complex data flows that mirror your fat organization.
Usually we can identify 3 steps in a team that tries to scale:
First step: the honeymoon. The more people you add, the more results you see. At first, we have one developer building her stuff alone, then we start to add a few more to help her. At the beginning, we don’t see a huge improvement in velocity because we now need processes and communication, but still, as we add people, we can see that it helps us ship software faster.
Second step: we plateau. We grew so much that we now need to coordinate between big teams. Usually this is when you hear your CTO saying words like “Federation”, “Microservices” or “Platform” in order to move faster. And this is where my point is: she might be right, or she might be wrong.
Third step: 2 possible results. Either the organization is federated so well that we now innovate at a higher pace, or the organization is now stuck in the middle of a huge, complex system with a lot of operations, and the velocity is not worth the investment.
You could argue that this is WHY microservices exist: split the system into multiple pieces and you get independent pieces of software. Each piece is “micro”, which means a team of 5 can handle it.
Microservices are clearly a solution that resonates with me, and I understand why someone would want to move in that direction. The downside of microservices is that they add a lot of complexity and ego to your systems.
By complexity, I mean that in order to solve ownership problems, microservices sacrifice performance and data flow simplicity. Why? Because they break the first rule of distributed systems, which is “do not distribute a system if you don’t need to”. [I’ve seen complex distributed systems serving millions of customers being rewritten as a monolith to remove complexity and gain performance.]
By ego, I mean that we now have multiple teams with multiple identities, multiple roadmaps, multiple cultures, multiple standards and processes. They might look in the same direction for the big strategic goals, but tactically each team will do whatever it wants.
This is when things start to get tricky: we create artificial work. Work that each team will want to do in order to make its life easier, to get promotions, to have fun, etc.
I understand that microservices are a solution, but I think there is another, simpler one: downsizing the team.
What is the perfect team size? And how do we increase the velocity of innovation if we can’t scale up the organization?
The solution I would like to offer is not to go for microservices, and not to try to build a platform, but instead to downsize the team.
If your velocity starts to plateau, go backward, not forward, in your hiring process. It means you have already reached your ideal size.
The size will depend on the team culture and the type of project. My opinion is that it should be around 10 to 30, not more. The usual sign that you have reached it is that you start to feel the need for an engineering manager.
Having a small team for an entire system allows you to do two big things that will keep velocity high: do without low and mid-level management (see the Peter principle), and keep things simple.
Keep it simple means: don’t create a service if you don’t need to. Don’t re-create a database abstraction/aggregation layer if you don’t need to. Don’t add complex corner-case features if you don’t need to. Don’t use a fancy programming language to do crazy shit if you don’t need to. Don’t use the latest technologies just because they look fun if you don’t need to. Don’t hire a manager if you don’t need to.
Keeping things simple is WHERE THE SOLUTION TO ALL SOFTWARE IS. Simplicity over complexity. Simplicity in our data structures and data flows, but also in our processes and our organization’s structure. Simplicity is the key.
If you aren’t satisfied with the time it takes to build the next feature, and you have already reached the “ideal” team size for a piece of software, what solution do you have? Wait, just wait. I’m sorry to have to say this to you, buddy, but your money will not accelerate anything; it will only make things worse.
I wrote a blog post about technical debt, but I forgot to mention: creating a complex system, even if it is super well done, is technical debt in itself.
I understand where the industry is heading. Being agile, federated, using microservices looks really fancy. This is why I’m an engineering manager: I want to be part of this, and I want to be proven wrong and see that those solutions are the keys to pushing our industry to the next level and increasing the overall pace of innovation around the world.
But until I’m convinced, here are the lessons I’ve learned by myself:
A team of 5 can achieve as much as, or more than, a team of 500 by keeping it simple (this is why a small startup can innovate faster than a huge company).
If you aren’t satisfied with the pace of your development and you are already a team of 10 for an entire system, either you hired the wrong people or you will have to learn to be patient.