How Code Storage Management Affects IT Product Competitiveness
The choice of repository model affects how quickly updates are delivered. A poor model can severely delay new features reaching the market and stall development processes. In this article, we explain the difference between the kinds of repositories and why the choice is more about team management than about the technology itself.
Leaving aside the finer details and hybrid options, code storage can be divided into two types:
- A mono repository is a single repository for all the product components. It includes libraries, business logic modules, interface, some service utilities, etc.
- Multi-repositories are distributed repositories for product components. In this case, the services are small and built individually, with quick code checks.
Both approaches work well and reflect essentially two philosophies that can be used to design IT projects. Read on to find out more about how they affect team processes, which style we follow and why.
How a mono repository works
It is important to note that a mono repository does not imply a monolithic application. Several microservices that use common code or libraries may well coexist in such a repository.
This approach is used by many global companies, e.g., Google, Yandex, and Microsoft. The latter abandoned distributed repositories in favor of a single one several years ago.
Advantages of a mono repository are as follows:
- It is easier to manage dependencies in the code. Developers can create common components to be used by different product modules. This helps keep the number of deployment packages under control.
- It is easier to identify errors and perform refactoring. It is much faster to clean up one repository than ten.
- Team members interact more with each other. In a single repository, a common approach develops faster, and there is more motivation to collaborate and work together on the product.
- It is easier to onboard new developers. The more repositories there are, the harder it is to understand the overall picture and the dependencies between the product modules.
But there are downsides, and they become more noticeable as the product grows. In a major project, the team will encounter performance issues: the repository becomes too slow to fetch or clone, standard git commands can take several seconds, and file searches become noticeably slower.
But the most unpleasant part is the difficulty with CI. When many people work in the same repository, constant builds slow delivery down severely and, in the worst cases, bring it to a complete halt, as the author of one post on the subject illustrates:
“Imagine your project's build time is 30 minutes. There are 100 developers working on the project, and each of them makes 1 commit per day. Here is the question: how many hours a day does the CI server [which only lets you merge changes into the master after they have been applied to the most recent commit] need to merge the changes from all developers? The correct answer is 50.”
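The arithmetic behind the quoted figure can be checked with a short sketch. It assumes, as the quote does, that builds run strictly one after another; the function and parameter names are our own illustration.

```python
def serialized_merge_hours(developers: int, commits_per_dev_per_day: int,
                           build_minutes: float) -> float:
    """Hours of CI time needed per day when every commit is built
    one at a time on top of the most recent master."""
    total_commits = developers * commits_per_dev_per_day
    return total_commits * build_minutes / 60


hours = serialized_merge_hours(developers=100, commits_per_dev_per_day=1,
                               build_minutes=30)
print(hours)       # 50.0
print(hours > 24)  # True: one day of commits cannot be merged within one day
```

With these numbers the queue grows by more than a day of work every day, which is why the quote describes the process as stalling.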
As a result, improperly configured CI noticeably increases development costs, lengthens time to market and, as a consequence, weakens the product's competitive position. The company overpays for unnecessary work, because the system rebuilds all services on every change. The multi repository helps solve these problems.
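To make "unnecessary work" concrete, here is a minimal sketch of how a build could be limited to the services a change actually touches. The directory layout (services/<name>/, libs/<name>/), the service names, and the dependency table are all hypothetical and are not taken from any real project or CI product.

```python
from pathlib import PurePosixPath

# Hypothetical map: which shared libraries each service depends on.
SERVICE_DEPENDENCIES = {
    "billing": {"libs/common"},
    "catalog": {"libs/common", "libs/search"},
    "notifications": set(),
}


def affected_services(changed_files, dependencies=SERVICE_DEPENDENCIES):
    """Return the sorted names of services that need a rebuild for the changed files."""
    affected = set()
    for path in changed_files:
        parts = PurePosixPath(path).parts
        if len(parts) < 2:
            continue  # top-level files (e.g. README) belong to no service
        top = "/".join(parts[:2])  # e.g. "services/billing" or "libs/common"
        for service, libs in dependencies.items():
            if top == f"services/{service}" or top in libs:
                affected.add(service)
    return sorted(affected)


print(affected_services(["services/billing/app.py"]))  # ['billing']
print(affected_services(["libs/common/util.py"]))      # ['billing', 'catalog']
```

A CI setup that rebuilds everything behaves as if this function always returned every service.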
How the multi repository works
As we said before, this approach assumes that each microservice resides in its own separate repository. The small size of the services, separate versioning, and quick code checks save the team from performance issues during delivery.
Among the proponents of this approach are Amazon and Netflix. Here at True Engineering, we have also been moving in this direction for some time now, and here's why.
The benefits of a multi repository:
- High performance in large projects. This is the main argument in favor of multi repositories, so it bears repeating. Amazon and Netflix, as mentioned above, use a distributed approach to release updates more often, get feedback from users faster, and strengthen their market position as a result.
- It's easier to test releases. You don't have to check every time to make sure everything works properly with other software. If something breaks, changes are very easy to roll back.
- It is easier to manage the life cycles of microservices. For the same reason, the fewer internal dependencies, the easier it is to pull out one component and add another.
- Ability to easily reuse components in other systems.
At True Engineering, there are no separate teams within a single product. We aim to build a Trunk Based Development model that assumes shared ownership of the code. In distributed repositories, shared logic is harder to isolate. To let developers reuse code, we extract shared components into separate libraries that can be plugged into any service in any repository.
It is worth noting that even in a mono repository you cannot simply link code and reuse classes from one microservice in another. Doing so would mean that these services have to be built and updated at the same time, so the pluggable library approach saves team effort here as well.
Shared libraries need to be managed carefully so as not to break public APIs that are already in use across multiple services. We solve this problem with library versioning and contract tests.
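As an illustration of these two safeguards, here is a minimal sketch of a version-compatibility check and a contract test. The format_money function is a made-up stand-in for a shared library's public API, and the version rule follows the common semantic versioning convention; neither is a description of our actual libraries.

```python
import inspect


# Hypothetical shared-library function used by several services.
def format_money(amount_cents: int, currency: str = "USD") -> str:
    return f"{amount_cents / 100:.2f} {currency}"


def parse_version(version: str) -> tuple:
    """Turn 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_compatible(required: str, available: str) -> bool:
    """Compatible if the major version matches and available is not older than required."""
    req, avail = parse_version(required), parse_version(available)
    return avail[0] == req[0] and avail >= req


def check_format_money_contract() -> None:
    """Contract test: fails if the public signature or output format changes."""
    params = list(inspect.signature(format_money).parameters)
    assert params == ["amount_cents", "currency"]
    assert format_money(1250) == "12.50 USD"
    assert format_money(5, "EUR") == "0.05 EUR"


check_format_money_contract()
print(is_compatible("1.2.0", "1.4.1"))  # True
print(is_compatible("1.2.0", "2.0.0"))  # False
```

A consuming service would run such a contract check in its own pipeline, so a breaking library change is caught before release.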
In addition, you end up with more versions of different components, and they need to be managed globally somehow. It becomes harder for developers to see the overall picture, and code checks may be less effective than we would like. Several components can get into a release independently of each other, and we need to know which tasks end up in the build. This is where Azure DevOps and our platform customizations on top of it come to the rescue. Our CI/CD pipeline versions our applications and tags each related task with the name of the environment where the change was deployed, adding the version of the corresponding service when needed.
At the same time, we have to note that these difficulties can be overcome if teams have common processes, share their experience, and have a development platform. For example, we at True Engineering have been actively introducing microservice templates over the last year, and multi repositories are helping us to industrialize them. The template already includes settings for the pipelines and the build and deployment parameters, so all the developer has to do is write the business logic.
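The tagging idea can be sketched as follows. This is an illustrative model only: the Task class, the tag format, and the function names are assumptions, and the sketch deliberately makes no Azure DevOps API calls.

```python
from dataclasses import dataclass, field


@dataclass
class Task:
    """Hypothetical stand-in for a work item in the tracker."""
    task_id: int
    tags: set = field(default_factory=set)


def deployment_tag(environment, service=None, version=None):
    """Build a tag from the environment name, plus service version when given."""
    if service and version:
        return f"{environment}:{service}@{version}"
    return environment


def tag_deployed_tasks(tasks, environment, service=None, version=None):
    """Attach the deployment tag to every task included in the deployed build."""
    tag = deployment_tag(environment, service, version)
    for task in tasks:
        task.tags.add(tag)
    return tag


tasks = [Task(101), Task(102)]
print(tag_deployed_tasks(tasks, "staging", "billing", "1.4.2"))  # staging:billing@1.4.2
print(deployment_tag("production"))                               # production
```

With tags like these, anyone looking at a task can see which environment its change has reached and in which service version.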
However, mono repository performance problems will not be overcome by the right processes. When the product grows to a certain level, it will physically stop building quickly.
Our experience and findings
Ultimately, the choice of approach depends on what is more important to the team - ease of development or ease of delivery and use.
Developers may find it more difficult to work with multiple repositories, but the business benefits greatly from this approach. A short time to market, constant releases, and even advanced system monitoring that shows whether performance has dropped after a release all add value to the product for the end user.
That is why we opt for multi repositories in our projects and reduce the difficulties for teams through common standards and platform processes. These include the shared libraries already mentioned, microservice and project templates, tests, etc.
However, only a few of our products currently live in separate repositories. This is partly because many of our systems were created almost 10 years ago, when the world had very different ideas about development. Nowadays releases are built with scripts, whereas back then the concept of continuous development was only taking shape and the discussion about DevOps methodology had just begun. After all, Docker appeared in 2013 and Kubernetes in 2014.
Besides, remember that for small projects a mono repository really is more convenient for the team. And splitting the repository of a product that is already running is a costly task.
That is why we are implementing the multi-repository approach in new products - there were about five of them this year, each with a dozen microservices. These projects take full advantage of our platform, and even independent teams follow the same processes, which are often automated.