Going fast, by investing deep

Petr Janda
8 min read · Apr 4, 2022

Early-stage software startups have a hard job to do. With much less capital than their well-established competitors, they are racing against time: either iterate their way to product-market(-channel-model) fit or die.

From the very first day, the sense of urgency is real, and the team has to start shipping fast. Speed of delivery is everything. New features have to reach users as fast as possible, setting other concerns aside. There is no time to put a solid technology foundation in place. What’s the point of losing time with that, right? We don’t know what users want yet; it’s more important to ship features.

This approach is certainly one way to do it. I am just not sure it’s the most efficient one.

With little to no foundation in place, the team risks taking incoherent directions, losing time reinventing the basics, or generating unnecessary amounts of technical debt. It feels like a missed opportunity to sit down and think twice about what would make the team the most productive in the first place.

Lessons from the later stage

I’ve interviewed dozens of technology leaders who run teams with hundreds of engineers, noticing an interesting pattern. When the team reaches between 80 and 100 engineers, they create a so-called Platform organization — a group of teams tasked to drive the technology ecosystem for other engineering teams. Over time these teams grow and can end up at around 20% of the technology organization.

The typical trigger for such an investment is diminishing productivity. Adding new engineers no longer has the impact it used to. Teams often reinvent the wheel, lacking a common foundation to build on. Platform teams are formed to address that, and it usually takes months of hard work to implement meaningful organization-wide changes — they have to drive technology changes across dozens of teams.

But besides the technology itself, they have another challenge at hand: the culture has to shift too. Investments in technology platforms are new to the organization. They take time and budget away from building the end-user product. The ROI is long-term and harder to measure. Platform teams and their leadership have to shift the status quo of what is valued and rewarded across the organization.

Not an easy job indeed.

And it makes me think: Do we have to wait till we have 80 engineers? Wouldn’t it be better to seed a different culture from the get-go?

Layers of a technology system

When I think about a healthy technology system that enables fast development of new functionality, I conceptually separate three essential layers:

  • The underlying Technology Platform — infrastructure, build systems, deployment automation, system architecture, API management, and other standard libraries that solve everyday problems that nearly any engineering team has.
  • Core System Domain — the heart of the software, that defines key system entities and how they tie together, modeling the real world into the software.
  • User Experiences — the end-user-facing products that deliver value to end-users.

These layers are fundamentally different. The amount of unknowns differs vastly too.

In the early days, startups don’t know what experience works best to solve users’ needs. They will need to experiment with and iterate on user experiences a lot. And fast.

In comparison, teams can make very opinionated choices very early regarding the Technology Platform that fits the type of market and product they intend to build.

Lastly, the core system domain is a vital part of the system, where some experimentation will be needed. The quality of the model will often dictate how easy or hard it is for the team to build the features that users demand. It could be a real uphill battle if the features don’t “fit” the chosen model.

With the above in mind, we can formulate the following approach:

Start investing in the Technology Platform (very) early.
Invest time to find a solid core domain model.
Iterate fast on the user experience.

Regardless of whether it’s 5, 50, or 500 people, investment in the Technology Platform has several positive effects.

It prevents teams from losing time by reinventing the same concepts twice.
  • Do you want to deploy code to the Kubernetes cluster? ✔️ there is a standard way for that.
  • Do you want to set up new log monitoring? ✔️ there is a standard way for that.
  • Do you want to define and release a new API? ✔️ there is a standard way for that.
  • Do you want to send an event across systems? ✔️ there is a standard way for that.

Nearly any team will have all these problems. Over and over again.

Solving the problems we know we will keep having over and over, just once, is a no-brainer. Solving them in an opinionated way is a massive productivity gain and an investment well worth the time.

It has a positive effect on engineers’ well-being too. Doing repetitive work and fighting your core systems is not fun. The concept of toil from the SRE book is real. Coding in a growing system without a strong structure is not fun either. Getting ahead of this early means we can get a lot done with a smaller, happier, and more productive team.

The team, supported by a sound technology platform and a solid core system domain model, can spend more time shipping features for end-users. Their investment in the platform and domain makes them fast at iterating on the user experience.

Early-stage technology platform in practice

Let’s dive deeper into a more specific example. It’s important to stress that the choices below are our choices; there are many alternatives you can use. The important part is deciding what system you want to build and defining what technology platform you will need from the get-go.

The Technology Platform defines what system you can build. It makes some things easy and some things hard. The right technology platform that fits your product will get you closer to the technology system you want to have.

In our case, we have chosen to build a system with the following properties:

  • It’s automated. Let’s face it; no one wants to do the boring stuff. Yes, you can leave a few manual actions here and there. But you can also choose not to tolerate any. We opt for the latter and pay the price.
  • It’s composed of small parts. We first build simple systems and then compose them into complex ones. We prefer a few smaller systems over one big one. Some people call this a micro-services approach.
  • It’s loosely coupled. With a few notable exceptions, we like to think about system design as if everything around us is on fire. We prefer asynchronous communication between systems, ensuring that the other system can do its job when ready, maybe a little later.
  • It’s opinionated. If there are two good ways to do something, we pick one and stick with it. It helps us focus on other complex problems.
  • It’s optimized for productivity. Repetitive work is systematically phased out.

This dictates several choices we have made to make the above reality.

Heavy investments in infrastructure automation

We’ve heavily invested in infrastructure as code, which means everything from inviting colleagues to Github, complete deployment of Auth0, GCP networking, clusters, Kubernetes, Kafka, and storage is entirely automated. Anyone on the team can execute it from start to finish.
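The post doesn’t name the infrastructure-as-code tool in use, so as a rough sketch of the idea (not their actual setup), here is what a minimal definition could look like with Pulumi’s Go SDK: a single GCS bucket stands in for the real networking, cluster, and Kafka resources, and `pulumi up` reconciles the declared state against GCP, so anyone on the team can run it end to end.

```go
package main

import (
	"github.com/pulumi/pulumi-gcp/sdk/v7/go/gcp/storage"
	"github.com/pulumi/pulumi/sdk/v3/go/pulumi"
)

func main() {
	pulumi.Run(func(ctx *pulumi.Context) error {
		// One declarative resource definition; running `pulumi up`
		// creates or updates it so no manual console clicking is needed.
		bucket, err := storage.NewBucket(ctx, "build-artifacts", &storage.BucketArgs{
			Location: pulumi.String("EU"),
		})
		if err != nil {
			return err
		}
		// Export the bucket URL so other stacks or scripts can consume it.
		ctx.Export("bucketURL", bucket.Url)
		return nil
	})
}
```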

Opinionated micro-service architecture

Micro-services have become fashionable in the tech industry, but we wanted to go the extra mile and create an ecosystem in which building micro-services is natural.

We define every interaction across systems in Protobuf and gRPC — we start from the protocol and generate client/server code. There is no other way to call across micro-services. We use grpc-web to bridge types from the deepest parts of the systems to the UI component.
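As a sketch of what contract-first gRPC looks like from the caller’s side (the package, service, and message names below are hypothetical, not taken from the post), the client generated from the .proto contract is the only entry point for cross-service calls:

```go
package main

import (
	"context"
	"log"
	"time"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"

	// Hypothetical package generated by protoc from an orders.proto contract.
	pb "example.com/platform/gen/orders/v1"
)

func main() {
	// The generated client, not hand-written HTTP glue, is how services talk.
	conn, err := grpc.Dial("orders.internal:50051",
		grpc.WithTransportCredentials(insecure.NewCredentials()))
	if err != nil {
		log.Fatalf("dial: %v", err)
	}
	defer conn.Close()

	client := pb.NewOrderServiceClient(conn)

	ctx, cancel := context.WithTimeout(context.Background(), 2*time.Second)
	defer cancel()

	// Request and response types come from the same .proto file that
	// grpc-web reuses in the browser, so types flow end to end.
	resp, err := client.GetOrder(ctx, &pb.GetOrderRequest{Id: "ord_123"})
	if err != nil {
		log.Fatalf("GetOrder: %v", err)
	}
	log.Printf("order status: %s", resp.GetStatus())
}
```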

All micro-services have built-in async messaging. We believe being primarily asynchronous is an integral part of building micro-services — do your stuff and tell the world about it. Synchronous calls across services are infrequent; only a low single-digit percentage of our service communication is synchronous.

To build a solid asynchronous backbone, we use Kafka with opinionated configurations, ranging from low-volume, high-risk scenarios, where we block message publishing and wait for the ACK, to high-volume, low-risk scenarios, where we pump data to Kafka as fast as Kafka can take it. Publishing and consuming messages (typed by protobuf, of course) is easy.
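A sketch of what those two ends of the spectrum could look like with the segmentio/kafka-go client (the library choice and topic names are assumptions; the post doesn’t say which Kafka client it uses):

```go
package main

import (
	"context"
	"log"

	"github.com/segmentio/kafka-go"
)

// Low-volume, high-risk profile: block until every in-sync replica has ACKed.
func newCriticalWriter(brokers []string) *kafka.Writer {
	return &kafka.Writer{
		Addr:         kafka.TCP(brokers...),
		Topic:        "billing.events",
		RequiredAcks: kafka.RequireAll, // wait for full acknowledgement
		Async:        false,            // WriteMessages blocks until the ACK arrives
	}
}

// High-volume, low-risk profile: fire-and-forget, as fast as Kafka can take it.
func newFirehoseWriter(brokers []string) *kafka.Writer {
	return &kafka.Writer{
		Addr:         kafka.TCP(brokers...),
		Topic:        "analytics.events",
		RequiredAcks: kafka.RequireNone, // don't wait for broker acknowledgement
		Async:        true,              // buffer and flush in the background
	}
}

func main() {
	w := newCriticalWriter([]string{"kafka-0:9092"})
	defer w.Close()

	// In the setup the post describes, the payload would be protobuf-encoded.
	payload := []byte("serialized protobuf event")
	if err := w.WriteMessages(context.Background(),
		kafka.Message{Key: []byte("invoice-42"), Value: payload},
	); err != nil {
		log.Fatalf("publish: %v", err)
	}
}
```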

Authentication and authorization are by the book. It just doesn’t make sense to innovate when it comes to security. All systems use JWT tokens with self-descriptive scopes and metadata to authorize requests without calling other systems.
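A minimal sketch of what scope-based authorization from a self-descriptive JWT could look like, assuming a claims layout with a `scopes` array and HMAC signing for brevity (a real Auth0 setup would typically verify an RS256 signature against a public key instead):

```go
package auth

import (
	"errors"
	"strings"

	"github.com/golang-jwt/jwt/v5"
)

// Claims is an assumed layout; the post only says tokens carry
// self-descriptive scopes and metadata, not the exact field names.
type Claims struct {
	Scopes []string `json:"scopes"`
	jwt.RegisteredClaims
}

// Authorize verifies the token locally and checks the required scope,
// so no call to another system is needed to authorize a request.
func Authorize(tokenString, requiredScope string, key []byte) error {
	claims := &Claims{}
	token, err := jwt.ParseWithClaims(tokenString, claims, func(t *jwt.Token) (interface{}, error) {
		// Reject anything but the expected signing algorithm family.
		if _, ok := t.Method.(*jwt.SigningMethodHMAC); !ok {
			return nil, errors.New("unexpected signing method")
		}
		return key, nil
	})
	if err != nil || !token.Valid {
		return errors.New("invalid token")
	}
	for _, s := range claims.Scopes {
		if strings.EqualFold(s, requiredScope) {
			return nil
		}
	}
	return errors.New("missing scope: " + requiredScope)
}
```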

There certainly is a lot more to building micro-services. The intention is not to solve all the potential problems up front. Instead, we focus on problems we know we have. We will expand our approach as we mature our product and overall technology system. The key is that we make explicit opinionated choices and codify them into core libraries and dev tools.

Early investment in tools and core libraries

We want to be equipped with the right developer tools to go fast. We buy many tools, but we have rolled up our sleeves and built our own tools to glue them all together into a cohesive experience.

If humans are good at something, it’s tool building. We build tools to make previously hard things easy.

This is no different in software.

Our tooling revolves around automating everyday tasks and the life-cycle of a micro-service. We bundle all the good stuff together to facilitate the development of a cohesive technology platform.

A new micro-service can be created very fast with a single action in our internal dev tools.

This action generates a basic micro-service. It comes with default libraries for database access, migration management, authentication, authorization, Kafka management, API handler management for both gRPC and HTTP, an internal job scheduler with backoff retries and workers that can be distributed across many pods, logging, and environment variable management. The generated codebase can then be committed to our mono-repo, where CI/CD parses the configuration to determine which steps to run and how to deploy to our Kubernetes cluster.
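To make the “backoff retry” default concrete, here is an illustrative sketch of the kind of helper such a generated service might bundle; the real scheduler isn’t shown in the post, so the name and parameters below are assumptions:

```go
package jobs

import (
	"context"
	"math/rand"
	"time"
)

// Retry runs job up to `attempts` times, sleeping with exponential backoff
// and jitter between failures, and respects context cancellation.
func Retry(ctx context.Context, attempts int, base time.Duration, job func(context.Context) error) error {
	var err error
	delay := base
	for i := 0; i < attempts; i++ {
		if err = job(ctx); err == nil {
			return nil
		}
		// Jitter keeps many workers from retrying in lockstep.
		jitter := time.Duration(rand.Int63n(int64(delay)/2 + 1))
		select {
		case <-time.After(delay + jitter):
			delay *= 2
		case <-ctx.Done():
			return ctx.Err()
		}
	}
	return err
}
```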

Our internal dev tools give us easy-to-use tooling with opinions and built-in best practices. Their codebase is lean and straightforward, yet they already do a lot of heavy lifting.

And so, in a couple of minutes, we have a running micro-service. This approach lives up to all fundamental system properties we wanted — it is an automated, opinionated, and productivity-optimized way to work towards a composable, loosely coupled system.

Path forward

Our current technology platform addresses some of our already repetitive problems. The cost so far has been about 10 days of work, and we are getting great value for this relatively small investment — we can spend more time building user experiences.

Even better, we are seeding a culture of productivity from the beginning. Throughout our journey, we want to think critically about where our delivery speed comes from. Sometimes it makes sense to ship more features, sometimes to take a step back and expand our technology platform first.

— —

Get in touch if you’d like to geek out about engineering productivity and technology platforms. I am at petr@synq.to.
