Microservices Aren't the Goal: What We Check Before Splitting a Monolith


Most “we should move to microservices” conversations start as architecture debates, but they’re almost always driven by operational pain. Releases feel fragile. Incidents take longer to diagnose. Scaling one busy area means scaling everything. Coordination costs grow faster than the product.

Over time, we’ve learned to treat microservices as a tool that you pick to remove a specific constraint, not as a badge of maturity. The most useful starting question is blunt: what outcome is the current architecture blocking today, and is distribution really the cheapest way to unlock it?

One of the clearest “sanity check” reminders I’ve heard recently came from Daniil Koshelev, a Senior Software Engineer & SRE who works on Go-based infrastructure and high-load systems. In a DevSecOps keynote he distilled the migration problem into a few operational truths: microservices only pay off when the monolith genuinely stops coping with growth; the system must be designed to survive partial failures; the real migration starts with data boundaries, not code; and weak boundaries turn the whole thing into a distributed monolith.

Start with the failure mode, not the architecture

When teams say “the monolith doesn’t scale,” they often mean different things. Sometimes it’s pure performance, but more often it’s deploy risk, unclear ownership, or coupled change. If you can’t name the constraint in practical terms, microservices won’t help; they’ll simply give you more moving parts to coordinate.

A useful mental shift is to assume that once you split services, you’ve made the network part of your product. Failures are normal, latency is variable, and a dependency being “mostly up” is still a problem. That’s why reliability patterns matter early. Microsoft’s guidance on the Circuit Breaker pattern is a good example of the mindset: you intentionally stop calling a failing dependency to prevent cascading failure and give the system room to recover. The pattern only works well when paired with disciplined timeouts and carefully scoped retries (especially with idempotency), rather than “retry until the incident page updates.”
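To make that mindset concrete, here is a minimal Go sketch: a hand-rolled breaker that fails fast after repeated failures, wrapped around an HTTP call with an explicit deadline. The threshold, cooldown, and endpoint are invented for illustration, and a production system would more likely reach for a maintained library than this toy.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"net/http"
	"sync"
	"time"
)

// ErrOpen is returned while the breaker is refusing calls.
var ErrOpen = errors.New("circuit open: dependency presumed unhealthy")

// Breaker is a deliberately minimal circuit breaker: after
// `threshold` consecutive failures it stops calling the dependency
// for `cooldown`, giving it room to recover.
type Breaker struct {
	mu        sync.Mutex
	failures  int
	threshold int
	cooldown  time.Duration
	openUntil time.Time
}

func (b *Breaker) Call(fn func() error) error {
	b.mu.Lock()
	if time.Now().Before(b.openUntil) {
		b.mu.Unlock()
		return ErrOpen // fail fast instead of piling load on a sick dependency
	}
	b.mu.Unlock()

	err := fn()

	b.mu.Lock()
	defer b.mu.Unlock()
	if err != nil {
		b.failures++
		if b.failures >= b.threshold {
			b.openUntil = time.Now().Add(b.cooldown)
			b.failures = 0
		}
		return err
	}
	b.failures = 0 // a success resets the count
	return nil
}

func main() {
	breaker := &Breaker{threshold: 3, cooldown: 10 * time.Second}

	// Every outbound call gets an explicit deadline; "no timeout"
	// is how one slow dependency stalls a whole request path.
	ctx, cancel := context.WithTimeout(context.Background(), 500*time.Millisecond)
	defer cancel()

	err := breaker.Call(func() error {
		req, err := http.NewRequestWithContext(ctx, http.MethodGet,
			"http://inventory.internal/healthz", nil) // hypothetical endpoint
		if err != nil {
			return err
		}
		resp, err := http.DefaultClient.Do(req)
		if err != nil {
			return err
		}
		defer resp.Body.Close()
		if resp.StatusCode >= 500 {
			return fmt.Errorf("dependency returned %d", resp.StatusCode)
		}
		return nil
	})
	fmt.Println("call result:", err)
}
```

The design choice that matters is the fail-fast path: while the breaker is open, callers get an immediate error instead of queueing behind a sick dependency. Retries, where you add them, belong only on operations that are safe to repeat.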

Data boundaries decide whether you get autonomy or just new pain

If you want microservices to deliver independent delivery and fault isolation, you can’t treat data as an afterthought. Microsoft’s microservices data considerations page makes the coupling problem explicit: services can share the same physical database server, but trouble starts when they share the same schema/tables and read/write the same data structures. In practice, shared schemas pull you back into coordinated releases, make incidents harder to contain, and quietly erase the autonomy you were trying to buy.

This is also why “data sovereignty” shows up so often in modern microservices guidance: each service should own its domain data and logic, stay autonomous, and be independently deployable. If you can’t describe what a service truly owns, data included, then you’re not carving a boundary; you’re drawing a box on a diagram.
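Here is a minimal Go sketch of what that ownership looks like at the code level (the types and names are invented for illustration). The billing side can only reach order data through the contract the orders side chooses to expose, so the orders schema can change without a coordinated release. In a real system the boundary would be an API or events rather than an in-process interface, but the ownership rule is the same.

```go
package main

import (
	"errors"
	"fmt"
)

// Order is the shape OrdersService chooses to expose. The storage
// behind it (schema, tables, indexes) is private to that service.
type Order struct {
	ID    string
	Total int // cents
}

// OrderReader is the only contract BillingService may depend on.
type OrderReader interface {
	GetOrder(id string) (Order, error)
}

// OrdersService owns its data outright: the map stands in for a
// schema no other service may read or write.
type OrdersService struct {
	store map[string]Order // private; never shared
}

func (s *OrdersService) GetOrder(id string) (Order, error) {
	o, ok := s.store[id]
	if !ok {
		return Order{}, errors.New("order not found")
	}
	return o, nil
}

// BillingService holds only a reference to the contract. It cannot
// reach into OrdersService's storage, so the orders schema can
// change without a coordinated release.
type BillingService struct {
	orders OrderReader
}

func (b *BillingService) Invoice(orderID string) (string, error) {
	o, err := b.orders.GetOrder(orderID)
	if err != nil {
		return "", err
	}
	return fmt.Sprintf("invoice for %s: %d cents", o.ID, o.Total), nil
}

func main() {
	orders := &OrdersService{store: map[string]Order{
		"o-1": {ID: "o-1", Total: 4200},
	}}
	billing := &BillingService{orders: orders}

	inv, err := billing.Invoice("o-1")
	fmt.Println(inv, err)
}
```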

Observability isn’t an upgrade; it’s the price of distribution

In a monolith, debugging is often “find the logs, reproduce, fix.” In microservices, debugging is “reconstruct the story of a request across hops.” OpenTelemetry’s tracing concepts capture the core point simply: traces show the full path of a request through a system, monolith or mesh, and become essential once a single user action fans out across components.
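Here is roughly what that looks like in Go with the OpenTelemetry tracing API: a parent span for the user action, a child span for one hop, and W3C trace-context propagation into an outbound request so the next service can continue the same trace. The service and endpoint names are invented, and the stdout exporter is only there so the sketch needs no backend; a real deployment would export to a collector.

```go
package main

import (
	"context"
	"fmt"
	"log"
	"net/http"

	"go.opentelemetry.io/otel"
	"go.opentelemetry.io/otel/attribute"
	"go.opentelemetry.io/otel/exporters/stdout/stdouttrace"
	"go.opentelemetry.io/otel/propagation"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	// Export finished spans to stdout so the sketch needs no backend.
	exp, err := stdouttrace.New(stdouttrace.WithPrettyPrint())
	if err != nil {
		log.Fatal(err)
	}
	tp := sdktrace.NewTracerProvider(sdktrace.WithBatcher(exp))
	defer func() { _ = tp.Shutdown(context.Background()) }()

	otel.SetTracerProvider(tp)
	// W3C trace-context headers are what let the trace survive the
	// hop between services.
	otel.SetTextMapPropagator(propagation.TraceContext{})

	tracer := otel.Tracer("checkout-service") // name is illustrative

	// Parent span: the user action.
	ctx, span := tracer.Start(context.Background(), "checkout")
	span.SetAttributes(attribute.String("order.id", "o-1"))

	// Child span: one hop in the fan-out. Because it shares ctx, it
	// is recorded as part of the same trace.
	_, child := tracer.Start(ctx, "charge-card")
	child.End()

	// Crossing a service boundary: inject the trace context into the
	// outbound headers so the next service can continue the story.
	req, _ := http.NewRequestWithContext(ctx, http.MethodPost,
		"http://payments.internal/charge", nil) // hypothetical URL
	otel.GetTextMapPropagator().Inject(ctx, propagation.HeaderCarrier(req.Header))
	fmt.Println("propagated:", req.Header.Get("traceparent"))

	span.End()
}
```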

This isn’t just a tooling trend; it reflects operational reality at scale. CNCF’s 2023 annual survey notes that observability gets more challenging as container counts rise, and it highlights growing adoption of projects like Prometheus and OpenTelemetry. Standardization efforts matter here, too: Google’s note about OpenTelemetry tracing reaching a stable 1.0 milestone framed it as enabling teams to adopt standardized tracing with more confidence.

If your incident response currently struggles to answer “what happened?” quickly, adding services won’t fix that. It will multiply the number of places where the answer can hide.

Security scales with your endpoint count, whether you plan for it or not

Microservices usually mean more APIs, more identities, and more policy decisions. That expands the attack surface and increases the odds of inconsistency. The OWASP API Security Top 10 (2023) is a useful reality check here because it’s not theoretical; it’s a map of what breaks repeatedly in modern API-driven systems, starting with object-level authorization failures. The architectural takeaway is straightforward: if you’re going to multiply interfaces, you need a security posture that scales with them, not a collection of one-off implementations.
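To make the first item concrete: broken object-level authorization is a per-request, per-object check, not something a gateway switches on once. A minimal Go sketch, with identity handling and data as obvious stand-ins rather than production code:

```go
package main

import (
	"fmt"
	"log"
	"net/http"
)

// ownerOf maps object IDs to owning user IDs: a stand-in for a real
// authorization lookup. Names and storage are invented for illustration.
var ownerOf = map[string]string{"inv-1": "alice"}

// userFromRequest stands in for real authentication middleware
// (session, JWT, mTLS identity). Trusting a header keeps the sketch
// runnable; never do this in production.
func userFromRequest(r *http.Request) string {
	return r.Header.Get("X-User") // placeholder for verified identity
}

func invoiceHandler(w http.ResponseWriter, r *http.Request) {
	id := r.URL.Query().Get("id")
	user := userFromRequest(r)

	// The object-level check OWASP API1:2023 is about: knowing a
	// valid ID must not be enough; the caller must be authorized for
	// *this* object, checked server-side on every request.
	if owner, ok := ownerOf[id]; !ok || owner != user {
		http.Error(w, "forbidden", http.StatusForbidden)
		return
	}
	fmt.Fprintf(w, "invoice %s for %s\n", id, user)
}

func main() {
	http.HandleFunc("/invoices", invoiceHandler)
	// Try: curl -H 'X-User: alice' 'http://localhost:8080/invoices?id=inv-1'
	log.Fatal(http.ListenAndServe(":8080", nil))
}
```

The point of the sketch is where the check lives: inside the handler, against the specific object requested, on every call. Multiply that across dozens of services and the case for shared middleware and consistent policy makes itself.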

What I look for before I split anything

I try to keep my decision criteria boring. I want a clearly named constraint (deploy risk, ownership friction, isolated scaling), evidence that we can enforce data ownership boundaries rather than sharing schemas, a baseline reliability posture (timeouts, breakers, disciplined retries), and an observability setup that can actually explain cross-service behavior. When those pieces are missing, a modular monolith or a smaller step often buys more progress than a premature jump into distribution.

Microservices can be the right move. But when they work, it’s rarely because “microservices are modern.” It’s because the team treated distribution as an operational trade, paid for reliability and observability up front, and drew boundaries (especially data boundaries) that stayed real under pressure.