How to Build Microservices Without Creating a Distributed Monolith
Abstract
Microservices promise independent deployability, team autonomy, and better scalability. But many organizations end up trading a single monolith for a distributed monolith: a system composed of many services that are tightly coupled, deployed together, and brittle. This article is a comprehensive guide to designing, implementing, and operating microservice architectures that avoid the distributed-monolith anti-pattern. It covers history, theory, practical techniques, patterns, tooling, migration strategies, and a concrete worked example (Order / Inventory / Payment) with sample code and operational considerations.
Contents
- History and motivation
- What is a distributed monolith?
- Causes and common anti-patterns
- Theoretical foundations and principles
- Architecture and design patterns to avoid a distributed monolith
- Practical techniques and examples
  - Domain boundaries and decomposition
  - Data ownership strategies (DB-per-service, CDC, event sourcing)
  - Interservice communication (sync vs async, pub/sub, idempotency)
  - Transactional patterns: Sagas, compensation, distributed transactions
  - Contracts, testing, and CI/CD
  - Observability and resilience
  - Deployment, platform, and team organization
- A worked example: Order / Inventory / Payment (sync vs async, saga)
- Migration strategy: Strangler, incremental extraction checklist
- Detecting a distributed monolith and common pitfalls
- Best-practice checklist
- Future directions and implications
- Conclusion
History and motivation
Microservices emerged in the 2010s as an evolution of service-oriented architecture (SOA), driven by the need to scale development and operations across many teams. Key drivers:
- Faster innovation and independent deployability
- Smaller codebases, clearer ownership
- Polyglot technologies and scaling specific components
- Cloud-native deployments and container orchestration (Docker + Kubernetes)
However, distributed systems are harder than monoliths. Many organizations learned the hard way: splitting a monolith into services without addressing coupling, data ownership, or organizational alignment created a system that is distributed but still monolithic in coupling and release cadence — the "distributed monolith".
What is a distributed monolith?
A distributed monolith is an architecture composed of multiple services that, in practice, behave like a single monolith due to tight coupling. Hallmarks:
- Services are deployed together or in lockstep
- Functional changes require coordinated releases across services
- Strong synchronous dependence (call chains) between services
- Shared database schemas, direct DB access from multiple services
- Shared libraries containing business logic used by many services
- High runtime coupling and cascading failures
A distributed monolith retains many of the operational and organizational drawbacks of a monolith while also suffering the complexity of distributed systems.
Causes and common anti-patterns
Why distributed monoliths happen:
- Poor domain decomposition (wrong boundaries)
- Shared database or schema coupling
- Chatty, synchronous APIs (long call chains)
- Over-reliance on orchestration that centralizes logic and tightens coupling
- Extensive shared code and libraries with business logic
- Teams organized by technology rather than business capability (Conway’s Law)
- Insufficient test isolation and contract testing
- Lack of asynchronous decoupling (events) or poorly designed event flows
- Over-reliance on fragile distributed transactions (two-phase commit)
Theoretical foundations and guiding principles
- Domain-Driven Design (DDD): Use bounded contexts to align services with business capabilities (Eric Evans).
- Conway’s Law: Organization structure influences system architecture; match team boundaries to service boundaries.
- Fallacies of Distributed Computing: Assume unreliable networks and design for partial failure.
- CAP Theorem and ACID vs BASE: Accept trade-offs — strong consistency across distributed services is expensive.
- Single Responsibility & High Cohesion: Services should have a narrow, cohesive responsibility.
- Loose Coupling & Explicit Contracts: Minimize synchronous dependencies and define clear, versioned APIs.
Architecture and design patterns to avoid a distributed monolith
Key patterns and anti-patterns, with guidance:
- Bounded Contexts and Proper Decomposition
  - Use domain modeling and event storming to identify boundaries.
  - Each microservice should represent a business capability with clear responsibility.
- Database-per-Service (with caveats)
  - Prefer independent data stores or schemas per service to avoid coupling.
  - Use CDC (change data capture) and event-driven replication when needed.
  - Avoid direct cross-service DB queries.
- Asynchronous Communication & Event-Driven Architecture
  - Prefer pub/sub or message brokers for decoupling and eventual consistency.
  - Use events to notify other services of state changes.
- Sagas and Compensation for Transactions
  - Replace distributed two-phase commits with saga choreography or orchestration.
  - Use durable workflow engines (Temporal, Cadence, Conductor) for complex flows when necessary.
- Contract-First APIs and Consumer-Driven Contracts
  - Use contract testing (Pact, Spring Cloud Contract) so producers and consumers can evolve independently.
- Observability & Distributed Tracing
  - Instrument traces, logs, and metrics (OpenTelemetry, Jaeger, Prometheus).
- Resilience Patterns
  - Timeouts, retries with exponential backoff, circuit breakers, bulkheads, rate limiting.
  - Design for graceful degradation.
- Service Mesh and Sidecars (use carefully)
  - Service mesh (Istio, Linkerd) can enforce policies and telemetry; don't use it as a substitute for good architecture.
- Team and Organizational Structure
  - Align teams to services (two-pizza teams) and provide platform capabilities for consistency.
Practical techniques and examples
Domain boundaries and decomposition
- Techniques:
  - Event storming workshops
  - Domain-Driven Design (identify bounded contexts)
  - Value stream mapping (identify slow/critical flows)
- Outputs:
  - Context maps with upstream/downstream relationships
  - Service candidate list and responsibilities
- Heuristics:
  - If two components often change together, consider keeping them in the same service.
  - If one capability needs independent scaling, it’s a good candidate for a service.
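One way to test the "change together" heuristic is to look at version-control history. The sketch below assumes you have already extracted, for each commit, the list of components it touched; the `commits` array and the component names are invented for illustration. It counts how often each pair of components appears in the same commit.

```js
// Count how often pairs of components change in the same commit.
// `commits` is hypothetical input: one array of component names per commit.
function coChangeCounts(commits) {
  const counts = new Map();
  for (const touched of commits) {
    const unique = [...new Set(touched)].sort();
    for (let i = 0; i < unique.length; i++) {
      for (let j = i + 1; j < unique.length; j++) {
        const pair = `${unique[i]}+${unique[j]}`;
        counts.set(pair, (counts.get(pair) || 0) + 1);
      }
    }
  }
  // Most frequently co-changing pairs first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

const commits = [
  ['orders', 'pricing'],
  ['orders', 'pricing', 'inventory'],
  ['payments'],
  ['orders', 'pricing'],
];
console.log(coChangeCounts(commits));
```

Pairs that dominate the top of this list are candidates for living in the same service; a boundary drawn between them is likely to force coordinated releases.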
Data management strategies
- Database-per-service:
  - Pros: independence, performance optimization, schema evolution
  - Cons: eventual consistency, data duplication
- Change Data Capture (CDC):
  - Tools: Debezium + Kafka
  - Use case: replicate data to other services or event streams when you cannot change the producer.
- Event sourcing:
  - Keep a sequence of events as the source of truth.
  - Use for auditability and reconstructing state; introduces complexity.
- CQRS (Command Query Responsibility Segregation):
  - Separate write (command) and read models; allows read optimization and decoupling.
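To make event sourcing and CQRS more concrete, here is a minimal in-memory sketch. The event names (`StockAdded`, `StockReserved`), the fields, and the function names are assumptions made for illustration; a real system would persist the log durably and update read models asynchronously.

```js
// Event-sourced inventory: the append-only event log is the source of truth.
// Event names and fields are hypothetical.
const eventLog = [];

function append(event) {
  eventLog.push({ ...event, seq: eventLog.length + 1 });
}

// Write side: current stock for one SKU is derived by replaying its events.
function stockFor(sku) {
  return eventLog
    .filter((e) => e.sku === sku)
    .reduce((qty, e) => {
      if (e.type === 'StockAdded') return qty + e.quantity;
      if (e.type === 'StockReserved') return qty - e.quantity;
      return qty; // ignore unknown event types
    }, 0);
}

// Command handler: validate against replayed state, then record an event.
function reserveStock(sku, quantity) {
  if (stockFor(sku) < quantity) throw new Error(`insufficient stock for ${sku}`);
  append({ type: 'StockReserved', sku, quantity });
}

// Read side (CQRS): a separate projection shaped for queries, rebuilt from the log.
function buildStockView() {
  const view = {};
  for (const e of eventLog) {
    let delta = 0;
    if (e.type === 'StockAdded') delta = e.quantity;
    if (e.type === 'StockReserved') delta = -e.quantity;
    view[e.sku] = (view[e.sku] || 0) + delta;
  }
  return view;
}

append({ type: 'StockAdded', sku: 'widget', quantity: 10 });
reserveStock('widget', 3);
console.log(buildStockView()); // { widget: 7 }
```

The command path never mutates stock directly; it only appends events. That is what makes the state auditable and reconstructable, at the cost of the extra replay and projection logic.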
Inter-service communication: sync vs async
- Synchronous (HTTP/REST, gRPC):
  - Use for low-latency, request/response interactions where immediate consistency matters.
  - Danger: chatty calls and cascades; add timeouts, retries, circuit breakers.
- Asynchronous (Kafka, RabbitMQ, NATS):
  - Prefer for decoupling, higher resilience, and scalability.
  - Enables eventual consistency.
- Hybrid: use sync for simple queries, async for state changes; consider BFFs for aggregations.
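When a synchronous call is unavoidable, a circuit breaker limits the damage of a failing dependency. The class below is a minimal sketch, not a production library: the name `CircuitBreaker`, its options, and the injectable `now` clock are assumptions for illustration.

```js
// Minimal circuit breaker: after `failureThreshold` consecutive failures the
// circuit opens and calls fail fast until `resetMs` has elapsed.
class CircuitBreaker {
  constructor(fn, { failureThreshold = 3, resetMs = 5000, now = Date.now } = {}) {
    this.fn = fn;
    this.failureThreshold = failureThreshold;
    this.resetMs = resetMs;
    this.now = now; // injectable clock, handy for tests
    this.failures = 0;
    this.openedAt = null;
  }

  async call(...args) {
    if (this.openedAt !== null && this.now() - this.openedAt < this.resetMs) {
      throw new Error('circuit open: failing fast');
    }
    // Either the circuit is closed, or the reset window elapsed and this is a trial call.
    try {
      const result = await this.fn(...args);
      this.failures = 0;
      this.openedAt = null;
      return result;
    } catch (err) {
      this.failures += 1;
      if (this.failures >= this.failureThreshold) this.openedAt = this.now();
      throw err;
    }
  }
}

// Usage: wrap a (hypothetical, stubbed) downstream call.
const fetchStock = new CircuitBreaker(async (sku) => ({ sku, available: 5 }));
fetchStock.call('widget').then(console.log);
```

Failing fast keeps request threads and connection pools from piling up behind a dead dependency, which is how one slow service turns into a cascading failure.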
Idempotency, retries, and error-handling
- Design idempotent operations (idempotency keys).
- Implement well-defined retry policies and exponential backoff.
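- Example idempotency-key handling (Express.js sketch):
```js
const express = require('express');

const app = express();
app.use(express.json()); // parse JSON bodies so req.body is populated

// Route handler that deduplicates requests by Idempotency-Key.
// In production, use Redis or a database instead of an in-memory Map.
// Note: concurrent requests with the same key can still race here;
// a real store needs an atomic set-if-absent.
const idempotencyStore = new Map();

app.post('/payments', async (req, res) => {
  const idemKey = req.header('Idempotency-Key');
  if (!idemKey) {
    return res.status(400).send({ error: 'Idempotency-Key required' });
  }

  // Replay: return the stored result instead of charging again.
  if (idempotencyStore.has(idemKey)) {
    return res.status(200).send(idempotencyStore.get(idemKey));
  }

  try {
    // processPayment is assumed to be defined elsewhere; it may throw.
    const result = await processPayment(req.body);
    idempotencyStore.set(idemKey, result);
    return res.status(201).send(result);
  } catch (err) {
    // Nothing is stored on failure, so the client can retry with the same key.
    return res.status(500).send({ error: 'Payment failed' });
  }
});
```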
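The retry policy with exponential backoff mentioned above can be sketched on the client side as follows. The helper name `retryWithBackoff` and its options are assumptions for illustration; the jitter strategy shown (a random wait between zero and the current cap) is one common choice, not the only one.

```js
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// Retry `operation` up to `maxAttempts` times with exponential backoff and
// full jitter. Only safe when the operation is idempotent.
async function retryWithBackoff(operation, { maxAttempts = 4, baseMs = 100, maxMs = 2000 } = {}) {
  let lastError;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation(attempt);
    } catch (err) {
      lastError = err;
      if (attempt === maxAttempts) break;
      // The cap doubles each attempt: baseMs, 2*baseMs, 4*baseMs, ... up to maxMs.
      const cap = Math.min(maxMs, baseMs * 2 ** (attempt - 1));
      await sleep(Math.random() * cap); // full jitter: wait in [0, cap)
    }
  }
  throw lastError;
}

// Usage: a hypothetical operation that fails twice, then succeeds.
let calls = 0;
const flaky = async () => {
  calls += 1;
  if (calls < 3) throw new Error('transient failure');
  return 'ok';
};
retryWithBackoff(flaky, { baseMs: 10 }).then((result) => console.log(result, 'after', calls, 'calls'));
```

Retries and idempotency keys belong together: the client can resend the same request safely only because the server recognizes the key and refuses to apply the effect twice.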
Transactional patterns: sagas vs distributed transactions
- Two-phase commit (XA) is rarely a good fit for microservices: it holds locks across services and requires every participant to be available at commit time.
- Sagas:
  - Choreography: services publish and react to events. There is no central coordinator, but the implicit event dependencies can become hard to trace.
  - Orchestration: a central saga orchestrator coordinates the steps; this is easier to reason about, but it centralizes control.
  - Use durable workflow systems (Temporal, Netflix Conductor, Camunda) for reliability.
Simple saga example (pseudo-code for orchestration):
```pseudo
Orchestrator.start(orderPlacedEvent) {
  call PaymentService.charge(order);
  if success:
    call InventoryService.reserve(order);
    if success:
      call ShippingService.schedule(order);
      mark order complete
    else:
      call PaymentService.refund(order);
      mark order failed
  else:
    mark order failed
}
```
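The same orchestration can be written as a runnable sketch. Rather than nesting conditionals, this version generalizes the idea: each step carries its own compensation, and on failure the orchestrator undoes the completed steps in reverse order. The step names and the in-memory stubs stand in for real service clients and are assumptions for illustration; a durable workflow engine would also persist progress between steps, which this sketch does not.

```js
// Orchestrated saga: run steps in order; on failure, run the compensations of
// the steps that already succeeded, in reverse order.
async function runSaga(steps, context) {
  const completed = [];
  for (const step of steps) {
    try {
      await step.action(context);
      completed.push(step);
    } catch (err) {
      for (const done of completed.reverse()) {
        await done.compensate(context); // a real system must persist and retry this
      }
      return { status: 'failed', failedStep: step.name, reason: err.message };
    }
  }
  return { status: 'completed' };
}

// Hypothetical service clients, stubbed in memory; each records what it did.
const orderSaga = [
  {
    name: 'charge',
    action: async (o) => { o.log.push(`charged ${o.id}`); },
    compensate: async (o) => { o.log.push(`refunded ${o.id}`); },
  },
  {
    name: 'reserve',
    action: async (o) => {
      if (o.qty > 5) throw new Error('out of stock');
      o.log.push(`reserved ${o.id}`);
    },
    compensate: async (o) => { o.log.push(`released ${o.id}`); },
  },
  {
    name: 'ship',
    action: async (o) => { o.log.push(`shipping scheduled ${o.id}`); },
    compensate: async () => {},
  },
];

const order = { id: 'order-1', qty: 10, log: [] };
runSaga(orderSaga, order).then((result) => console.log(result, order.log));
```

With `qty: 10` the reserve step fails, so the run ends with the payment refunded and the saga reported as failed at `reserve`. Unlike the pseudo-code, this version also releases the inventory reservation if the shipping step fails.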
Contracts, testing, and CI/CD
- Contract testing: ensures the producer and consumer agree on API schemas.
  - Tools: Pact, Spring Cloud Contract
  - Consumer-driven contract testing: consumers define expectations; providers verify.
- Testing Pyramid for microservices:
  - Unit tests for service internals
  - Contract tests for inter-service APIs
  - Component/integration tests for service boundaries
  - End-to-end tests: use sparingly, on realistic environments...
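The idea behind consumer-driven contracts can be shown without any tooling. The sketch below is a hand-rolled illustration and deliberately does not use Pact's or Spring Cloud Contract's APIs; the contract shape, the endpoint, and the function name `verifyContract` are all assumptions.

```js
// Hand-rolled illustration of a consumer-driven contract (not Pact's API).
// The consumer publishes the response fields and types it relies on.
const orderConsumerContract = {
  endpoint: 'GET /inventory/:sku',
  responseFields: { sku: 'string', available: 'number' },
};

// Provider-side verification: returns a list of violations (empty if compatible).
function verifyContract(contract, actualResponse) {
  const violations = [];
  for (const [field, type] of Object.entries(contract.responseFields)) {
    if (!(field in actualResponse)) {
      violations.push(`missing field: ${field}`);
    } else if (typeof actualResponse[field] !== type) {
      violations.push(`wrong type for ${field}: expected ${type}`);
    }
  }
  return violations;
}

// Extra fields are fine: the provider may add data without breaking consumers.
console.log(verifyContract(orderConsumerContract, { sku: 'widget', available: 7, warehouse: 'eu-1' }));
// Renaming a field the consumer depends on is caught before deployment.
console.log(verifyContract(orderConsumerContract, { sku: 'widget', qty: 7 }));
```

Running the provider's verification in its own CI pipeline is what lets the provider deploy independently: it learns about a breaking change from a failed build, not from a coordinated release.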