In a monolithic system—where you likely have a single database—transaction handling can be straightforward: a single @Transactional
boundary often suffices to guarantee atomicity (all-or-nothing). However, once you split your application into multiple microservices, each with its own local database or data store, old assumptions break down. Atomic updates across services become much harder, and the standard “just wrap everything in a single ACID transaction” approach no longer fits.
This article takes a deeper look at why MSA changes transaction management, beginning with the CAP theorem and network partitions, exploring 2PC’s limitations, and ending with the concept of eventual consistency. Later parts will cover SAGA, Event Sourcing, CQRS, and Outbox more thoroughly.
1) CAP Theorem & Network Partitions
1.1) The CAP Theorem in a Nutshell
- CAP stands for Consistency, Availability, and Partition tolerance. A well-known result in distributed systems states you can only strongly guarantee two of these three at any given time.
- Why it matters: In a distributed environment (which MSA essentially is), you cannot both keep full ACID consistency (every node always up to date) and have high availability if you must also handle network partitions (disconnections, slow links).
1.2) Microservices & Network Partition Realities
- Network partitions are not just theoretical; any microservice can go offline or become unreachable if a link goes down.
- If you attempt a single, globally consistent transaction across multiple DBs, you effectively require all nodes to be online and in sync. When any node is partitioned, the entire transaction might stall or fail.
Conclusion so far: In an MSA, the monolithic assumption of “always-on, single DB” is gone. You face partial failures and network splits. This leads many teams to sacrifice immediate global consistency for availability and eventual data sync.
2) The Limitations of 2PC (Two-Phase Commit)
2.1) Classic XA or 2PC
- Older enterprise systems used 2PC to coordinate commits across multiple resources (e.g., DB + message queue). A transaction manager instructed all participants to “prepare,” and if everyone said “OK,” it then told them all to commit.
- This approach was somewhat feasible in a “big monolith on a single app server” with resources in the same environment.
2.2) Why 2PC Collides with MSA Principals
- Performance Bottlenecks:
- Each resource must hold locks or remain in “prepared” mode. The entire distributed transaction cannot finalize until the slowest participant is ready. Under MSA, you might have dozens of microservices, each with different performance or uptime constraints, making 2PC extremely slow or deadlock-prone.
- Complex Failure Handling:
- If the transaction coordinator fails mid-commit, the system can get stuck in an indeterminate state. For MSA, you want services to be independently deployable and resilient, but 2PC demands a single controlling entity that can block everything if it fails.
- Lack of Scalability:
- Microservices are typically loosely coupled, enabling teams to scale or update them independently. 2PC reintroduces a high degree of coupling.
Conclusion: 2PC is widely considered incompatible with microservices architectures that prize scalability, resilience, and agility.
3) Embracing “Eventual Consistency”
3.1) What Is Eventual Consistency?
- Rather than forcing all changes across services to commit simultaneously, each service commits locally (“ACID” within its own DB) and notifies others (e.g., via events or an orchestrator). Over time, each service’s data converges to a consistent state.
- Trade-off: At any given moment, some service’s data might be slightly outdated or missing a recent update. But after a short time, they align.
3.2) Local Transactions in Each Service
- While the global data might only be “eventually consistent,” each microservice still strictly uses
@Transactional
or local DB commits to remain consistent within its own boundary. - For example, OrderService never partially updates an order record; it either commits or rolls back. Meanwhile, other microservices (e.g., InventoryService, PaymentService) do the same for their respective DBs.
4) Distributed Transaction Alternatives: A Preview
4.1) SAGA Pattern
- Local transaction in each service: do your piece of the workflow.
- On failure, a compensation or “undo” transaction is triggered for previously successful steps.
- Choreography or Orchestration handles the overall flow.
- Upside: Lightweight, no central transaction manager.
- Downside: Complex “undo” logic for each step.
4.2) Event Sourcing
- Instead of storing just current states, each service logs “events” that describe changes.
- Other services subscribe to these events to update their own states.
- Perfect for full history tracking but can be quite complex to implement (event schema evolution, huge logs).
4.3) CQRS
- Splits “write commands” (which produce events) from “read queries” (which read from a specialized DB optimized for queries).
- Often used with Event Sourcing to handle large-scale read/write separation.
4.4) Outbox Pattern
- Ensures that local DB changes and event publishing to external services remain atomic.
- Writes an event record to an “outbox” table in the same local transaction, then a separate process publishes it to a message broker.
- Avoids the “DB commit success, but event publish fails” mismatch.
5) Why Microservices Necessitate a Shift in Thinking
- Decentralization:
- Each service is an independent domain; you can’t rely on a single global transaction spanned across multiple DBs.
- Network Instability:
- If you tried to do one giant 2PC, a small slowdown or partition in any microservice could block everyone else.
- Loosely Coupled Deployments:
- MSA encourages services to scale, update, or even reboot independently. “Locking them all in one synchronous commit” is the opposite of that philosophy.
So the shift from “strict ACID across the entire system” to “local ACID + eventual system-wide consistency” is almost inevitable when adopting MSA.
Summary
- MSA makes you confront network partition reality, so a single global transaction is not feasible or beneficial at scale.
- 2PC (XA) is typically too slow and too fragile for microservices.
- Therefore, each service invests in robust local transactions, plus an asynchronous or compensating mechanism for cross-service data alignment.
Moving forward, we’ll delve into the practical solutions that handle these distributed transaction challenges:
- SAGA: Using local commits + compensation steps.
- Event Sourcing & CQRS: Embracing event streams for state changes and read/write segregation.
- Outbox: Ensuring event publication and DB updates remain atomic in your local transactions.
This shift from monolithic ACID everywhere to local ACID + eventual consistency is fundamental to building large-scale microservices that stay agile, scalable, and resilient under real-world conditions.