Reactive Streams: Backpressure by Contract
Introduction: Why This Exists at All
Most services spend more time waiting than computing. Database calls, HTTP calls, queues, file I/O — that’s where your latency budget goes. In practice, Reactive Streams backpressure is how you stop that waiting from turning into thread inflation and unpredictable queues.
If your system spends most of its time waiting on I/O (databases, APIs, disks), burning one thread per request is wasteful. And if producers outrun consumers, queues balloon until your node topples over. Reactive Streams solves both with one idea: make demand explicit. Consumers ask for n items; producers must not send more than requested. That’s backpressure as a contract, not an afterthought.
This post is a practical tour of that contract and the small set of patterns that keep services stable under load. No frameworks required—just the model itself and idiomatic Reactor code.
The Contract in 60 Seconds
Reactive Streams defines four types: Publisher, Subscriber, Subscription, and (rarely used directly) Processor. Key rules:
- A Subscriber signals demand via Subscription.request(n).
- A Publisher must not emit more than requested.
- Exactly one terminal signal: onError or onComplete.
- cancel() stops the flow and releases resources.
Reactor implements this model with two core types:
- Flux<T>: 0…N elements
- Mono<T>: 0…1 element (a specialization of Publisher)
The value of the contract is simple: no more surprise floods. If something downstream slows down, demand falls, and the flood subsides.
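To make the handshake concrete, here is a minimal sketch using Reactor's BaseSubscriber, which lets you drive request(n) by hand (process is a hypothetical per-element handler):
import org.reactivestreams.Subscription;
import reactor.core.publisher.BaseSubscriber;
import reactor.core.publisher.Flux;

Flux.range(1, 10).subscribe(new BaseSubscriber<Integer>() {
    @Override
    protected void hookOnSubscribe(Subscription subscription) {
        request(1); // initial demand: exactly one element
    }

    @Override
    protected void hookOnNext(Integer value) {
        process(value); // hypothetical per-element work
        request(1);     // pull the next element only when ready for it
    }
});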
Cold vs. Hot: Don’t Confuse Recording with Broadcasting
Cold sequences run per subscriber (think: “play this recording for me”).
Hot sequences broadcast to all listeners (think: “what’s on right now”). You’ll write both.
Cold (per-subscriber)
Flux<Integer> numbers = Flux.range(1, 5); // new run for each subscriber
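Each subscription triggers a fresh run of the generator:
numbers.subscribe(i -> log.info("sub1: {}", i)); // emits 1..5
numbers.subscribe(i -> log.info("sub2: {}", i)); // emits 1..5 again, independently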
Hot (fan‑out to many) — use Sinks, not legacy Processors:
import reactor.core.publisher.BufferOverflowStrategy;
import reactor.core.publisher.Sinks;

Sinks.Many<String> bus = Sinks.many()
        .multicast()
        .onBackpressureBuffer(10_000); // bounded: the sink builder takes a size only

// Publishing
bus.tryEmitNext("event-1");

// Subscribing (shared stream); the overflow policy lives on the Flux, not the sink builder
Flux<String> stream = bus.asFlux()
        .onBackpressureBuffer(10_000,
                msg -> log.warn("Dropped message: {}", msg),
                BufferOverflowStrategy.DROP_OLDEST); // bounded + policy
Reactive Streams Backpressure: Your Toolkit
Backpressure isn’t one operator. It’s a few levers you combine knowingly.
Batch upstream requests: limitRate
Reduce upstream request(n) chatter and pull in controlled batches.
Flux<Data> data = repo.stream();          // Publisher you don't control
Flux<Data> batched = data.limitRate(256); // request 256, replenish as consumed
limitRate(highTide, lowTide) lets you fine‑tune replenishment (e.g., 1024 / 768).
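A quick sketch of the two-argument form (1024/768 are illustrative numbers, not a recommendation):
Flux<Data> tuned = data.limitRate(1024, 768); // initial request of 1024, then replenish in lowTide-sized batches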
Bound the buffer: onBackpressureBuffer(capacity, …, DROP_OLDEST|ERROR)
If downstream pauses, do not buffer unboundedly.
Flux<Event> safe =
fastProducer()
.onBackpressureBuffer(
1_000,
ev -> metrics.counter("overflow.dropped").increment(),
BufferOverflowStrategy.DROP_OLDEST);
Drop or sample when that’s OK: onBackpressureDrop, sample (Reactor’s equivalent of RxJava’s throttleLast)
Telemetry, mouse moves, and rapidly changing dashboards often prefer the newest value over a complete history.
Flux<Metric> latest = metricsStream()
.onBackpressureDrop(m -> log.debug("drop {}", m))
.sample(Duration.ofMillis(250));
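When only the most recent value matters, onBackpressureLatest is even simpler: while downstream is busy, it keeps just the newest element and discards the rest.
Flux<Metric> newestOnly = metricsStream().onBackpressureLatest();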
Control fan‑out work: flatMap with concurrency
Bound parallelism against external services.
Flux<Result> calls = ids.flatMap(
id -> api.fetch(id), // returns Mono<Result>
32 // at most 32 concurrent in-flight calls
);
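If results must come back in input order, flatMapSequential bounds concurrency the same way while preserving ordering:
Flux<Result> ordered = ids.flatMapSequential(
    id -> api.fetch(id), // same Mono-returning call as above
    32                   // same concurrency cap; output follows input order
);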
Draw a boundary for blocking: publishOn(boundedElastic())
If you must call blocking code, isolate it. Don’t poison the event loop.
Flux<Response> safeIo =
incoming
.publishOn(Schedulers.boundedElastic())
.map(id -> blockingDao.load(id)); // safely off event-loop
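For a one-off blocking call, a common companion sketch wraps the work in Mono.fromCallable and subscribes on the same scheduler (blockingDao and id as above):
Mono<Response> one =
    Mono.fromCallable(() -> blockingDao.load(id))  // blocking work, deferred
        .subscribeOn(Schedulers.boundedElastic()); // executes on the elastic pool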
Reactor is explicit about backpressure and non‑blocking; the reference guide covers these levers in detail.
A Minimal End‑to‑End Example (Cold → Hot, Bounded, Observable)
The goal: convert a cold source to a hot bus, enforce bounds, and surface metrics.
import reactor.core.publisher.Flux;
import reactor.core.publisher.Sinks;
import reactor.util.concurrent.Queues;
class Notifications {
private final Sinks.Many<Notification> sink = Sinks.many()
        .multicast()
        .onBackpressureBuffer(10_000); // bounded; overflow surfaces as EmitResult.FAIL_OVERFLOW
Flux<Notification> stream() {
return sink.asFlux()
.limitRate(Queues.SMALL_BUFFER_SIZE) // batch upstream requests
.doOnCancel(() -> log.debug("client disconnected"))
.doOnSubscribe(s -> log.debug("client connected"));
}
void publish(Notification n) {
Sinks.EmitResult r = sink.tryEmitNext(n);
if (!r.isSuccess()) {
log.warn("emit failed: {}", r);
metrics.counter("emit.fail", "reason", r.name()).increment();
}
}
}
- Sinks + bounded buffer: no unbounded queues.
- limitRate: polite upstream pull.
- tryEmitNext: explicit error handling for backpressure conditions.
What About Testing? Make Demand Visible
Use StepVerifier to assert both values and demand choreography.
import reactor.test.StepVerifier;
StepVerifier.create(source.limitRate(4), 0) // start with zero initial demand
.thenRequest(2) // request 2
.expectNext(A, B)
.thenRequest(2) // request 2 more
.expectNext(C, D)
.thenCancel()
.verify();
And in CI, enable BlockHound to catch accidental blocking on the event loop.
@BeforeAll
static void installBlockHound() {
reactor.blockhound.BlockHound.install(); // dev/test only
}
(BlockHound is a small agent that flags blocking calls inside non‑blocking threads.)
Hot vs. Replay: “Did I Miss Anything?”
If late subscribers must see recent items, use a replay sink—bounded.
Sinks.Many<Event> replay = Sinks.many().replay().limit(100);

// publish
replay.tryEmitNext(event);

// subscribers get up to the last 100 events, then live ones
Flux<Event> withHistory = replay.asFlux();
This replays the last 100 events to newcomers without going unbounded. Replay history is held in memory, so keep it tight and keep the events small.
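A time-bounded variant exists too; a sketch keeping only the last 30 seconds of history:
Sinks.Many<Event> recent = Sinks.many().replay().limit(Duration.ofSeconds(30));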
JDK 9 Flow Interop (When You Need It)
Reactive Streams and JDK 9 Flow are conceptually aligned. Reactor ships adapters so you can bridge Flow.Publisher ↔︎ Publisher cleanly:
import reactor.adapter.JdkFlowAdapter;
import java.util.concurrent.Flow;

Flow.Publisher<T> jdkPub = JdkFlowAdapter.publisherToFlowPublisher(reactorFlux); // the adapter takes a Flux<T>
Handy when a library exposes Flow.* types.
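The reverse direction works the same way; a sketch assuming jdkSource is a Flow.Publisher<T> handed to you by a library:
Flux<T> asFlux = JdkFlowAdapter.flowPublisherToFlux(jdkSource);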
Production Checklist (Copy/Paste)
- Bound everything: buffers (onBackpressureBuffer(capacity, …, policy)), replay (.limit(n)), concurrency (flatMap(..., concurrency)), batch size (limitRate).
- Isolate blocking: publishOn(boundedElastic()); never block the event loop.
- Hot pipes: use Sinks; avoid legacy processors.
- Observability: sprinkle doOn* and log(); add request/latency metrics per stage.
- Load test demand: StepVerifier for choreography; run synthetic subscribers at scale.
- Failure policy: pick drop vs. buffer intentionally per stream; document why.
Common Pitfalls (and the Fix)
- Unbounded queues → Fix: always set buffer caps; prefer DROP_OLDEST for live feeds.
- Accidental blocking → Fix: push adapters (JDBC, SDKs) behind boundedElastic or migrate to async drivers.
- Too many tiny requests → Fix: limitRate with a reasonable high/low tide to reduce upstream chatter.
- Fan‑out storms → Fix: cap flatMap concurrency and consider backpressure‑aware batching (sketch below).
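A minimal batching sketch, assuming a hypothetical api.fetchBatch that accepts a list of ids and returns Mono<List<Result>>:
Flux<Result> results = ids
    .buffer(50)                                 // group ids into batches of 50
    .flatMap(batch -> api.fetchBatch(batch), 8) // at most 8 batches in flight
    .flatMapIterable(list -> list);             // flatten back to individual results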
Closing: The Payoff
Reactive Streams doesn’t promise magic throughput. It promises predictable pressure. With demand made explicit, you trade mystery outages for visible, tunable limits. In practice that means fewer runaway queues, fewer GC spikes, and saner failure modes—the things that actually keep a service up.
