# Solving the Double-Booking Problem: Distributed Locking, Idempotency, and Redis
How I built a ticket booking system that handles concurrent seat reservations using Redis distributed locks, PostgreSQL pessimistic locking, idempotency keys, and compensating transactions.
## The Problem: Two People, One Seat
Imagine 500 people trying to book the same concert seat at the same moment. Without proper concurrency control, multiple users can pass the "is this seat available?" check before any of them actually reserve it. The result: duplicate bookings, double charges, and a customer support nightmare.
This isn't a theoretical concern. I've seen race conditions in production booking systems that caused exactly this. So I built a ticket booking system from scratch to explore three different approaches to solving it — from intentionally broken to production-ready.
## Three Approaches, One Problem
The system implements three booking strategies side by side, so you can see exactly where each one breaks or holds:
| Approach | Mechanism | Correctness | Avg Response | Scales Horizontally? |
|---|---|---|---|---|
| Naive | No locking | Broken | ~100ms | N/A |
| Pessimistic | SELECT FOR UPDATE | Correct | ~500ms | No |
| Distributed | Redis SET NX | Correct | ~150ms | Yes |
Let's walk through each one.
## The Naive Approach: How Race Conditions Happen
The naive implementation checks if a seat is available, then reserves it. The problem is the gap between check and update:
```
Request A: Checks seat → Available
Request B: Checks seat → Available (A hasn't written yet!)
Request C: Checks seat → Available (Neither A nor B has written!)
Request A: Updates seat → Reserved
Request B: Updates seat → Reserved (Overwrites A!)
Request C: Updates seat → Reserved (Overwrites B!)
```
Three users all get confirmation. One seat exists. The test script fires 10 concurrent requests at the same seat and consistently produces race conditions:
```
RACE CONDITION DETECTED!
3 users successfully "reserved" the SAME seat!
```
This is the baseline — the thing every other approach exists to prevent.
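The failure mode is easy to reproduce without any infrastructure. Here's a minimal in-memory sketch (all names hypothetical, not from the real codebase) where the `await` between the check and the write stands in for the database round-trip, which is exactly the gap concurrent requests fall into:

```typescript
// Hypothetical in-memory model of the naive flow. The await between the
// check and the write stands in for a real database round-trip.
type SeatStatus = 'AVAILABLE' | 'RESERVED';

const seats = new Map<string, SeatStatus>([['A1', 'AVAILABLE']]);

const sleep = (ms: number) => new Promise<void>((r) => setTimeout(r, ms));

async function naiveReserve(seatId: string): Promise<boolean> {
  if (seats.get(seatId) !== 'AVAILABLE') return false; // 1. check
  await sleep(10); // 2. simulated I/O gap where requests interleave
  seats.set(seatId, 'RESERVED'); // 3. update
  return true; // caller gets a confirmation
}

export async function demoRace(): Promise<number> {
  // Three "users" fire at the same seat concurrently.
  const results = await Promise.all([
    naiveReserve('A1'),
    naiveReserve('A1'),
    naiveReserve('A1'),
  ]);
  return results.filter(Boolean).length; // how many got a confirmation
}
```

All three calls pass the availability check before any write lands, so every one of them returns a confirmation for the same seat.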
## Pessimistic Locking: Let the Database Handle It
The first correct approach uses PostgreSQL's SELECT ... FOR UPDATE. When a transaction locks a row, every other transaction trying to lock the same row blocks until the first one commits or rolls back:
```sql
SELECT * FROM "seats"
WHERE "id" = $1
  AND "eventId" = $2
FOR UPDATE
```

The timeline becomes sequential:
```
Request A: SELECT FOR UPDATE → Gets lock, proceeds
Request B: SELECT FOR UPDATE → BLOCKS (waiting for A)
Request A: UPDATE, COMMIT → Releases lock
Request B: Gets lock, checks status → ALREADY RESERVED → Returns error
```
One critical detail: lock ordering. If User A locks seats [1, 2] and User B locks seats [2, 1], they deadlock. The fix is sorting seat IDs before acquiring locks:
```typescript
const sortedSeatIds = [...seatIds].sort();
```

This works, but it has a fundamental limitation. Every request queues behind the previous one. Under load, response times climb because requests are blocking, not failing fast. And since the locks live in PostgreSQL, you can't scale horizontally across multiple database replicas.
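To see why the ordering matters: sorting gives every request the same global acquisition order, so no two transactions can each hold a lock the other is waiting for. A tiny sketch (`lockOrder` is illustrative):

```typescript
// Sorting normalizes the acquisition order across all requests, which
// removes the circular-wait condition that deadlocks require.
function lockOrder(seatIds: string[]): string[] {
  return [...seatIds].sort(); // lexicographic order is fine for string IDs
}

// User A asks for [2, 1]; User B asks for [1, 2]. After sorting, both
// acquire in the identical sequence, so one simply waits behind the other.
const requestA = lockOrder(['seat-2', 'seat-1']);
const requestB = lockOrder(['seat-1', 'seat-2']);
```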
## Distributed Locking with Redis: The Production Approach
Redis distributed locks solve both problems. They're non-blocking (fail fast if the lock is taken) and work across multiple application servers.
### Acquiring a Lock
The core primitive is Redis SET key value NX PX ttl — set a key only if it doesn't exist, with a millisecond expiry:
```typescript
export async function acquireLock(
  key: string,
  options: LockOptions = {}
): Promise<AcquiredLock | null> {
  const { ttlSeconds = 10, retryCount = 0, retryDelayMs = 100 } = options;
  const value = uuidv4(); // Unique owner identifier
  const ttlMs = ttlSeconds * 1000;

  for (let attempt = 0; attempt <= retryCount; attempt++) {
    const result = await redis.set(key, value, 'PX', ttlMs, 'NX');
    if (result === 'OK') {
      return { key, value }; // Lock acquired
    }
    if (attempt < retryCount) {
      await sleep(retryDelayMs);
    }
  }
  return null; // Lock not acquired
}
```

The value is a UUID — the lock owner's identity. This matters when releasing.
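To make the `SET ... NX PX` semantics concrete without a Redis server, here's a toy in-memory stand-in (the `memSetNX` name and `Map` store are illustrative only, not part of the real code):

```typescript
// Toy stand-in for Redis SET key value NX PX over a Map.
interface Entry {
  value: string;
  expiresAt: number; // epoch ms, mimics the PX expiry
}

const store = new Map<string, Entry>();

function memSetNX(key: string, value: string, ttlMs: number): 'OK' | null {
  const existing = store.get(key);
  if (existing && existing.expiresAt > Date.now()) {
    return null; // NX: key already exists and hasn't expired
  }
  store.set(key, { value, expiresAt: Date.now() + ttlMs });
  return 'OK';
}

const first = memSetNX('lock:seat:A1', 'owner-1', 10_000);
const second = memSetNX('lock:seat:A1', 'owner-2', 10_000);
// first is 'OK', second is null: only one owner at a time
```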
### Releasing a Lock Safely
You can't just `DEL` the key. Consider this scenario:

- Process A acquires lock with value `"A123"`
- Process A takes too long — lock expires via TTL
- Process B acquires lock with value `"B456"`
- Process A finishes and calls `DEL`
- Process A just deleted Process B's lock
The fix is a Lua script that atomically checks ownership before deleting:
```lua
if redis.call('get', KEYS[1]) == ARGV[1] then
  return redis.call('del', KEYS[1])
else
  return 0
end
```

This runs atomically on the Redis server — no race condition between the check and the delete.
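The same check-then-delete logic can be modeled over a plain `Map`. Single-threaded JavaScript gives us atomicity for free here; against a real Redis server the Lua script is what makes it atomic. Names are illustrative:

```typescript
// The Lua script's ownership check, modeled over a Map of key -> owner value.
const locks = new Map<string, string>();

function safeRelease(key: string, owner: string): number {
  if (locks.get(key) === owner) {
    locks.delete(key);
    return 1; // deleted, mirroring the script's DEL return value
  }
  return 0; // held by someone else (our lock expired earlier), leave it alone
}

// Replay the scenario from the list above:
locks.set('lock:seat:A1', 'B456'); // B holds the lock after A's TTL expired
const staleRelease = safeRelease('lock:seat:A1', 'A123'); // A's late DEL: 0
const ownRelease = safeRelease('lock:seat:A1', 'B456');   // B's release: 1
```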
### Multi-Seat Locking
When a user reserves multiple seats, all locks must be acquired atomically. If any lock fails, all previously acquired locks are released:
```typescript
export async function acquireMultipleLocks(
  keys: string[],
  options: LockOptions = {}
): Promise<AcquiredLock[] | null> {
  const sortedKeys = [...keys].sort(); // Prevent deadlocks
  const acquired: AcquiredLock[] = [];

  for (const key of sortedKeys) {
    const lock = await acquireLock(key, options);
    if (!lock) {
      // Release all previously acquired locks
      await Promise.all(
        acquired.map(l => releaseLock(l.key, l.value))
      );
      return null;
    }
    acquired.push(lock);
  }
  return acquired;
}
```

Same deadlock prevention as the pessimistic approach — sort the keys first.
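The all-or-nothing behavior can be sketched over a simple in-memory lock set (names hypothetical):

```typescript
// All-or-nothing acquisition: if any lock fails, roll back the ones we got.
const held = new Set<string>();

function tryLock(key: string): boolean {
  if (held.has(key)) return false;
  held.add(key);
  return true;
}

function acquireAll(keys: string[]): string[] | null {
  const sorted = [...keys].sort(); // same deadlock-avoidance ordering
  const acquired: string[] = [];
  for (const key of sorted) {
    if (!tryLock(key)) {
      acquired.forEach((k) => held.delete(k)); // roll back partial progress
      return null;
    }
    acquired.push(key);
  }
  return acquired;
}

held.add('seat:2'); // someone else already holds seat 2
const result = acquireAll(['seat:1', 'seat:2', 'seat:3']); // null
// seat:1 was acquired, then released again; only seat:2 remains held
```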
### The Full Reservation Flow
The distributed booking service combines Redis locks with database transactions for defense in depth:
- Acquire distributed locks for all requested seats
- Verify availability inside a database transaction (the lock prevents races, but the DB is the source of truth)
- Update seat status to `RESERVED` with a version increment
- Schedule cleanup via BullMQ (in case the user abandons the reservation)
- Release locks in a `finally` block — always, even on error
```typescript
const SEAT_LOCK_OPTIONS: LockOptions = {
  ttlSeconds: 30,
  retryCount: 3,
  retryDelayMs: 100,
};
```

The lock TTL is 30 seconds — long enough for the database transaction, short enough to recover from a crashed process.
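The overall shape (acquire, do transactional work, always release in `finally`) condenses to something like this in-memory sketch; `withSeatLocks` and the lock table are illustrative, not the real service API:

```typescript
// Illustrative in-memory sketch of the acquire / work / finally-release flow.
const lockTable = new Set<string>();

async function withSeatLocks<T>(
  seatIds: string[],
  criticalSection: () => Promise<T>
): Promise<T> {
  const keys = seatIds.map((id) => `lock:seat:${id}`).sort();
  for (const key of keys) {
    if (lockTable.has(key)) throw new Error('Seat is being reserved');
    lockTable.add(key);
  }
  try {
    // The database transaction (verify, mark RESERVED, bump version,
    // schedule cleanup) would run here.
    return await criticalSection();
  } finally {
    keys.forEach((key) => lockTable.delete(key)); // always release
  }
}
```

Even when the critical section throws, the `finally` block clears every lock, so a failed transaction can't strand a seat behind a lock that only the TTL would eventually clean up.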
## Idempotency: Handling Duplicate Requests
Network failures cause retries. A user's browser might send the same booking request twice. Without idempotency, you'd create two bookings and charge twice.
The solution: the client generates a unique idempotencyKey per booking attempt and sends it with every retry of that same attempt:
```typescript
// Client generates once, sends on every retry
const idempotencyKey = crypto.randomUUID();
```

```
POST /api/bookings
{
  "reservationIds": ["..."],
  "idempotencyKey": "abc-123"
}
```

On the server, the check happens before acquiring any locks:
```typescript
const existingBooking = await prisma.booking.findUnique({
  where: { idempotencyKey },
});

if (existingBooking) {
  return existingBooking; // Safe retry — no duplicate created
}
```

The `idempotencyKey` column has a unique index, so even if two identical requests race past the check, the database constraint prevents a duplicate insert. The second request gets a conflict error and can retry, hitting the `findUnique` path.
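The check-then-insert-then-recover sequence can be modeled in memory, with a `Map` standing in for the unique index and a thrown error standing in for the constraint violation (all names hypothetical):

```typescript
// The Map plays the unique index; the throw plays the constraint violation.
interface Booking {
  id: number;
  idempotencyKey: string;
}

const bookingsByKey = new Map<string, Booking>();
let nextId = 1;

function insertBooking(idempotencyKey: string): Booking {
  if (bookingsByKey.has(idempotencyKey)) {
    throw new Error('unique constraint violation');
  }
  const booking = { id: nextId++, idempotencyKey };
  bookingsByKey.set(idempotencyKey, booking);
  return booking;
}

function createBookingIdempotent(idempotencyKey: string): Booking {
  const existing = bookingsByKey.get(idempotencyKey); // fast path: safe retry
  if (existing) return existing;
  try {
    return insertBooking(idempotencyKey);
  } catch {
    // Lost the race past the check: the winner's row exists now, return it.
    return bookingsByKey.get(idempotencyKey)!;
  }
}

const original = createBookingIdempotent('abc-123');
const retried = createBookingIdempotent('abc-123'); // same booking back
```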
## Reservation Expiry: Cleaning Up Abandoned Seats
When a user reserves a seat but never completes payment, the seat needs to go back to the pool. Two mechanisms handle this:
### Scheduled Jobs (Primary)
When a reservation is created, a BullMQ job is scheduled to fire at the expiry time (default: 10 minutes):
```typescript
export async function scheduleReservationCleanup(
  data: ReservationCleanupJobData
): Promise<Job> {
  const { expiresAt } = data;
  const delay = Math.max(0, expiresAt.getTime() - Date.now());

  return reservationCleanupQueue.add('cleanup', data, {
    delay,
    jobId: `cleanup:${data.reservationId}`, // Prevents duplicate jobs
  });
}
```

If the user completes the booking, the cleanup job is cancelled before it fires.
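The schedule-and-cancel pattern looks roughly like this, with `setTimeout` standing in for BullMQ and a `jobId`-keyed map mirroring its duplicate-job suppression (names illustrative):

```typescript
// setTimeout stands in for BullMQ's delayed jobs; the jobId-keyed map
// mirrors its duplicate-job suppression. Illustrative only.
const cleanupTimers = new Map<string, ReturnType<typeof setTimeout>>();

function scheduleCleanup(
  reservationId: string,
  expiresAt: Date,
  onExpire: () => void
): void {
  const jobId = `cleanup:${reservationId}`;
  if (cleanupTimers.has(jobId)) return; // duplicate job, ignore
  const delay = Math.max(0, expiresAt.getTime() - Date.now());
  const timer = setTimeout(() => {
    cleanupTimers.delete(jobId);
    onExpire();
  }, delay);
  cleanupTimers.set(jobId, timer);
}

function cancelCleanup(reservationId: string): void {
  const jobId = `cleanup:${reservationId}`;
  const timer = cleanupTimers.get(jobId);
  if (timer) {
    clearTimeout(timer);
    cleanupTimers.delete(jobId);
  }
}
```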
### Periodic Sweep (Fallback)
If the worker was down when a job was scheduled, or Redis lost the job, a periodic sweep catches anything that slipped through:
```typescript
// Runs every 60 seconds
const expiredReservations = await prisma.reservation.findMany({
  where: {
    status: 'ACTIVE',
    expiresAt: { lt: new Date() },
  },
  take: 100, // Limit per run
});
```

Both cleanup paths acquire a distributed lock before modifying the seat and double-check the reservation status — the seat might have been confirmed while waiting for the lock.
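The sweep's WHERE clause is simple enough to model over an in-memory array (names hypothetical), which also makes the `take: 100` bound explicit:

```typescript
// The sweep's filter: ACTIVE reservations whose expiry is in the past,
// capped at 100 per run like the take: 100 above.
interface Reservation {
  id: string;
  status: 'ACTIVE' | 'CONFIRMED' | 'EXPIRED';
  expiresAt: Date;
}

function findExpired(
  reservations: Reservation[],
  now: Date,
  limit = 100
): Reservation[] {
  return reservations
    .filter((r) => r.status === 'ACTIVE' && r.expiresAt < now)
    .slice(0, limit);
}

const now = new Date('2024-01-01T12:00:00Z');
const rows: Reservation[] = [
  { id: 'r1', status: 'ACTIVE', expiresAt: new Date('2024-01-01T11:50:00Z') },
  { id: 'r2', status: 'ACTIVE', expiresAt: new Date('2024-01-01T12:10:00Z') },
  { id: 'r3', status: 'CONFIRMED', expiresAt: new Date('2024-01-01T11:00:00Z') },
];
const expired = findExpired(rows, now); // only r1: active AND past expiry
```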
## Compensating Transactions: When Payment Succeeds but Booking Fails
In distributed systems, you can't always wrap everything in a single ACID transaction. What if Stripe charges the card but the database update fails?
The system handles this with compensating transactions:
```
1. Create booking with status PENDING
2. Create Stripe PaymentIntent
3. User pays → Stripe webhook fires
4. Update booking to CONFIRMED
   └── If this fails:
       → Refund payment automatically
       → Release reserved seats
       → Mark booking as FAILED
       → Create audit log entry
```
The customer is never charged for a failed booking. The audit log captures the full timeline for debugging.
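A stripped-down sketch of the compensation path, assuming payment has already succeeded when the confirm step runs (all names and statuses illustrative):

```typescript
// Illustrative compensation sketch: a confirm failure after a successful
// payment must undo each side effect and leave an audit trail.
type BookingStatus = 'PENDING' | 'CONFIRMED' | 'FAILED';

interface Outcome {
  status: BookingStatus;
  refunded: boolean;
  seatsReleased: boolean;
  audit: string[];
}

async function confirmWithCompensation(
  confirm: () => Promise<void>
): Promise<Outcome> {
  const audit: string[] = ['payment.succeeded'];
  try {
    await confirm(); // update booking to CONFIRMED
    audit.push('booking.confirmed');
    return { status: 'CONFIRMED', refunded: false, seatsReleased: false, audit };
  } catch {
    // Compensate in order: refund, release seats, mark FAILED, log it all.
    audit.push('booking.confirm_failed', 'payment.refunded', 'seats.released');
    return { status: 'FAILED', refunded: true, seatsReleased: true, audit };
  }
}
```

In the real system the catch block would call the Stripe refund API and the seat-release path; here boolean flags and an audit array stand in for those effects.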
## The Data Model
The schema separates the booking lifecycle into distinct entities:
- Seats have a `version` field for optimistic locking and a `status` enum (`AVAILABLE`, `RESERVED`, `BOOKED`, `BLOCKED`)
- Reservations are temporary holds with an `expiresAt` timestamp
- Bookings are confirmed purchases with an `idempotencyKey` (unique index) and payment tracking
- AuditLog records every state transition with before/after values
```prisma
model Seat {
  status        SeatStatus @default(AVAILABLE)
  version       Int        @default(0)
  reservedBy    String?
  reservedUntil DateTime?

  @@index([eventId, status])
  @@index([reservedUntil])
}

model Booking {
  idempotencyKey String @unique
  status         BookingStatus
  paymentStatus  PaymentStatus

  @@index([idempotencyKey])
}
```

The `@@index([reservedUntil])` index is specifically for the periodic cleanup query — without it, the sweep would table-scan on every run.
## What I Learned

### Fail fast beats blocking

The distributed approach returns errors in ~150ms when a seat is taken. The pessimistic approach blocks for ~500ms waiting for the lock. Users get faster feedback, and the system handles higher throughput.

### Defense in depth isn't optional

Redis locks prevent race conditions, but Redis can fail. The database transaction inside the lock provides a second layer of protection. Belt and suspenders.

### Idempotency keys belong on the client

The server can't generate idempotency keys — it doesn't know if two requests are retries of the same operation or two intentional operations. The client generates the key once and reuses it across retries.

### Background cleanup needs a fallback

Scheduled jobs handle the happy path. But workers go down, Redis loses data, jobs get stuck. The periodic sweep is a safety net that catches everything the scheduled jobs miss.

### Lua scripts are essential for Redis correctness

Any Redis operation that needs to check-then-act must use a Lua script. Without atomicity, you get the same race conditions you're trying to prevent.
## Try It Out
The full source code is on GitHub: jay-1799/seat-masters
Clone the repo, run `docker-compose up -d` to start PostgreSQL and Redis, then `npm run dev` for the app and `npm run workers` for background processing. The test scripts demonstrate the race conditions and prove the distributed locking works:

```bash
npm run test:race         # Watch the naive approach break
npm run test:distributed  # 5, 20, 50 concurrent users — exactly 1 wins
```