Jay Patel

Solving the Double-Booking Problem: Distributed Locking, Idempotency, and Redis

How I built a ticket booking system that handles concurrent seat reservations using Redis distributed locks, PostgreSQL pessimistic locking, idempotency keys, and compensating transactions.

Posted Apr 11, 2026 · 9 min read · Backend, Distributed Systems

The Problem: Two People, One Seat

Imagine 500 people trying to book the same concert seat at the same moment. Without proper concurrency control, multiple users can pass the "is this seat available?" check before any of them actually reserve it. The result: duplicate bookings, double charges, and a customer support nightmare.

This isn't a theoretical concern. I've seen race conditions in production booking systems that caused exactly this. So I built a ticket booking system from scratch to explore three different approaches to solving it — from intentionally broken to production-ready.

Three Approaches, One Problem

The system implements three booking strategies side by side, so you can see exactly where each one breaks or holds:

| Approach    | Mechanism         | Correctness | Avg Response | Scales Horizontally? |
|-------------|-------------------|-------------|--------------|----------------------|
| Naive       | No locking        | Broken      | ~100ms       | N/A                  |
| Pessimistic | SELECT FOR UPDATE | Correct     | ~500ms       | No                   |
| Distributed | Redis SET NX      | Correct     | ~150ms       | Yes                  |

Let's walk through each one.

The Naive Approach: How Race Conditions Happen

The naive implementation checks if a seat is available, then reserves it. The problem is the gap between check and update:

Request A: Checks seat → Available
Request B: Checks seat → Available    (A hasn't written yet!)
Request C: Checks seat → Available    (Neither A nor B have written!)
Request A: Updates seat → Reserved
Request B: Updates seat → Reserved    (Overwrites A!)
Request C: Updates seat → Reserved    (Overwrites B!)

Three users all get confirmation. One seat exists. The test script fires 10 concurrent requests at the same seat and consistently produces race conditions:

RACE CONDITION DETECTED!
3 users successfully "reserved" the SAME seat!

This is the baseline — the thing every other approach exists to prevent.
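The gap is easy to reproduce without a database. Here's a minimal TypeScript sketch, with an in-memory seat standing in for the seats row and an awaited sleep standing in for the round-trip between check and update (the names are illustrative, not from the real codebase):

```typescript
type Seat = { status: 'AVAILABLE' | 'RESERVED'; reservedBy?: string };

const seat: Seat = { status: 'AVAILABLE' };
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

// Naive check-then-update: the await between the check and the write
// is exactly where concurrent requests slip past each other.
async function naiveReserve(userId: string): Promise<boolean> {
  if (seat.status !== 'AVAILABLE') return false; // check
  await sleep(10);                               // simulated DB round-trip
  seat.status = 'RESERVED';                      // update (blindly overwrites)
  seat.reservedBy = userId;
  return true;
}

async function runNaiveDemo(): Promise<number> {
  const results = await Promise.all(
    ['A', 'B', 'C'].map((u) => naiveReserve(u))
  );
  return results.filter(Boolean).length; // how many users "won" the seat
}
```

All three requests pass the check before any of them writes, so all three get a confirmation for the same seat.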

Pessimistic Locking: Let the Database Handle It

The first correct approach uses PostgreSQL's SELECT ... FOR UPDATE. When a transaction locks a row, every other transaction trying to lock the same row blocks until the first one commits or rolls back:

SELECT * FROM "seats"
WHERE "id" = $1
AND "eventId" = $2
FOR UPDATE

The timeline becomes sequential:

Request A: SELECT FOR UPDATE → Gets lock, proceeds
Request B: SELECT FOR UPDATE → BLOCKS (waiting for A)
Request A: UPDATE, COMMIT → Releases lock
Request B: Gets lock, checks status → ALREADY RESERVED → Returns error

One critical detail: lock ordering. If User A locks seats [1, 2] and User B locks seats [2, 1], they deadlock. The fix is sorting seat IDs before acquiring locks:

const sortedSeatIds = [...seatIds].sort();

This works, but it has a fundamental limitation. Every request queues behind the previous one. Under load, response times climb because requests are blocking, not failing fast. And since the locks live in PostgreSQL, you can't scale horizontally across multiple database replicas.
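The blocking behavior can be sketched with a per-row FIFO mutex as a stand-in for PostgreSQL's row lock (a simulation only; the real blocking happens inside the database):

```typescript
// A minimal FIFO mutex standing in for a row lock on one seat.
class RowLock {
  private tail: Promise<void> = Promise.resolve();
  async run<T>(fn: () => Promise<T>): Promise<T> {
    const prev = this.tail;
    let release!: () => void;
    this.tail = new Promise<void>((r) => (release = r));
    await prev;                         // block until the previous holder commits
    try { return await fn(); } finally { release(); }
  }
}

const lock = new RowLock();
const seat = { status: 'AVAILABLE' as 'AVAILABLE' | 'RESERVED' };
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

async function pessimisticReserve(userId: string): Promise<boolean> {
  return lock.run(async () => {                    // SELECT ... FOR UPDATE
    if (seat.status !== 'AVAILABLE') return false; // ALREADY RESERVED
    await sleep(10);                               // work inside the transaction
    seat.status = 'RESERVED';                      // UPDATE; COMMIT releases lock
    return true;
  });
}

async function runPessimisticDemo(): Promise<number> {
  const results = await Promise.all(
    ['A', 'B', 'C'].map((u) => pessimisticReserve(u))
  );
  return results.filter(Boolean).length;
}
```

Exactly one request wins; the others queue, then see the seat is taken. That queuing is the cost the distributed approach avoids.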

Distributed Locking with Redis: The Production Approach

Redis distributed locks solve both problems. They're non-blocking (fail fast if the lock is taken) and work across multiple application servers.

Acquiring a Lock

The core primitive is Redis SET key value NX PX ttl — set a key only if it doesn't exist, with a millisecond expiry:

import Redis from 'ioredis';
import { v4 as uuidv4 } from 'uuid';

const redis = new Redis(); // connects to localhost:6379 by default
const sleep = (ms: number) => new Promise((r) => setTimeout(r, ms));

export async function acquireLock(
  key: string,
  options: LockOptions = {}
): Promise<AcquiredLock | null> {
  const { ttlSeconds = 10, retryCount = 0, retryDelayMs = 100 } = options;
  const value = uuidv4(); // Unique owner identifier
  const ttlMs = ttlSeconds * 1000;

  for (let attempt = 0; attempt <= retryCount; attempt++) {
    const result = await redis.set(key, value, 'PX', ttlMs, 'NX');
    if (result === 'OK') {
      return { key, value }; // Lock acquired
    }
    if (attempt < retryCount) {
      await sleep(retryDelayMs);
    }
  }
  return null; // Lock not acquired
}

The value is a UUID — the lock owner's identity. This matters when releasing.

Releasing a Lock Safely

You can't just DEL the key. Consider this scenario:

  1. Process A acquires lock with value "A123"
  2. Process A takes too long — lock expires via TTL
  3. Process B acquires lock with value "B456"
  4. Process A finishes and calls DEL
  5. Process A just deleted Process B's lock

The fix is a Lua script that atomically checks ownership before deleting:

if redis.call('get', KEYS[1]) == ARGV[1] then
  return redis.call('del', KEYS[1])
else
  return 0
end

This runs atomically on the Redis server — no race condition between the check and the delete.
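To see the ownership check pay off without standing up a Redis server, here's an in-memory model of SET NX PX plus the guarded delete (the Map and the fake* helpers are illustrative stand-ins, not the real code):

```typescript
// In-memory stand-in for Redis: key -> { value, expiresAt }.
const store = new Map<string, { value: string; expiresAt: number }>();

function isLive(key: string): boolean {
  const e = store.get(key);
  if (!e) return false;
  if (e.expiresAt <= Date.now()) { store.delete(key); return false; } // TTL lapsed
  return true;
}

// SET key value NX PX ttl: set only if the key doesn't (still) exist.
function fakeAcquire(key: string, value: string, ttlMs: number): boolean {
  if (isLive(key)) return false;
  store.set(key, { value, expiresAt: Date.now() + ttlMs });
  return true;
}

// The Lua script's semantics: delete only if we still own the lock.
function fakeSafeRelease(key: string, value: string): boolean {
  if (isLive(key) && store.get(key)!.value === value) {
    store.delete(key);
    return true;
  }
  return false; // someone else holds it (or it expired) -- don't touch it
}

// A acquires, its lock expires, B acquires; A's late release is a no-op.
fakeAcquire('seat:1', 'A123', 5);                   // A gets the lock
store.get('seat:1')!.expiresAt = Date.now() - 1;    // force TTL expiry
fakeAcquire('seat:1', 'B456', 10_000);              // B gets the lock
const aReleased = fakeSafeRelease('seat:1', 'A123'); // false: B is protected
const bStillHolds = isLive('seat:1');                // true
```

A bare DEL in place of fakeSafeRelease would have silently destroyed B's lock, reopening the race.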

Multi-Seat Locking

When a user reserves multiple seats, all locks must be acquired atomically. If any lock fails, all previously acquired locks are released:

export async function acquireMultipleLocks(
  keys: string[],
  options: LockOptions = {}
): Promise<AcquiredLock[] | null> {
  const sortedKeys = [...keys].sort(); // Prevent deadlocks
  const acquired: AcquiredLock[] = [];
 
  for (const key of sortedKeys) {
    const lock = await acquireLock(key, options);
    if (!lock) {
      // Release all previously acquired locks
      await Promise.all(
        acquired.map(l => releaseLock(l.key, l.value))
      );
      return null;
    }
    acquired.push(lock);
  }
  return acquired;
}

Same deadlock prevention as the pessimistic approach — sort the keys first.
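The rollback behavior can be sketched with an in-memory lock table standing in for the Redis keys (illustrative names, synchronous for clarity):

```typescript
const locks = new Map<string, string>(); // key -> owner

function tryLock(key: string, owner: string): boolean {
  if (locks.has(key)) return false;
  locks.set(key, owner);
  return true;
}

function unlock(key: string, owner: string): void {
  if (locks.get(key) === owner) locks.delete(key);
}

// All-or-nothing acquisition: sort for deadlock safety, roll back on failure.
function acquireAll(keys: string[], owner: string): boolean {
  const acquired: string[] = [];
  for (const key of [...keys].sort()) {
    if (!tryLock(key, owner)) {
      acquired.forEach((k) => unlock(k, owner)); // release everything we got
      return false;
    }
    acquired.push(key);
  }
  return true;
}

const aOk = acquireAll(['seat:2', 'seat:1'], 'A'); // true: A holds both
const bOk = acquireAll(['seat:0', 'seat:2'], 'B'); // false: seat:2 is taken
const seat0Free = !locks.has('seat:0');            // true: B rolled back seat:0
```

B grabbed seat:0, hit A's lock on seat:2, and handed seat:0 back, so no seat is left stranded behind a half-failed request.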

The Full Reservation Flow

The distributed booking service combines Redis locks with database transactions for defense in depth:

  1. Acquire distributed locks for all requested seats
  2. Verify availability inside a database transaction (the lock prevents races, but the DB is the source of truth)
  3. Update seat status to RESERVED with a version increment
  4. Schedule cleanup via BullMQ (in case the user abandons the reservation)
  5. Release locks in a finally block — always, even on error

const SEAT_LOCK_OPTIONS: LockOptions = {
  ttlSeconds: 30,
  retryCount: 3,
  retryDelayMs: 100,
};

The lock TTL is 30 seconds — long enough for the database transaction, short enough to recover from a crashed process.

Idempotency: Handling Duplicate Requests

Network failures cause retries. A user's browser might send the same booking request twice. Without idempotency, you'd create two bookings and charge twice.

The solution: the client generates a unique idempotencyKey per booking attempt and sends it with every retry of that same attempt:

// Client generates once, sends on every retry
const idempotencyKey = crypto.randomUUID();
 
POST /api/bookings
{
  "reservationIds": ["..."],
  "idempotencyKey": "abc-123"
}

On the server, the check happens before acquiring any locks:

const existingBooking = await prisma.booking.findUnique({
  where: { idempotencyKey },
});
 
if (existingBooking) {
  return existingBooking; // Safe retry — no duplicate created
}

The idempotencyKey column has a unique index, so even if two identical requests race past the check, the database constraint prevents a duplicate insert. The second request gets a conflict error and can retry, hitting the findUnique path.
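Both layers can be modeled with an in-memory table plus a simulated unique constraint (illustrative names; the real code goes through Prisma):

```typescript
// In-memory stand-in for the bookings table, keyed by idempotencyKey.
const bookings = new Map<string, { id: number }>();
let nextId = 1;

function insertBooking(idempotencyKey: string): { id: number } {
  if (bookings.has(idempotencyKey)) {
    throw new Error('unique constraint violation'); // the DB-level backstop
  }
  const row = { id: nextId++ };
  bookings.set(idempotencyKey, row);
  return row;
}

function createBooking(idempotencyKey: string): { id: number } {
  const existing = bookings.get(idempotencyKey); // fast-path check (findUnique)
  if (existing) return existing;                 // safe retry, no duplicate
  try {
    return insertBooking(idempotencyKey);
  } catch {
    // Raced past the check; the constraint caught it. Re-read the winner.
    return bookings.get(idempotencyKey)!;
  }
}

const first = createBooking('abc-123');
const retry = createBooking('abc-123'); // network retry with the same key
```

The retry gets the original booking back, and the table still contains exactly one row for the key.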

Reservation Expiry: Cleaning Up Abandoned Seats

When a user reserves a seat but never completes payment, the seat needs to go back to the pool. Two mechanisms handle this:

Scheduled Jobs (Primary)

When a reservation is created, a BullMQ job is scheduled to fire at the expiry time (default: 10 minutes):

export async function scheduleReservationCleanup(
  data: ReservationCleanupJobData
): Promise<Job> {
  const delay = Math.max(0, expiresAt.getTime() - Date.now());
 
  return reservationCleanupQueue.add('cleanup', data, {
    delay,
    jobId: `cleanup:${data.reservationId}`, // Prevents duplicate jobs
  });
}

If the user completes the booking, the cleanup job is cancelled before it fires.

Periodic Sweep (Fallback)

If the worker was down when a job was scheduled, or Redis lost the job, a periodic sweep catches anything that slipped through:

// Runs every 60 seconds
const expiredReservations = await prisma.reservation.findMany({
  where: {
    status: 'ACTIVE',
    expiresAt: { lt: new Date() },
  },
  take: 100, // Limit per run
});

Both cleanup paths acquire a distributed lock before modifying the seat and double-check the reservation status — the seat might have been confirmed while waiting for the lock.
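A sweep pass boils down to: find lapsed ACTIVE holds, re-check, and expire them. An in-memory sketch (illustrative data; the real sweep queries PostgreSQL and takes the distributed lock before each write):

```typescript
type Reservation = {
  id: string;
  status: 'ACTIVE' | 'EXPIRED' | 'CONFIRMED';
  expiresAt: Date;
};

// In-memory stand-in for the reservations table.
const reservations: Reservation[] = [
  { id: 'r1', status: 'ACTIVE', expiresAt: new Date(Date.now() - 60_000) },    // abandoned
  { id: 'r2', status: 'ACTIVE', expiresAt: new Date(Date.now() + 60_000) },    // still held
  { id: 'r3', status: 'CONFIRMED', expiresAt: new Date(Date.now() - 60_000) }, // paid in time
];

// One sweep pass: expire ACTIVE reservations whose hold has lapsed.
function sweepExpired(now: Date, limit = 100): string[] {
  const expired = reservations
    .filter((r) => r.status === 'ACTIVE' && r.expiresAt < now)
    .slice(0, limit); // bounded per run, like `take: 100`
  for (const r of expired) {
    r.status = 'EXPIRED'; // and the seat goes back to AVAILABLE
  }
  return expired.map((r) => r.id);
}

const swept = sweepExpired(new Date());
```

Only the lapsed, unconfirmed hold (r1) is reclaimed; live holds and confirmed bookings are untouched.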

Compensating Transactions: When Payment Succeeds but Booking Fails

In distributed systems, you can't always wrap everything in a single ACID transaction. What if Stripe charges the card but the database update fails?

The system handles this with compensating transactions:

1. Create booking with status PENDING
2. Create Stripe PaymentIntent
3. User pays → Stripe webhook fires
4. Update booking to CONFIRMED
   └── If this fails:
       → Refund payment automatically
       → Release reserved seats
       → Mark booking as FAILED
       → Create audit log entry

The customer is never charged for a failed booking. The audit log captures the full timeline for debugging.
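The compensation path can be sketched as a webhook handler that undoes everything when the confirmation write fails (the deps names here are hypothetical, not the real service's API):

```typescript
type BookingStatus = 'PENDING' | 'CONFIRMED' | 'FAILED';

async function handlePaymentSucceeded(
  bookingId: string,
  deps: {
    confirmBooking: (id: string) => Promise<void>; // step 4
    refundPayment: (id: string) => Promise<void>;  // compensations below
    releaseSeats: (id: string) => Promise<void>;
    markFailed: (id: string) => Promise<void>;
    audit: (id: string, event: string) => Promise<void>;
  }
): Promise<BookingStatus> {
  try {
    await deps.confirmBooking(bookingId);
    return 'CONFIRMED';
  } catch {
    // Payment went through but confirmation failed: undo, don't keep the money.
    await deps.refundPayment(bookingId);
    await deps.releaseSeats(bookingId);
    await deps.markFailed(bookingId);
    await deps.audit(bookingId, 'booking_failed_refunded');
    return 'FAILED';
  }
}
```

Each compensation step should itself be idempotent, since the webhook can be redelivered and rerun the whole handler.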

The Data Model

The schema separates the booking lifecycle into distinct entities:

  • Seats have a version field for optimistic locking and a status enum (AVAILABLE, RESERVED, BOOKED, BLOCKED)
  • Reservations are temporary holds with an expiresAt timestamp
  • Bookings are confirmed purchases with an idempotencyKey (unique index) and payment tracking
  • AuditLog records every state transition with before/after values

model Seat {
  status     SeatStatus @default(AVAILABLE)
  version    Int        @default(0)
  reservedBy    String?
  reservedUntil DateTime?
 
  @@index([eventId, status])
  @@index([reservedUntil])
}
 
model Booking {
  idempotencyKey String @unique
  status         BookingStatus
  paymentStatus  PaymentStatus
 
  @@index([idempotencyKey])
}

The @@index([reservedUntil]) index is specifically for the periodic cleanup query — without it, the sweep would table-scan on every run.

What I Learned

Fail fast beats blocking

The distributed approach returns errors in ~150ms when a seat is taken. The pessimistic approach blocks for ~500ms waiting for the lock. Users get faster feedback, and the system handles higher throughput.

Defense in depth isn't optional

Redis locks prevent race conditions, but Redis can fail. The database transaction inside the lock provides a second layer of protection. Belt and suspenders.

Idempotency keys belong on the client

The server can't generate idempotency keys — it doesn't know if two requests are retries of the same operation or two intentional operations. The client generates the key once and reuses it across retries.

Background cleanup needs a fallback

Scheduled jobs handle the happy path. But workers go down, Redis loses data, jobs get stuck. The periodic sweep is a safety net that catches everything the scheduled jobs miss.

Lua scripts are essential for Redis correctness

Any Redis operation that needs to check-then-act must use a Lua script. Without atomicity, you get the same race conditions you're trying to prevent.

Try It Out

The full source code is on GitHub: jay-1799/seat-masters

Clone the repo, run docker-compose up -d to start PostgreSQL and Redis, then npm run dev for the app and npm run workers for background processing. The test scripts demonstrate the race conditions and prove the distributed locking works:

npm run test:race          # Watch the naive approach break
npm run test:distributed   # 5, 20, 50 concurrent users — exactly 1 wins

#redis #distributed-locking #idempotency #concurrency #postgresql #nextjs #bullmq #typescript

Licensed under CC BY 4.0
