
# Idempotency Patterns: Build Endpoints That Can Safely Be Called Twice

[⬅️ Day 6: Grow Overview](README.md)

If you're running a SaaS in 2026, every API endpoint that mutates state needs to handle the case where it gets called twice with the same intent. Network blips trigger retries; webhooks fire twice; users double-click "submit"; queues redeliver. Most founders ignore this until a customer reports being charged twice for the same purchase, seeing duplicate orders, or receiving the same email three times. Then it becomes a Sev-1 incident at 11pm.

A working idempotency strategy answers: which endpoints need it, what's the idempotency key, where do we store the deduplication state, what's the TTL, and how do we handle conflicts. Done well, retries are safe; double-clicks don't double-charge; webhooks can replay without consequence. Done badly, your support team explains "yes, you were charged twice; we're fixing it manually" once a week.

This guide is the implementation playbook for idempotent operations — when to use, key design, storage patterns, edge cases (different request body with same key), and the discipline that prevents the most painful class of production bugs. Companion to Outbound Webhooks, Cron Jobs & Scheduled Tasks, and Public API.

## Why Idempotency Matters

Help me understand idempotency.

The definition:

An idempotent operation produces the same result whether called once or N times.

For HTTP APIs:
- GET / HEAD: idempotent (safe to retry; no state change)
- PUT / DELETE: idempotent in spec
- POST / PATCH: NOT idempotent in spec; need explicit handling
- POST creates often need idempotency keys

**The "double-call" sources**:

**1. Network retry**: client sends; gets a timeout; doesn't know whether the server got it; retries (see the retry sketch after this list)
**2. User double-click**: nothing happens visibly; user clicks again
**3. Webhook replay**: providers (Stripe, etc.) retry up to 3 days
**4. Queue redelivery**: at-least-once queues redeliver on consumer failure
**5. Browser back-button**: user submits; goes back; submits again
**6. Mobile app cache**: queues offline; sends queued request
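
A minimal client-side retry sketch (assuming `fetch` and `crypto.randomUUID` are available, i.e. Node 18+ or a modern browser). The point is that the key is generated once, before the first attempt, so every retry carries the same intent:

```js
async function postWithRetry(url, body, attempts = 3) {
  // One key per INTENT, not per attempt. This is what makes retries safe.
  const idempotencyKey = crypto.randomUUID();
  let lastError;
  for (let i = 0; i < attempts; i++) {
    let res;
    try {
      res = await fetch(url, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Idempotency-Key': idempotencyKey,
        },
        body: JSON.stringify(body),
      });
    } catch (e) {
      lastError = e; // network error: safe to retry with the SAME key
      await new Promise((r) => setTimeout(r, 2 ** i * 500)); // exponential backoff
      continue;
    }
    if (res.ok) return res.json();
    if (res.status < 500) throw new Error(`Request failed: ${res.status}`); // 4xx: don't retry
    lastError = new Error(`Server error: ${res.status}`); // 5xx: retry
    await new Promise((r) => setTimeout(r, 2 ** i * 500));
  }
  throw lastError;
}
```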

**Cost of non-idempotent**:

- Charge: customer charged twice
- Email: customer receives N emails
- Order: duplicate order created
- Notification: spammed
- Vote: counted twice

For each: support ticket; refund; manual fix; angry customer.

**The fix is cheap**:

Idempotency keys add ~5 lines of code per endpoint. Cheap insurance.
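
A sketch of that per-endpoint cost, assuming an Express app and the `withIdempotency(key, fn)` helper built in the storage section below (`createOrder` is a stand-in for your handler):

```js
app.post('/orders', async (req, res) => {
  const key = req.header('Idempotency-Key');
  if (!key) return res.status(400).json({ error: 'Idempotency-Key header required' });

  // Runs createOrder at most once per key; replays return the cached result.
  const order = await withIdempotency(key, () => createOrder(req.body));
  res.status(201).json(order);
});
```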

For my system:
- Mutating endpoints inventory
- Current idempotency state
- Recent duplicate-related incidents

Output:
1. The mutating-endpoints inventory
2. The "is this idempotent?" audit
3. The risk priority

The biggest unforced error: assuming network calls happen once. Production networks are lossy; clients retry; queues redeliver. Idempotency is mandatory infrastructure, not optional.

## When You Need Idempotency Keys

Help me identify where idempotency keys belong.

**High-risk (always need keys)**:

- Payment / charge / refund
- Email / notification sends
- Order / record creation
- Webhook receivers
- Queue consumers

**Medium-risk (often need keys)**:

- Profile / settings updates
- Counter increments
- Status changes
- Permission grants

**Low-risk (often don't)**:

- Read-only operations (GET / HEAD)
- Idempotent-by-nature (PUT with full state; SET-style)
- Stateless transformations

**The "natural idempotency" pattern**:

```sql
-- Idempotent (UPSERT)
INSERT INTO users (email, name)
VALUES ('bob@example.com', 'Bob')
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;

-- Run twice; same result
```

vs.

```sql
-- Not idempotent (creates duplicates)
INSERT INTO orders (user_id, amount) VALUES ($1, $2);

-- Run twice; two orders
```

For my system:

  • Mutating-endpoints classification
  • Current idempotency coverage
  • Priority gaps

Output:

  1. The endpoint classification
  2. The idempotency-needed list
  3. The implementation priority

The biggest classification mistake: **assuming "internal API" doesn't need idempotency.** Internal services retry too; background jobs redeliver too. Classify by side-effect, not audience.

## The Idempotency Key Design

Help me design idempotency keys.

The requirements:

1. Unique per intent: same key = same intent
2. Generated by client: the server can't generate it (by the time the server could, a retry is indistinguishable from a new request)
3. Sufficient entropy: UUID v4 is the standard
4. Not the resource ID: chicken-and-egg; the resource doesn't exist yet
5. Bounded TTL: 24h-7d typical

The key sources:

| Source | Key pattern |
| --- | --- |
| HTTP API client | `Idempotency-Key` header (UUID) |
| Webhook receiver | Provider event ID (Stripe `evt_xxx`) |
| Queue consumer | Message ID + content hash |
| Form submission | Hidden field with form-instance UUID |
| Mobile app | Per-action UUID stored locally |

The Stripe convention:

```http
POST /v1/charges
Idempotency-Key: 8e3d1f5a-...

amount=1000&currency=usd&...
```

Stripe stores the response for 24 hours; subsequent calls with the same key return the original response.
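
With the official `stripe` Node library (inside an async function), the key is passed as a request option and Stripe handles storage and replay; a sketch, where `customerId` is illustrative:

```js
const { randomUUID } = require('crypto');
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY);

// Generate the key once per intent and reuse it across retries.
const idempotencyKey = randomUUID();

const intent = await stripe.paymentIntents.create(
  { amount: 1000, currency: 'usd', customer: customerId },
  { idempotencyKey } // same key on retry => Stripe replays the original response
);
```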

The webhook-receiver convention:

```js
async function handleStripeWebhook(event) {
  // Dedupe on the provider's event ID; it's stable across redeliveries.
  if (await alreadyProcessed(event.id)) return;
  await processEvent(event);
  await markProcessed(event.id);
}
```

The queue-consumer convention:

```js
async function processJob(message) {
  if (await alreadyProcessed(message.id)) {
    await message.ack();
    return;
  }
  await doWork(message);
  await markProcessed(message.id);
  await message.ack();
}
```

The form-submission convention:

```html
<form>
  <input type="hidden" name="idempotency_key" value="${uuid()}">
  ...
</form>
```

For my system:

  • Key sources for each entry point
  • TTL strategy
  • Convention documentation

Output:

  1. The key-source matrix
  2. The TTL policy
  3. The header / field conventions

The biggest key-design mistake: **server-generated keys.** Server creates UUID upon receipt; can't deduplicate retries (each retry gets a new UUID). The fix: client generates; server uses.

## The Storage Layer

Help me pick the storage layer.

Option A: Redis with TTL (most common)

```js
async function withIdempotency(key, fn) {
  // SET key value NX EX 86400: atomic set-if-not-exists with a 24h TTL.
  // (Exact argument passing varies by Redis client; this is the ioredis style.)
  const acquired = await redis.set(`idem:${key}`, 'pending', 'NX', 'EX', 86400);
  if (!acquired) {
    const value = await redis.get(`idem:${key}`);
    if (value === 'pending') throw new Error('In progress');
    return JSON.parse(value);
  }
  try {
    const result = await fn();
    await redis.set(`idem:${key}`, JSON.stringify(result), 'EX', 86400);
    return result;
  } catch (e) {
    await redis.del(`idem:${key}`);  // allow retry
    throw e;
  }
}
```

Best for: most use cases.

Option B: Postgres table with TTL

```sql
CREATE TABLE idempotency_keys (
  key TEXT PRIMARY KEY,
  user_id UUID,
  status TEXT,
  response JSONB,
  created_at TIMESTAMPTZ DEFAULT now(),
  expires_at TIMESTAMPTZ NOT NULL
);
```

```js
const result = await db.query(`
  INSERT INTO idempotency_keys (key, status, expires_at)
  VALUES ($1, 'pending', now() + interval '24 hours')
  ON CONFLICT (key) DO NOTHING
  RETURNING *;
`, [key]);
```

Best for: transactional with the operation; survives Redis outage.
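
Completing the pattern (a sketch): with `ON CONFLICT DO NOTHING ... RETURNING *`, a conflicting insert returns zero rows, so the row count tells you whether this request won the key. `doWork` is a stand-in for the actual side effect:

```js
if (result.rows.length > 0) {
  // This request acquired the key: run the operation and cache its response.
  const response = await doWork();
  await db.query(
    `UPDATE idempotency_keys SET status = 'completed', response = $2 WHERE key = $1`,
    [key, JSON.stringify(response)]
  );
  return response;
}

// Key already exists: replay the cached response, or signal in-flight work.
const existing = await db.query(
  `SELECT status, response FROM idempotency_keys WHERE key = $1`, [key]
);
if (existing.rows[0].status === 'pending') {
  throw new Error('Request with this idempotency key is in progress'); // or HTTP 409
}
return existing.rows[0].response;
```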

Option C: In-memory (single-instance only)

```js
const { LRUCache } = require('lru-cache'); // npm: lru-cache

const seen = new LRUCache({ max: 10000, ttl: 86400000 }); // ttl is in ms (24h)
```

Best for: single-server apps; NOT serverless.

Option D: Provider-managed

  • Stripe: pass idempotencyKey option
  • Inngest / Vercel Queues: built-in dedup

Pros: zero implementation. Cons: provider-specific.

The 90% answer:

  • API endpoints: Redis with TTL
  • Webhook receivers: Postgres table (transactional with side effect)
  • Queue consumers: provider-managed if available; otherwise Postgres
  • Single-instance: in-memory

For my system:

  • Storage choice per entry point
  • TTL settings
  • Cleanup strategy

Output:

  1. The storage matrix
  2. The implementation patterns
  3. The cleanup cron

The biggest storage mistake: **in-memory in serverless.** Function instance restarts; state lost; same key processed twice. The fix: external storage for serverless.

## Concurrent Calls — The Race Condition

Help me handle concurrent calls.

The race:

Two requests with same key arrive within ms. Without proper handling:

  • A: looks up key; not found
  • B: looks up key; not found
  • Both process → DUPLICATE WORK

The atomic acquire pattern:

Use SET-IF-NOT-EXISTS (Redis SETNX or Postgres INSERT ... ON CONFLICT):

```js
const acquired = await redis.set(`idem:${key}`, 'pending', 'NX', 'EX', 86400);
if (!acquired) {
  // Another request in flight or completed
}
```

Atomic: only one request "wins"; others see existing key.

In-flight handling options:

Option 1: Reject immediately

```http
HTTP/1.1 409 Conflict

"Request with same idempotency key in progress"
```

Option 2: Wait with timeout

```js
// Poll for up to 30 seconds (300 x 100 ms), then give up.
for (let i = 0; i < 300; i++) {
  await sleep(100);
  const status = await redis.get(`idem:${key}`);
  if (status === null) throw new Error('Original request failed; retry'); // key cleared on failure
  if (status !== 'pending') return JSON.parse(status);
}
throw new Error('Idempotent request still processing');
```

Stripe's approach: 409 Conflict.

Post-completion return:

After first request completes, response is cached. Concurrent requests that see "completed" status return cached response.
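
Stripe marks replayed responses with an `Idempotent-Replayed: true` header; doing the same helps clients tell a replay from a fresh execution. A sketch, assuming Express and the Redis layout above:

```js
const cached = await redis.get(`idem:${key}`);
if (cached && cached !== 'pending') {
  res.set('Idempotent-Replayed', 'true'); // signal: this is a cached replay
  return res.json(JSON.parse(cached));
}
```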

Failure handling:

Don't cache failures; allow retry:

```js
try {
  const result = await fn();
  await redis.set(`idem:${key}`, JSON.stringify(result), 'EX', 86400);
} catch (e) {
  await redis.del(`idem:${key}`);
  throw e;
}
```

For my system:

  • Concurrent-call handling per endpoint
  • Reject vs wait policy
  • Failure semantics

Output:

  1. The concurrent-call handling
  2. The HTTP status codes
  3. The failure-retry semantics

The biggest concurrency mistake: **non-atomic check-then-act.** Two concurrent requests both pass the check. The fix: atomic acquire (SETNX or INSERT-ON-CONFLICT).

## Different Body, Same Key — The Conflict Case

Help me handle key-with-different-body.

Scenario: same key, different request body. What to do?

Three policies:

Policy A: Strict — reject mismatched body (Stripe default)

```js
const stored = await getIdempotencyRecord(key);
if (stored && hashBody(req.body) !== stored.body_hash) {
  throw new Error('Idempotency key reused with different parameters');
}
```

Pros: catches programming errors; prevents replay attacks. Cons: false positives on irrelevant field changes.

Policy B: Lenient — return original response regardless

Don't check body. Return original.

Pros: simple. Cons: hides bugs; security risk.

Policy C: Compare semantically-relevant fields

Hash only meaningful fields; skip request_id, timestamps.
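
A sketch of Policy C, assuming a per-endpoint allowlist of the fields that define the intent (it reuses the `sha256` and `canonicalize` helpers shown below):

```js
// Only these fields define the request's intent; request_id, client
// timestamps, and other volatile fields are deliberately excluded.
const INTENT_FIELDS = ['amount', 'currency', 'customer_id']; // illustrative allowlist

function intentHash(body) {
  const relevant = Object.fromEntries(
    INTENT_FIELDS.filter((f) => f in body).map((f) => [f, body[f]])
  );
  return sha256(JSON.stringify(canonicalize(relevant)));
}
```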

The 90% answer: Policy A (strict). Stripe does this.

Implementation:

```js
const { createHash } = require('crypto');
const sha256 = (s) => createHash('sha256').update(s).digest('hex');

async function withIdempotency(key, body, fn) {
  const bodyHash = sha256(JSON.stringify(canonicalize(body)));
  const stored = await getIdempotencyRecord(key);

  if (stored) {
    if (stored.body_hash !== bodyHash) {
      // HTTPError: stand-in for your framework's error type
      throw new HTTPError(422, 'idempotency_error',
        'Same idempotency key, different request');
    }
    return stored.response;
  }

  const result = await fn();
  await saveIdempotencyRecord(key, bodyHash, result);
  return result;
}
```

Canonicalize step:

```js
function canonicalize(obj) {
  if (Array.isArray(obj)) return obj.map(canonicalize);
  if (typeof obj === 'object' && obj !== null) {
    return Object.keys(obj).sort().reduce((acc, key) => {
      acc[key] = canonicalize(obj[key]);
      return acc;
    }, {});
  }
  return obj;
}
```

For my system:

  • Policy choice
  • Hash strategy
  • Canonicalization

Output:

  1. The mismatched-body policy
  2. The hash + canonicalize implementation
  3. The error response

The biggest mismatched-body mistake: **silently returning original response.** Different intent silently dropped. The fix: strict policy by default; surface the error.

## Cleanup and TTL

Help me design cleanup.

TTL per use case:

| Use case | TTL |
| --- | --- |
| API endpoint (general) | 24 hours |
| Webhook receiver | 7 days |
| Queue consumer | Until ack'd |
| Long workflow | 7-30 days |
| Payment / charge | 24 hours (Stripe) |

Cleanup:

For Redis: TTL automatic.

For Postgres: a cron job deletes expired rows (schedule it per [Cron Jobs & Scheduled Tasks](cron-scheduled-tasks-chat.md)):

```sql
-- Run hourly
DELETE FROM idempotency_keys WHERE expires_at < now();
```
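
If your database has the pg_cron extension, the schedule can live in Postgres itself (a sketch; the job name is illustrative):

```sql
-- Requires the pg_cron extension; runs at the top of every hour.
SELECT cron.schedule(
  'cleanup-idempotency-keys',
  '0 * * * *',
  $$DELETE FROM idempotency_keys WHERE expires_at < now()$$
);
```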

Monitoring:

  • Idempotency-key count
  • Hit rate (% of requests hitting existing key)
  • 422 mismatched-body rate

If the 422 mismatched-body rate exceeds ~1%, clients have bugs. A hit rate above ~5% means lots of legitimate retries; the system is absorbing them as designed.
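
A sketch of the counters behind those numbers, assuming the prom-client library (metric names are illustrative):

```js
const client = require('prom-client');

// hit rate = idempotency_hits_total / idempotency_requests_total
const idempotencyRequests = new client.Counter({
  name: 'idempotency_requests_total',
  help: 'Requests carrying an idempotency key',
});
const idempotencyHits = new client.Counter({
  name: 'idempotency_hits_total',
  help: 'Requests that matched an existing key (replays)',
});
const idempotencyMismatches = new client.Counter({
  name: 'idempotency_body_mismatch_total',
  help: 'Same key reused with a different body (422s)',
});
```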

For my system:

  • TTL per endpoint
  • Cleanup strategy
  • Monitoring

Output:

  1. The TTL policy
  2. The cleanup cron
  3. The monitoring dashboard

The biggest cleanup mistake: **no cleanup; table grows forever.** The fix: hourly cleanup cron.

## Test for Idempotency

Help me test idempotency.

Tests:

  1. Single call works
  2. Duplicate call returns same response
  3. Different key creates new resource
  4. Same key, different body returns 422
  5. Concurrent same-key requests produce one
  6. Failed request can be retried
  7. Expired key allows new request
  8. Webhook receiver dedupes by event ID

```js
test('duplicate idempotency key returns same response', async () => {
  const key = uuid();
  const r1 = await api.post('/charges', { amount: 100 }, { idempotencyKey: key });
  const r2 = await api.post('/charges', { amount: 100 }, { idempotencyKey: key });
  expect(r2.body.id).toBe(r1.body.id);
  expect(await db.charges.count()).toBe(1);
});

test('concurrent same-key calls produce one charge', async () => {
  const key = uuid();
  const promises = Array(10).fill(null).map(() =>
    api.post('/charges', { amount: 100 }, { idempotencyKey: key })
  );
  await Promise.all(promises);
  expect(await db.charges.count()).toBe(1);
});
```
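
Edge case 4 from the list above, as a test (a sketch; assumes the endpoint returns 422 on a body mismatch, per the policy section):

```js
test('same key with different body returns 422', async () => {
  const key = uuid();
  await api.post('/charges', { amount: 100 }, { idempotencyKey: key });
  // Same key, different amount: must be rejected, not silently replayed.
  const r2 = await api.post('/charges', { amount: 200 }, { idempotencyKey: key });
  expect(r2.status).toBe(422);
  expect(await db.charges.count()).toBe(1);
});
```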

For my codebase:

  • Test coverage gaps
  • CI integration

Output:

  1. The test plan
  2. The CI integration

The biggest testing mistake: **only happy-path tested.** Concurrent / mismatched / expired untested. The fix: explicit edge-case tests.

## Avoid Common Pitfalls

The idempotency mistake checklist.

Mistake 1: No keys at all

  • Duplicates everywhere
  • Fix: keys on all mutating endpoints

Mistake 2: Server-generated keys

  • Can't deduplicate retries
  • Fix: client-generated

Mistake 3: In-memory storage in serverless

  • State lost on instance restart
  • Fix: Redis or Postgres

Mistake 4: Non-atomic acquire

  • Race condition
  • Fix: SETNX / INSERT-ON-CONFLICT

Mistake 5: Caching failures

  • Failed response cached; retries return failure
  • Fix: don't cache failures

Mistake 6: No TTL

  • Storage grows forever
  • Fix: 24h-7d TTL

Mistake 7: Same key, different body silently dropped

  • Hides bugs
  • Fix: 422 error on mismatch

Mistake 8: Webhook receiver without dedup

  • Process same event twice
  • Fix: dedupe by event ID

Mistake 9: Queue consumer without dedup

  • Duplicate jobs on at-least-once
  • Fix: dedupe by message ID

Mistake 10: No cleanup cron

  • Table grows
  • Fix: hourly cleanup

The quality checklist:

  • All mutating endpoints accept idempotency key
  • Client-generated UUIDs
  • Atomic acquire
  • Mismatched-body returns 422
  • Failures aren't cached
  • Per-endpoint TTL
  • Cleanup cron / TTL
  • Webhook receivers dedupe by event ID
  • Queue consumers dedupe by message ID
  • Edge-case test coverage

For my system:

  • Audit
  • Top 3 fixes

Output:

  1. Audit results
  2. Top 3 fixes
  3. The "v2 idempotency" plan

The single most-common mistake: **assuming clients won't retry.** Production traffic always retries. Idempotency is table-stakes infrastructure.

---

## What "Done" Looks Like

A working idempotency system in 2026 has:

- All mutating endpoints accept `Idempotency-Key` header
- Client generates UUIDs; server uses them
- Atomic acquire (Redis SETNX or Postgres ON CONFLICT)
- Same body required for same key (422 on mismatch)
- Failures aren't cached; retries allowed
- Per-use-case TTL (24h API; 7d webhooks)
- Cleanup cron for Postgres tables
- Webhook receivers dedupe by event ID
- Queue consumers dedupe by message ID
- Edge-case test coverage

The hidden cost of weak idempotency: **incidents at the worst times.** A customer charged twice on Black Friday (5x normal traffic means 5x retries); duplicate orders during a marketing campaign; a webhook storm processing every event three times. These are exactly the moments when missing idempotency becomes catastrophic. The insurance is cheap: a few lines per endpoint saves a Sev-1 every quarter.

## See Also

- [Outbound Webhooks](outbound-webhooks-chat.md) — webhook receivers need dedup
- [Inbound Webhooks](inbound-webhooks-chat.md) — receiving with dedup
- [Cron Jobs & Scheduled Tasks](cron-scheduled-tasks-chat.md) — cron idempotency
- [Public API](public-api-chat.md) — API idempotency keys
- [API Versioning](api-versioning-chat.md) — versioned API
- [Caching Strategies](caching-strategies-chat.md) — adjacent
- [Database Connection Pooling](database-connection-pooling-chat.md) — adjacent
- [Audit Logs](audit-logs-chat.md) — adjacent
- [Rate Limiting & Abuse](rate-limiting-abuse-chat.md) — adjacent
- [Service Level Agreements](service-level-agreements-chat.md) — uptime depends
- [Refunds & Chargebacks](refunds-chargebacks-chat.md) — payment idempotency
- [Dunning & Failed Payments](dunning-failed-payments-chat.md) — retry idempotency
- [VibeReference: Stripe](https://www.vibereference.com/auth-and-payments/stripe) — Stripe idempotency
- [VibeReference: Background Jobs Providers](https://www.vibereference.com/backend-and-data/background-jobs-providers) — queue dedup
- [VibeReference: Database Providers](https://www.vibereference.com/backend-and-data/database-providers) — storage choice

[⬅️ Day 6: Grow Overview](README.md)