
# Idempotency Patterns: Build Endpoints That Can Safely Be Called Twice

[⬅️ Day 6: Grow Overview](README.md)

If you're running a SaaS in 2026, every API endpoint that mutates state needs to handle the case where it gets called twice with the same intent. Network blips trigger retries; webhooks fire twice; users double-click "submit"; queues redeliver. Most founders ignore this until a customer reports being charged twice for the same purchase, seeing duplicate orders, or receiving the same email three times. Then it becomes a Sev-1 incident at 11pm.

A working idempotency strategy answers: which endpoints need it, what's the idempotency key, where do we store the deduplication state, what's the TTL, and how do we handle conflicts. Done well, retries are safe; double-clicks don't double-charge; webhooks can replay without consequence. Done badly, your support team explains "yes, you were charged twice; we're fixing it manually" once a week.

This guide is the implementation playbook for idempotent operations — when to use, key design, storage patterns, edge cases (different request body with same key), and the discipline that prevents the most painful class of production bugs. Companion to Outbound Webhooks, Cron Jobs & Scheduled Tasks, and Public API.

## Why Idempotency Matters

Help me understand idempotency.

The definition:

An idempotent operation produces the same result whether called once or N times.

For HTTP APIs:
- GET / HEAD: idempotent (safe to retry; no state change)
- PUT / DELETE: idempotent in spec
- POST / PATCH: NOT idempotent in spec; need explicit handling
- POST creates often need idempotency keys

**The "double-call" sources**:

**1. Network retry**: client sends; gets a timeout; doesn't know whether the server got it; retries (see the retry sketch after this list)
**2. User double-click**: nothing happens visibly; user clicks again
**3. Webhook replay**: providers (Stripe, etc.) retry up to 3 days
**4. Queue redelivery**: at-least-once queues redeliver on consumer failure
**5. Browser back-button**: user submits; goes back; submits again
**6. Mobile app cache**: queues offline; sends queued request
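
A minimal client-side retry sketch (assuming `fetch` and `crypto.randomUUID` are available, i.e. Node 18+ or a modern browser). The point is that the key is generated once, before the first attempt, so every retry carries the same intent:

```js
async function postWithRetry(url, body, attempts = 3) {
  // One key per INTENT, not per attempt. This is what makes retries safe.
  const idempotencyKey = crypto.randomUUID();
  let lastError;
  for (let i = 0; i < attempts; i++) {
    let res;
    try {
      res = await fetch(url, {
        method: 'POST',
        headers: {
          'Content-Type': 'application/json',
          'Idempotency-Key': idempotencyKey,
        },
        body: JSON.stringify(body),
      });
    } catch (e) {
      lastError = e; // network error: safe to retry with the SAME key
      await new Promise((r) => setTimeout(r, 2 ** i * 500)); // exponential backoff
      continue;
    }
    if (res.ok) return res.json();
    if (res.status < 500) throw new Error(`Request failed: ${res.status}`); // 4xx: don't retry
    lastError = new Error(`Server error: ${res.status}`); // 5xx: retry
    await new Promise((r) => setTimeout(r, 2 ** i * 500));
  }
  throw lastError;
}
```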

**Cost of non-idempotent**:

- Charge: customer charged twice
- Email: customer receives N emails
- Order: duplicate order created
- Notification: spammed
- Vote: counted twice

For each: support ticket; refund; manual fix; angry customer.

**The fix is cheap**:

Idempotency keys add ~5 lines of code per endpoint. Cheap insurance.
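
A sketch of that per-endpoint cost, assuming an Express app and the `withIdempotency(key, fn)` helper built in the storage section below (`createOrder` is a stand-in for your handler):

```js
app.post('/orders', async (req, res) => {
  const key = req.header('Idempotency-Key');
  if (!key) return res.status(400).json({ error: 'Idempotency-Key header required' });

  // Runs createOrder at most once per key; replays return the cached result.
  const order = await withIdempotency(key, () => createOrder(req.body));
  res.status(201).json(order);
});
```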

For my system:
- Mutating endpoints inventory
- Current idempotency state
- Recent duplicate-related incidents

Output:
1. The mutating-endpoints inventory
2. The "is this idempotent?" audit
3. The risk priority

The biggest unforced error: assuming network calls happen once. Production networks are lossy; clients retry; queues redeliver. Idempotency is mandatory infrastructure, not optional.

## When You Need Idempotency Keys

Help me identify where idempotency keys belong.

**High-risk (always need keys)**:

- Payment / charge / refund
- Email / notification sends
- Order / record creation
- Webhook receivers
- Queue consumers

**Medium-risk (often need keys)**:

- Profile / settings updates
- Counter increments
- Status changes
- Permission grants

**Low-risk (often don't)**:

- Read-only operations (GET / HEAD)
- Idempotent-by-nature (PUT with full state; SET-style)
- Stateless transformations

**The "natural idempotency" pattern**:

```sql
-- Idempotent (UPSERT)
INSERT INTO users (email, name)
VALUES ('bob@example.com', 'Bob')
ON CONFLICT (email) DO UPDATE SET name = EXCLUDED.name;

-- Run twice; same result
```

vs.

```sql
-- Not idempotent (creates duplicates)
INSERT INTO orders (user_id, amount) VALUES ($1, $2);

-- Run twice; two orders
```

For my system:

  • Mutating-endpoints classification
  • Current idempotency coverage
  • Priority gaps

Output:

  1. The endpoint classification
  2. The idempotency-needed list
  3. The implementation priority

The biggest classification mistake: **assuming "internal API" doesn't need idempotency.** Internal services retry too; background jobs redeliver too. Classify by side-effect, not audience.

## The Idempotency Key Design

Help me design idempotency keys.

The requirements:

1. Unique per intent: same key = same intent
2. Generated by client: the server can't generate it (by the time the server could, a retry is indistinguishable from a new request)
3. Sufficient entropy: UUID v4 is the standard
4. Not the resource ID: chicken-and-egg; the resource doesn't exist yet
5. Bounded TTL: 24h-7d typical

The key sources:

| Source | Key pattern |
| --- | --- |
| HTTP API client | `Idempotency-Key` header (UUID) |
| Webhook receiver | Provider event ID (Stripe `evt_xxx`) |
| Queue consumer | Message ID + content hash |
| Form submission | Hidden field with form-instance UUID |
| Mobile app | Per-action UUID stored locally |

The Stripe convention:

```http
POST /v1/charges
Idempotency-Key: 8e3d1f5a-...

amount=1000&currency=usd&...
```

Stripe stores the response for 24 hours; subsequent calls with the same key return the original response.
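
With the official `stripe` Node library (inside an async function), the key is passed as a request option and Stripe handles storage and replay; a sketch, where `customerId` is illustrative:

```js
const { randomUUID } = require('crypto');
const stripe = require('stripe')(process.env.STRIPE_SECRET_KEY);

// Generate the key once per intent and reuse it across retries.
const idempotencyKey = randomUUID();

const intent = await stripe.paymentIntents.create(
  { amount: 1000, currency: 'usd', customer: customerId },
  { idempotencyKey } // same key on retry => Stripe replays the original response
);
```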

The webhook-receiver convention:

```js
async function handleStripeWebhook(event) {
  // Dedupe on the provider's event ID; it's stable across redeliveries.
  if (await alreadyProcessed(event.id)) return;
  await processEvent(event);
  await markProcessed(event.id);
}
```

The queue-consumer convention:

```js
async function processJob(message) {
  if (await alreadyProcessed(message.id)) {
    await message.ack();
    return;
  }
  await doWork(message);
  await markProcessed(message.id);
  await message.ack();
}
```

The form-submission convention:

```html
<form>
  <input type="hidden" name="idempotency_key" value="${uuid()}">
  ...
</form>
```

For my system:

  • Key sources for each entry point
  • TTL strategy
  • Convention documentation

Output:

  1. The key-source matrix
  2. The TTL policy
  3. The header / field conventions

The biggest key-design mistake: **server-generated keys.** Server creates UUID upon receipt; can't deduplicate retries (each retry gets a new UUID). The fix: client generates; server uses.

## The Storage Layer

Help me pick the storage layer.

Option A: Redis with TTL (most common)

```js
async function withIdempotency(key, fn) {
  // SET key value NX EX 86400: atomic set-if-not-exists with a 24h TTL.
  // (Exact argument passing varies by Redis client; this is the ioredis style.)
  const acquired = await redis.set(`idem:${key}`, 'pending', 'NX', 'EX', 86400);
  if (!acquired) {
    const value = await redis.get(`idem:${key}`);
    if (value === 'pending') throw new Error('In progress');
    return JSON.parse(value);
  }
  try {
    const result = await fn();
    await redis.set(`idem:${key}`, JSON.stringify(result), 'EX', 86400);
    return result;
  } catch (e) {
    await redis.del(`idem:${key}`);  // allow retry
    throw e;
  }
}
```

Best for: most use cases.

Option B: Postgres table with TTL

```sql
CREATE TABLE idempotency_keys (
  key TEXT PRIMARY KEY,
  user_id UUID,
  status TEXT,
  response JSONB,
  created_at TIMESTAMPTZ DEFAULT now(),
  expires_at TIMESTAMPTZ NOT NULL
);
```

```js
const result = await db.query(`
  INSERT INTO idempotency_keys (key, status, expires_at)
  VALUES ($1, 'pending', now() + interval '24 hours')
  ON CONFLICT (key) DO NOTHING
  RETURNING *;
`, [key]);
```

Best for: transactional with the operation; survives Redis outage.
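
Completing the pattern (a sketch): with `ON CONFLICT DO NOTHING ... RETURNING *`, a conflicting insert returns zero rows, so the row count tells you whether this request won the key. `doWork` is a stand-in for the actual side effect:

```js
if (result.rows.length > 0) {
  // This request acquired the key: run the operation and cache its response.
  const response = await doWork();
  await db.query(
    `UPDATE idempotency_keys SET status = 'completed', response = $2 WHERE key = $1`,
    [key, JSON.stringify(response)]
  );
  return response;
}

// Key already exists: replay the cached response, or signal in-flight work.
const existing = await db.query(
  `SELECT status, response FROM idempotency_keys WHERE key = $1`, [key]
);
if (existing.rows[0].status === 'pending') {
  throw new Error('Request with this idempotency key is in progress'); // or HTTP 409
}
return existing.rows[0].response;
```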

Option C: In-memory (single-instance only)

```js
const { LRUCache } = require('lru-cache'); // npm: lru-cache

const seen = new LRUCache({ max: 10000, ttl: 86400000 }); // ttl is in ms (24h)
```

Best for: single-server apps; NOT serverless.

Option D: Provider-managed

  • Stripe: pass idempotencyKey option
  • Inngest / Vercel Queues: built-in dedup

Pros: zero implementation. Cons: provider-specific.

The 90% answer:

  • API endpoints: Redis with TTL
  • Webhook receivers: Postgres table (transactional with side effect)
  • Queue consumers: provider-managed if available; otherwise Postgres
  • Single-instance: in-memory

For my system:

  • Storage choice per entry point
  • TTL settings
  • Cleanup strategy

Output:

  1. The storage matrix
  2. The implementation patterns
  3. The cleanup cron

The biggest storage mistake: **in-memory in serverless.** Function instance restarts; state lost; same key processed twice. The fix: external storage for serverless.

## Concurrent Calls — The Race Condition

Help me handle concurrent calls.

The race:

Two requests with same key arrive within ms. Without proper handling:

  • A: looks up key; not found
  • B: looks up key; not found
  • Both process → DUPLICATE WORK

The atomic acquire pattern:

Use SET-IF-NOT-EXISTS (Redis SETNX or Postgres INSERT ... ON CONFLICT):

```js
const acquired = await redis.set(`idem:${key}`, 'pending', 'NX', 'EX', 86400);
if (!acquired) {
  // Another request in flight or completed
}
```

Atomic: only one request "wins"; others see existing key.

In-flight handling options:

Option 1: Reject immediately

```http
HTTP/1.1 409 Conflict

"Request with same idempotency key in progress"
```

Option 2: Wait with timeout

```js
// Poll for up to 30 seconds (300 x 100 ms), then give up.
for (let i = 0; i < 300; i++) {
  await sleep(100);
  const status = await redis.get(`idem:${key}`);
  if (status === null) throw new Error('Original request failed; retry'); // key cleared on failure
  if (status !== 'pending') return JSON.parse(status);
}
throw new Error('Idempotent request still processing');
```

Stripe's approach: 409 Conflict.

Post-completion return:

After first request completes, response is cached. Concurrent requests that see "completed" status return cached response.
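
Stripe marks replayed responses with an `Idempotent-Replayed: true` header; doing the same helps clients tell a replay from a fresh execution. A sketch, assuming Express and the Redis layout above:

```js
const cached = await redis.get(`idem:${key}`);
if (cached && cached !== 'pending') {
  res.set('Idempotent-Replayed', 'true'); // signal: this is a cached replay
  return res.json(JSON.parse(cached));
}
```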

Failure handling:

Don't cache failures; allow retry:

```js
try {
  const result = await fn();
  await redis.set(`idem:${key}`, JSON.stringify(result), 'EX', 86400);
} catch (e) {
  await redis.del(`idem:${key}`);
  throw e;
}
```

For my system:

  • Concurrent-call handling per endpoint
  • Reject vs wait policy
  • Failure semantics

Output:

  1. The concurrent-call handling
  2. The HTTP status codes
  3. The failure-retry semantics

The biggest concurrency mistake: **non-atomic check-then-act.** Two concurrent requests both pass the check. The fix: atomic acquire (SETNX or INSERT-ON-CONFLICT).

## Different Body, Same Key — The Conflict Case

Help me handle key-with-different-body.

Scenario: same key, different request body. What to do?

Three policies:

Policy A: Strict — reject mismatched body (Stripe default)

```js
const stored = await getIdempotencyRecord(key);
if (stored && hashBody(req.body) !== stored.body_hash) {
  throw new Error('Idempotency key reused with different parameters');
}
```

Pros: catches programming errors; prevents replay attacks. Cons: false positives on irrelevant field changes.

Policy B: Lenient — return original response regardless

Don't check body. Return original.

Pros: simple. Cons: hides bugs; security risk.

Policy C: Compare semantically-relevant fields

Hash only meaningful fields; skip request_id, timestamps.
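
A sketch of Policy C, assuming a per-endpoint allowlist of the fields that define the intent (it reuses the `sha256` and `canonicalize` helpers shown below):

```js
// Only these fields define the request's intent; request_id, client
// timestamps, and other volatile fields are deliberately excluded.
const INTENT_FIELDS = ['amount', 'currency', 'customer_id']; // illustrative allowlist

function intentHash(body) {
  const relevant = Object.fromEntries(
    INTENT_FIELDS.filter((f) => f in body).map((f) => [f, body[f]])
  );
  return sha256(JSON.stringify(canonicalize(relevant)));
}
```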

The 90% answer: Policy A (strict). Stripe does this.

Implementation:

```js
const { createHash } = require('crypto');
const sha256 = (s) => createHash('sha256').update(s).digest('hex');

async function withIdempotency(key, body, fn) {
  const bodyHash = sha256(JSON.stringify(canonicalize(body)));
  const stored = await getIdempotencyRecord(key);

  if (stored) {
    if (stored.body_hash !== bodyHash) {
      // HTTPError: stand-in for your framework's error type
      throw new HTTPError(422, 'idempotency_error',
        'Same idempotency key, different request');
    }
    return stored.response;
  }

  const result = await fn();
  await saveIdempotencyRecord(key, bodyHash, result);
  return result;
}
```

Canonicalize step:

```js
function canonicalize(obj) {
  if (Array.isArray(obj)) return obj.map(canonicalize);
  if (typeof obj === 'object' && obj !== null) {
    return Object.keys(obj).sort().reduce((acc, key) => {
      acc[key] = canonicalize(obj[key]);
      return acc;
    }, {});
  }
  return obj;
}
```

For my system:

  • Policy choice
  • Hash strategy
  • Canonicalization

Output:

  1. The mismatched-body policy
  2. The hash + canonicalize implementation
  3. The error response

The biggest mismatched-body mistake: **silently returning original response.** Different intent silently dropped. The fix: strict policy by default; surface the error.

## Cleanup and TTL

Help me design cleanup.

TTL per use case:

| Use case | TTL |
| --- | --- |
| API endpoint (general) | 24 hours |
| Webhook receiver | 7 days |
| Queue consumer | Until ack'd |
| Long workflow | 7-30 days |
| Payment / charge | 24 hours (Stripe) |

Cleanup:

For Redis: TTL automatic.

For Postgres: a cron job deletes expired rows (schedule it per [Cron Jobs & Scheduled Tasks](cron-scheduled-tasks-chat.md)):

```sql
-- Run hourly
DELETE FROM idempotency_keys WHERE expires_at < now();
```
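
If your database has the pg_cron extension, the schedule can live in Postgres itself (a sketch; the job name is illustrative):

```sql
-- Requires the pg_cron extension; runs at the top of every hour.
SELECT cron.schedule(
  'cleanup-idempotency-keys',
  '0 * * * *',
  $$DELETE FROM idempotency_keys WHERE expires_at < now()$$
);
```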

Monitoring:

  • Idempotency-key count
  • Hit rate (% of requests hitting existing key)
  • 422 mismatched-body rate

If the 422 mismatched-body rate exceeds ~1%, clients have bugs. A hit rate above ~5% means lots of legitimate retries; the system is absorbing them as designed.
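
A sketch of the counters behind those numbers, assuming the prom-client library (metric names are illustrative):

```js
const client = require('prom-client');

// hit rate = idempotency_hits_total / idempotency_requests_total
const idempotencyRequests = new client.Counter({
  name: 'idempotency_requests_total',
  help: 'Requests carrying an idempotency key',
});
const idempotencyHits = new client.Counter({
  name: 'idempotency_hits_total',
  help: 'Requests that matched an existing key (replays)',
});
const idempotencyMismatches = new client.Counter({
  name: 'idempotency_body_mismatch_total',
  help: 'Same key reused with a different body (422s)',
});
```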

For my system:

  • TTL per endpoint
  • Cleanup strategy
  • Monitoring

Output:

  1. The TTL policy
  2. The cleanup cron
  3. The monitoring dashboard

The biggest cleanup mistake: **no cleanup; table grows forever.** The fix: hourly cleanup cron.

## Test for Idempotency

Help me test idempotency.

Tests:

  1. Single call works
  2. Duplicate call returns same response
  3. Different key creates new resource
  4. Same key, different body returns 422
  5. Concurrent same-key requests produce one
  6. Failed request can be retried
  7. Expired key allows new request
  8. Webhook receiver dedupes by event ID

```js
test('duplicate idempotency key returns same response', async () => {
  const key = uuid();
  const r1 = await api.post('/charges', { amount: 100 }, { idempotencyKey: key });
  const r2 = await api.post('/charges', { amount: 100 }, { idempotencyKey: key });
  expect(r2.body.id).toBe(r1.body.id);
  expect(await db.charges.count()).toBe(1);
});

test('concurrent same-key calls produce one charge', async () => {
  const key = uuid();
  const promises = Array(10).fill(null).map(() =>
    api.post('/charges', { amount: 100 }, { idempotencyKey: key })
  );
  await Promise.all(promises);
  expect(await db.charges.count()).toBe(1);
});
```
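
Edge case 4 from the list above, as a test (a sketch; assumes the endpoint returns 422 on a body mismatch, per the policy section):

```js
test('same key with different body returns 422', async () => {
  const key = uuid();
  await api.post('/charges', { amount: 100 }, { idempotencyKey: key });
  // Same key, different amount: must be rejected, not silently replayed.
  const r2 = await api.post('/charges', { amount: 200 }, { idempotencyKey: key });
  expect(r2.status).toBe(422);
  expect(await db.charges.count()).toBe(1);
});
```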

For my codebase:

  • Test coverage gaps
  • CI integration

Output:

  1. The test plan
  2. The CI integration

The biggest testing mistake: **only happy-path tested.** Concurrent / mismatched / expired untested. The fix: explicit edge-case tests.

## Avoid Common Pitfalls

The idempotency mistake checklist.

Mistake 1: No keys at all

  • Duplicates everywhere
  • Fix: keys on all mutating endpoints

Mistake 2: Server-generated keys

  • Can't deduplicate retries
  • Fix: client-generated

Mistake 3: In-memory storage in serverless

  • State lost on instance restart
  • Fix: Redis or Postgres

Mistake 4: Non-atomic acquire

  • Race condition
  • Fix: SETNX / INSERT-ON-CONFLICT

Mistake 5: Caching failures

  • Failed response cached; retries return failure
  • Fix: don't cache failures

Mistake 6: No TTL

  • Storage grows forever
  • Fix: 24h-7d TTL

Mistake 7: Same key, different body silently dropped

  • Hides bugs
  • Fix: 422 error on mismatch

Mistake 8: Webhook receiver without dedup

  • Process same event twice
  • Fix: dedupe by event ID

Mistake 9: Queue consumer without dedup

  • Duplicate jobs on at-least-once
  • Fix: dedupe by message ID

Mistake 10: No cleanup cron

  • Table grows
  • Fix: hourly cleanup

The quality checklist:

  • All mutating endpoints accept idempotency key
  • Client-generated UUIDs
  • Atomic acquire
  • Mismatched-body returns 422
  • Failures aren't cached
  • Per-endpoint TTL
  • Cleanup cron / TTL
  • Webhook receivers dedupe by event ID
  • Queue consumers dedupe by message ID
  • Edge-case test coverage

For my system:

  • Audit
  • Top 3 fixes

Output:

  1. Audit results
  2. Top 3 fixes
  3. The "v2 idempotency" plan

The single most-common mistake: **assuming clients won't retry.** Production traffic always retries. Idempotency is table-stakes infrastructure.

---

## What "Done" Looks Like

A working idempotency system in 2026 has:

- All mutating endpoints accept `Idempotency-Key` header
- Client generates UUIDs; server uses them
- Atomic acquire (Redis SETNX or Postgres ON CONFLICT)
- Same body required for same key (422 on mismatch)
- Failures aren't cached; retries allowed
- Per-use-case TTL (24h API; 7d webhooks)
- Cleanup cron for Postgres tables
- Webhook receivers dedupe by event ID
- Queue consumers dedupe by message ID
- Edge-case test coverage

The hidden cost of weak idempotency: **incidents at the worst times.** A customer charged twice on Black Friday (5x normal traffic means 5x retries); duplicate orders during a marketing campaign; a webhook storm processing every event three times. These are exactly the moments when missing idempotency becomes catastrophic. The insurance is cheap: a few lines per endpoint saves a Sev-1 every quarter.

## See Also

- [Outbound Webhooks](outbound-webhooks-chat.md) — webhook receivers need dedup
- [Inbound Webhooks](inbound-webhooks-chat.md) — receiving with dedup
- [Cron Jobs & Scheduled Tasks](cron-scheduled-tasks-chat.md) — cron idempotency
- [Public API](public-api-chat.md) — API idempotency keys
- [API Versioning](api-versioning-chat.md) — versioned API
- [Caching Strategies](caching-strategies-chat.md) — adjacent
- [Database Connection Pooling](database-connection-pooling-chat.md) — adjacent
- [Audit Logs](audit-logs-chat.md) — adjacent
- [Rate Limiting & Abuse](rate-limiting-abuse-chat.md) — adjacent
- [Service Level Agreements](service-level-agreements-chat.md) — uptime depends
- [Refunds & Chargebacks](refunds-chargebacks-chat.md) — payment idempotency
- [Dunning & Failed Payments](dunning-failed-payments-chat.md) — retry idempotency
- [VibeReference: Stripe](https://www.vibereference.com/auth-and-payments/stripe) — Stripe idempotency
- [VibeReference: Background Jobs Providers](https://www.vibereference.com/backend-and-data/background-jobs-providers) — queue dedup
- [VibeReference: Database Providers](https://www.vibereference.com/backend-and-data/database-providers) — storage choice

[⬅️ Day 6: Grow Overview](README.md)