WebSocket & Server-Sent Events (SSE) Implementation: Real-Time Connections That Don't Wake You Up at 3am
If you're building a SaaS in 2026 with any real-time feature — live notifications, presence indicators, status updates, streaming AI responses, dashboards that update without refresh — you need persistent connections. Most founders default to "we'll just use WebSocket for everything," then six months in discover that WebSockets break behind corporate proxies, your serverless host charges per-second-connected, scaling beyond 10K concurrent connections requires re-architecture, and reconnection logic is its own engineering project.
A working real-time-connections strategy answers: WebSocket vs SSE vs long-polling, how do we authenticate connections, how do we reconnect on failure, how do we scale beyond a single server, and how do we know when something's broken. Done well, real-time features feel magical and stay invisible. Done badly, you're debugging "why didn't the customer get the update?" tickets every week.
This guide is the implementation playbook for real-time connections — picking the right pattern, authentication, reconnection, scaling, observability, and cost discipline. Distinct from Real-Time Collaboration (CRDT-based multiplayer specifically).
## Pick the Right Pattern: WebSocket vs SSE vs Polling
Each pattern has different tradeoffs.
Help me pick a real-time pattern.
The four options:
**1. WebSocket (bidirectional, persistent)**
Client ←→ Server (continuous)
- Bidirectional (client can send AND receive)
- Persistent connection
- Lower overhead per message (after connection)
- Requires upgrade from HTTP
Pros:
- Bidirectional (chat, collaboration)
- Lowest per-message latency
- Industry standard for real-time
Cons:
- Connection state on server (memory)
- Doesn't work behind some proxies
- More complex to scale
- Authentication trickier
Use for:
- Chat / messaging
- Collaboration / multiplayer
- Bidirectional control (terminals, games)
- Anything truly real-time + interactive
**2. Server-Sent Events / SSE (server-to-client only)**
Client ← Server (one-way stream)
- One-way (server pushes to client)
- HTTP-based (works through proxies)
- Native browser API (`EventSource`)
- Auto-reconnects
Pros:
- Simpler than WebSocket
- HTTP-friendly (proxies, CDN)
- Auto-reconnect built in
- Easy authentication (cookies / headers)
Cons:
- One-way only (server → client)
- 6-connection-per-domain browser limit (HTTP/1.1; effectively lifted over HTTP/2)
- Less common; some tooling weaker
Use for:
- Notifications (server pushes)
- Live updates / dashboards
- AI streaming responses
- Any server-pushed updates without need to receive from client
**3. Long-polling (HTTP request that holds open)**
Client → Server (request); the server holds the request open for N seconds; returns when data is available or on timeout; the client immediately sends a new request. (A minimal client loop is sketched after this pattern's lists.)
Pros:
- Pure HTTP (works everywhere)
- Simple
- Compatible with all proxies
Cons:
- Higher latency than WS / SSE
- More overhead per message
- Server must handle long-held connections
Use for:
- Fallback when WS / SSE not supported
- Very low message rate
- Legacy compatibility
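A minimal client loop for that flow, sketched under assumptions (the `/api/poll` endpoint, `since` cursor, and response shape are all illustrative):

```typescript
// Long-polling loop: the server holds each request open until data
// arrives or a timeout, then the client immediately re-polls.
async function longPoll(onMessage: (msg: unknown) => void) {
  let since = '0';
  while (true) {
    try {
      const res = await fetch(`/api/poll?since=${since}`);
      if (res.status === 200) {
        const { messages, lastId } = await res.json();
        messages.forEach(onMessage);
        since = lastId;
      }
      // A 204 (held open, timed out with no data) just re-polls immediately
    } catch {
      await new Promise((r) => setTimeout(r, 2000)); // brief backoff on network error
    }
  }
}
```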
**4. Short polling (request every N seconds)**
Client → Server (every 5 seconds); see the sketch after this list.
Pros:
- Trivially simple
- Stateless
- Cacheable
Cons:
- Latency = polling interval
- Wasteful at scale
- Battery drain on mobile
Use for:
- Status polls (every 30-60 seconds)
- When sub-second latency not needed
- Trivial implementation worth it
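And the short-polling version (endpoint illustrative):

```typescript
// Poll a job status every 30s; trivially simple and stateless.
const timer = setInterval(async () => {
  const res = await fetch('/api/jobs/123/status');
  const { status } = await res.json();
  if (status === 'done') clearInterval(timer);
}, 30_000);
```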
**The pattern-by-use-case matrix**:
| Use case | Best pattern |
|---|---|
| Chat / messaging | WebSocket |
| Collaboration (CRDT) | WebSocket (per [real-time-collaboration](real-time-collaboration-chat.md)) |
| Live notifications | SSE |
| AI streaming response | SSE |
| Dashboard updates | SSE |
| Presence indicators | WebSocket (bidirectional heartbeat) |
| Game / terminal | WebSocket |
| Job status check | Polling |
| Webhook delivery | Long-polling fallback |
**The 90% answer for indie SaaS**:
- SSE for server-to-client (notifications, AI streaming)
- WebSocket for bidirectional (chat, collaboration)
- Polling for low-frequency status
Skip long-polling unless legacy compatibility forces it.
For my product:
- Real-time use cases inventory
- Pattern per use case
- Current implementation
Output:
1. The use-case inventory
2. The pattern choice per use case
3. The migration plan if needed
The biggest unforced error: **WebSocket for everything.** "Real-time = WebSocket" — but most "real-time" needs are server-to-client (notifications); SSE is simpler, more reliable, and easier to scale. The fix: default to SSE; use WebSocket only for truly bidirectional needs.
## SSE Implementation (the underused choice)
For most "push from server" use cases, SSE is right.
Help me implement SSE.
The basic pattern:
**Server (Node.js / Hono / Express)**:
```typescript
app.get('/api/events', authenticate, async (req, res) => {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    'Connection': 'keep-alive',
    'X-Accel-Buffering': 'no', // disable nginx buffering
  });

  const userId = req.user.id;
  const channel = `events:user:${userId}`;

  // Subscribe to Redis pub/sub
  const subscriber = redis.duplicate();
  await subscriber.subscribe(channel);
  subscriber.on('message', (_channel, message) => {
    res.write(`data: ${message}\n\n`);
  });

  // Heartbeat every 30s
  const heartbeat = setInterval(() => {
    res.write(': heartbeat\n\n');
  }, 30000);

  // Cleanup on disconnect
  req.on('close', () => {
    clearInterval(heartbeat);
    subscriber.unsubscribe();
    subscriber.quit();
  });
});
```

**Client (browser)**:

```typescript
const eventSource = new EventSource('/api/events');

eventSource.onmessage = (event) => {
  const data = JSON.parse(event.data);
  // Handle update
};

eventSource.onerror = () => {
  // EventSource auto-reconnects; just log
  console.log('SSE error; will reconnect');
};
```
The "X-Accel-Buffering: no" trick:
Nginx and other proxies buffer responses by default. SSE needs streaming.
Set X-Accel-Buffering: no to disable buffering. Without this, messages arrive in batches.
The "heartbeat" pattern:
Send a periodic comment line (`: heartbeat\n\n`) every 30s:
- Keeps connection alive through proxies
- Detects broken connections (writes fail)
- Detects client disconnect (close event)
Vercel-specific limitations:
Vercel Functions have execution time limits (300s default in 2026). Long-running SSE connections will be terminated.
Solutions:
- Reconnect from client (EventSource does this automatically)
- Use Vercel's extended runtime modes
- Or: deploy to traditional server for true long-lived connections
- Or: use platform with dedicated WebSocket support (Pusher, Ably)
The "event names" pattern:
SSE supports event types:
```
event: notification
data: {"id": "1", "text": "Hello"}

event: presence
data: {"user_id": "123", "status": "online"}
```
Client:
```typescript
eventSource.addEventListener('notification', (e) => { /* ... */ });
eventSource.addEventListener('presence', (e) => { /* ... */ });
```
Cleaner than putting type in payload.
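Server side, a named event is just an `event:` line written before `data:`. A minimal helper, sketched against the Express handler above (the `sendEvent` name and loose `res` typing are illustrative):

```typescript
// Write a named SSE event: `event:` line, then `data:`, then a blank line.
// `res` is the still-open streaming response from the /api/events handler.
function sendEvent(res: { write(chunk: string): void }, event: string, data: unknown) {
  res.write(`event: ${event}\ndata: ${JSON.stringify(data)}\n\n`);
}

sendEvent(res, 'notification', { id: '1', text: 'Hello' });
sendEvent(res, 'presence', { user_id: '123', status: 'online' });
```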
Authentication:
EventSource sends cookies automatically. Use cookie-based auth.
For token-based: pass via query string (less secure) OR use proxy that adds auth header:
```typescript
new EventSource('/api/events?token=' + jwt);
```
For my SSE:
- Endpoints to implement
- Backend pub/sub layer
- Reconnection strategy
Output:
- The SSE implementation
- The authentication approach
- The reconnection plan
The biggest SSE mistake: **forgetting to disable proxy buffering.** Messages arrive in 10-message batches every 30 seconds; "real-time" feels broken. The fix: `X-Accel-Buffering: no` header.
## WebSocket Implementation
For bidirectional use cases, WebSocket is right.
Help me implement WebSocket.
The basic pattern:
Server (using the `ws` library on Node.js):

```typescript
import { WebSocketServer } from 'ws';

const wss = new WebSocketServer({ port: 8080 });

wss.on('connection', async (ws, req) => {
  // Authenticate from the upgrade request's query string
  const token = new URL(req.url, 'http://localhost').searchParams.get('token');
  const user = await verifyToken(token);
  if (!user) {
    ws.close(1008, 'Unauthorized');
    return;
  }
  ws.userId = user.id;

  // Subscribe to user-specific channel
  const subscriber = redis.duplicate();
  await subscriber.subscribe(`events:user:${user.id}`);
  subscriber.on('message', (_channel, message) => {
    ws.send(message);
  });

  ws.on('message', async (data) => {
    const msg = JSON.parse(data);
    // Handle client message (e.g., chat, action)
    await processMessage(user.id, msg);
  });

  // Heartbeat (ping/pong)
  ws.isAlive = true;
  ws.on('pong', () => { ws.isAlive = true; });

  ws.on('close', () => {
    subscriber.unsubscribe();
    subscriber.quit();
  });
});

// Detect dead connections: terminate anything that missed the last ping
const interval = setInterval(() => {
  wss.clients.forEach((ws) => {
    if (!ws.isAlive) return ws.terminate();
    ws.isAlive = false;
    ws.ping();
  });
}, 30000);

wss.on('close', () => clearInterval(interval));
```
Client:

```typescript
class ReconnectingWebSocket {
  private ws!: WebSocket;
  private reconnectAttempts = 0;

  constructor(private url: string) {
    this.connect();
  }

  private connect() {
    this.ws = new WebSocket(this.url);

    this.ws.onopen = () => {
      this.reconnectAttempts = 0;
    };

    this.ws.onmessage = (event) => {
      const data = JSON.parse(event.data);
      this.handleMessage(data);
    };

    this.ws.onclose = () => {
      // Exponential backoff, capped at 30s
      const delay = Math.min(1000 * Math.pow(2, this.reconnectAttempts), 30000);
      setTimeout(() => {
        this.reconnectAttempts++;
        this.connect();
      }, delay);
    };
  }

  private handleMessage(data: unknown) {
    // Application-specific message handling
  }
}

const ws = new ReconnectingWebSocket(`wss://api.example.com/ws?token=${jwt}`);
```
The "ws library" choice:
Popular options:
ws: most-popular Node.js WebSocket librarysocket.io: higher-level; fallbacks; multi-roomuWebSockets.js: high-performance C++-bound
For most: ws (simple) or socket.io (feature-rich).
Vercel limitations:
Vercel Functions don't support persistent WebSocket connections (functions time out at 300s).
Use one of:
- Vercel + external WebSocket provider (Pusher, Ably, PartyKit)
- Self-hosted WebSocket server (Soketi, Centrifugo)
- Traditional server for WebSocket (separate from Vercel app)
The room / channel pattern:
Group connections by room:
```typescript
const rooms = new Map<string, Set<WebSocket>>();

ws.on('message', (data) => {
  const { type, room } = JSON.parse(data);
  if (type === 'join') {
    if (!rooms.has(room)) rooms.set(room, new Set());
    rooms.get(room)!.add(ws);
    ws.room = room;
  }
});

// Remove the socket from its room on 'close',
// or the Set leaks dead connections.

function broadcast(room: string, message: any) {
  rooms.get(room)?.forEach(ws => {
    ws.send(JSON.stringify(message));
  });
}
```
Or: use Redis pub/sub for distributed broadcasting.
The reconnection state:
After reconnect:
- Re-authenticate
- Re-subscribe to channels
- Sync missed messages (request "since last seen ID")
Without state-aware reconnection: client misses messages during disconnect.
For my WebSocket:
- Server-side library
- Auth approach
- Reconnection state-restore
Output:
- The WebSocket implementation
- The Vercel-friendly architecture
- The reconnection strategy
The biggest WebSocket mistake: **deploying WebSocket on serverless.** Vercel function timeout terminates connections; client constantly reconnects; awful UX. The fix: external WebSocket provider OR traditional server for the WebSocket layer.
## Authentication for Persistent Connections
Auth is trickier for WS / SSE than for HTTP.
Help me handle authentication.
The challenges:
- WS upgrade is HTTP; can use headers
- After upgrade: no headers per message
- SSE: native EventSource doesn't support custom headers
- Token expiration during long-lived connection
SSE auth options:
Option 1: Cookie-based (best for browsers)
```typescript
// Cookie set on auth
res.cookie('session', sessionToken, { httpOnly: true, secure: true });

// EventSource sends the cookie automatically
new EventSource('/api/events');
```
Option 2: Query-string token
```typescript
new EventSource(`/api/events?token=${jwt}`);

// Server
const token = req.query.token;
const user = verifyJWT(token);
```
Less secure (logged in URL); but only option for non-browser clients.
Option 3: Polyfill EventSource with headers
Libraries like eventsource-polyfill add header support.
WebSocket auth options:
Option 1: Subprotocol-based
```typescript
// Client: smuggle the token in as a subprotocol
const ws = new WebSocket('wss://api.example.com', ['v1', `token-${jwt}`]);

// Server: read it back from the upgrade request
const token = req.headers['sec-websocket-protocol']
  .split(',')
  .find(p => p.trim().startsWith('token-'));
```
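Note that the handshake only succeeds if the server selects and echoes back one of the offered subprotocols. With the `ws` library that's the `handleProtocols` option (in ws 8.x it receives a `Set` of offered protocols); a sketch:

```typescript
import { WebSocketServer } from 'ws';

// Echo back the version subprotocol so the browser accepts the handshake;
// the token-* entry is consumed server-side for auth only.
const wss = new WebSocketServer({
  port: 8080,
  handleProtocols: (protocols) => (protocols.has('v1') ? 'v1' : false),
});
```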
Option 2: Query-string token
```typescript
const ws = new WebSocket(`wss://api.example.com?token=${jwt}`);
```
Option 3: First-message auth
```typescript
// On connection: client sends auth message first
ws.onopen = () => {
  ws.send(JSON.stringify({ type: 'auth', token: jwt }));
};

// Server: refuse to process anything until authenticated
// (consider also closing connections that never auth within a few seconds)
ws.on('message', async (data) => {
  if (!ws.authenticated) {
    const { type, token } = JSON.parse(data);
    if (type !== 'auth') {
      ws.close(1008, 'Auth required');
      return;
    }
    const user = await verifyToken(token);
    if (!user) {
      ws.close(1008, 'Bad token');
      return;
    }
    ws.userId = user.id;
    ws.authenticated = true;
    return;
  }
  // Handle other messages
});
```
Token expiration:
JWTs expire (often after an hour), and a long-lived connection outlives its token.
Options:
- Server checks expiration on each message; closes if expired
- Client refreshes token; sends new token via message
- Server enforces max connection duration (force reconnect)
Best: token refresh + reconnect cycle.
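A sketch of the refresh half of that cycle (the `auth:refresh` message type, refresh endpoint, and response shape are illustrative, not a standard protocol):

```typescript
// Client: refresh the JWT shortly before expiry and push the new token
// over the open socket so the server can re-validate without a reconnect.
function scheduleTokenRefresh(ws: WebSocket, expiresInMs: number) {
  setTimeout(async () => {
    const res = await fetch('/api/auth/refresh', { method: 'POST' });
    const { token, expiresIn } = await res.json(); // expiresIn assumed in seconds
    ws.send(JSON.stringify({ type: 'auth:refresh', token }));
    scheduleTokenRefresh(ws, expiresIn * 1000);
  }, Math.max(expiresInMs - 60_000, 0)); // refresh ~1 minute before expiry
}
```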
Tenant isolation:
For multi-tenant SaaS (per multi-tenancy-chat):
```typescript
// Subscribe to tenant-scoped channel
const channel = `events:tenant:${user.tenantId}:user:${user.id}`;
```
NEVER:
- Allow client to specify channel name (could subscribe to other tenant)
- Skip tenant isolation in real-time layer
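A sketch of enforcing this server-side: derive allowed channels from the verified session and never trust a client-supplied channel name (helper names are illustrative):

```typescript
// Channels are computed from the authenticated user, never taken raw
// from the client. Channel naming mirrors the examples above.
function channelsFor(user: { id: string; tenantId: string }): string[] {
  return [
    `events:tenant:${user.tenantId}:user:${user.id}`,
    `events:tenant:${user.tenantId}:broadcast`,
  ];
}

function canSubscribe(user: { id: string; tenantId: string }, requested: string): boolean {
  // Reject any channel outside the user's own tenant scope
  return channelsFor(user).includes(requested);
}
```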
For my auth:
- Token strategy
- Refresh approach
- Tenant isolation
Output:
- The auth flow
- The token refresh
- The tenant-isolation enforcement
The biggest auth mistake: **token in URL logged in proxy logs.** JWT in `?token=...` ends up in nginx access logs forever. The fix: cookie-based for browser; subprotocol or first-message for clients that need it.
## Scaling Real-Time Connections
Around 10K concurrent connections is where a single server breaks.
Help me scale real-time.
The single-server limit:
A Node.js process can handle:
- ~10K concurrent connections (memory + CPU)
- ~50K with optimizations
- 100K+ requires specialized stack (Go, Erlang, Rust)
Beyond single-server: distributed.
Horizontal scaling pattern:
```
Client ─┬─ Server 1 ─┐
        ├─ Server 2 ─┼─ Redis (pub/sub)
        └─ Server 3 ─┘
```
- Multiple servers
- Redis (or NATS / Kafka) for pub/sub
- Client connects to any server
- Server publishes; all servers receive; deliver to their connected clients
Implementation:
```typescript
// Server-side: wildcard subscriptions require PSUBSCRIBE (plain SUBSCRIBE
// does not expand patterns); pattern messages arrive as 'pmessage'
const subscriber = redis.duplicate();
await subscriber.psubscribe('events:*');
subscriber.on('pmessage', (_pattern, channel, message) => {
  // Find connected clients for this channel
  const clients = getLocalClientsForChannel(channel);
  clients.forEach(ws => ws.send(message));
});

// Publishing
async function notifyUser(userId, message) {
  await redis.publish(`events:user:${userId}`, JSON.stringify(message));
  // All servers receive; only the server holding this user's connection delivers
}
```
Sticky sessions:
If you're using a load balancer, sticky sessions help (the client always reconnects to the same server). Less critical with Redis pub/sub, but they improve performance.
Connection caps per server:
Set max connections per instance:
- Node.js: tune Node memory; ~5-10K typical
- Beyond: scale out
Hosted alternatives (skip the scaling problem):
- Pusher — managed real-time channels
- Ably — modern alternative
- PartyKit (Cloudflare) — edge real-time
- Soketi — self-hosted Pusher-compatible
- Centrifugo — modern self-hosted
Pros: zero scaling work.
Cons: per-message or per-connection cost.
For most indie SaaS: hosted is right until volume justifies self-host.
The "channels" architecture:
Organize messages by channel:
```
events:user:{user_id}       → personal notifications
events:tenant:{tenant_id}   → tenant-wide events
events:room:{room_id}       → chat rooms
events:dashboard:{dash_id}  → dashboard updates
```
Each connection subscribes to relevant channels.
Avoid the "broadcast everything" anti-pattern:
Don't broadcast every event to every connection; filter server-side based on each connection's subscriptions.
For my system:
- Current connection count
- Scaling needs
- Hosted vs self-host
Output:
- The architecture
- The scaling plan
- The hosted alternative
The biggest scaling mistake: **vertical scaling forever.** "Just bigger server" works to ~10K connections; breaks beyond. The fix: horizontal scaling + Redis pub/sub OR hosted real-time provider. Plan from day one.
## Reconnection: The 80% of Real-Time Code
Connections drop. Plan for it.
Help me handle reconnection.
The patterns:
Auto-reconnect with backoff:
```typescript
class ReconnectingClient {
  private ws!: WebSocket;
  private delays = [1000, 2000, 5000, 10000, 30000];
  private attempt = 0;

  constructor(private url: string) {
    this.connect();
  }

  connect() {
    this.ws = new WebSocket(this.url);
    this.ws.onopen = () => { this.attempt = 0; };
    this.ws.onclose = () => {
      // Walk up the delay ladder, capping at the last entry
      const delay = this.delays[Math.min(this.attempt, this.delays.length - 1)];
      setTimeout(() => {
        this.attempt++;
        this.connect();
      }, delay);
    };
  }
}
```
Exponential backoff prevents thundering herd.
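Adding random jitter helps further: if a server restart drops thousands of clients at once, pure exponential backoff reconnects them in synchronized waves. A sketch of full-jitter delays (function name illustrative):

```typescript
// Full-jitter backoff: random delay up to the exponential cap, so mass
// disconnects don't produce a synchronized reconnect storm.
function reconnectDelay(attempt: number, baseMs = 1000, capMs = 30_000): number {
  const exp = Math.min(capMs, baseMs * 2 ** attempt);
  return Math.random() * exp;
}
```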
Resubscribe after reconnect:
```typescript
this.ws.onopen = () => {
  this.attempt = 0;
  // Restore subscriptions
  for (const channel of this.subscribedChannels) {
    this.ws.send(JSON.stringify({ type: 'subscribe', channel }));
  }
};
```
Catch up missed messages:
```typescript
this.ws.onopen = () => {
  // Send last-seen-message-id
  this.ws.send(JSON.stringify({
    type: 'sync',
    last_id: localStorage.getItem('last_message_id'),
  }));
};

this.ws.onmessage = (event) => {
  const msg = JSON.parse(event.data);
  localStorage.setItem('last_message_id', msg.id);
  // Process
};
```
Server tracks recent messages (Redis stream or DB); replays from last-id on reconnect.
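A sketch of the server half using a Redis Stream per channel (ioredis `xadd`/`xrange`; key naming and message shape are assumptions, and the exclusive `(`-prefixed range needs Redis 6.2+):

```typescript
// Persist each published message to a stream so reconnecting clients
// can replay what they missed during the disconnect.
async function publishWithHistory(channel: string, payload: object) {
  // XADD returns the auto-generated stream ID; reuse it as the message id
  const id = await redis.xadd(`stream:${channel}`, '*', 'data', JSON.stringify(payload));
  await redis.publish(channel, JSON.stringify({ id, ...payload }));
}

// On a 'sync' message, replay everything after the client's last-seen ID
async function replaySince(ws: WebSocket, channel: string, lastId: string) {
  const entries = await redis.xrange(`stream:${channel}`, `(${lastId}`, '+');
  for (const [id, fields] of entries) {
    ws.send(JSON.stringify({ id, ...JSON.parse(fields[1]) })); // fields = ['data', json]
  }
}
```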
The "online / offline" UX:
Show users connection status:
```tsx
const [connected, setConnected] = useState(false);

useEffect(() => {
  ws.onopen = () => setConnected(true);
  ws.onclose = () => setConnected(false);
}, []);

return (
  <div>
    {!connected && <Banner>Reconnecting...</Banner>}
    {/* ... */}
  </div>
);
```
Better than silent failure.
Network change handling:
When user switches networks (WiFi → cellular):
- The browser fires `online` / `offline` events
- Force a reconnect on `online`

```typescript
// Inside the reconnecting client: reconnect as soon as the network returns
window.addEventListener('online', () => {
  if (this.ws.readyState !== WebSocket.OPEN) {
    this.connect();
  }
});
```
Heartbeat to detect zombie connections:
```typescript
// lastMessageReceived is updated on every incoming message or heartbeat
setInterval(() => {
  if (lastMessageReceived < Date.now() - 60000) {
    this.ws.close();
    this.connect();
  }
}, 30000);
```

Without it, zombie connections stay "open" silently for hours.
For my client:
- Reconnection strategy
- Resubscribe + catch-up
- Connection-status UX
Output:
- The reconnection client
- The catch-up strategy
- The UX patterns
The biggest reconnection mistake: **no reconnection logic at all.** Connection drops; never reconnects; user thinks app is broken; refreshes. The fix: auto-reconnect with backoff + resubscribe + missed-message catchup.
## Observability for Real-Time
Real-time bugs are harder to debug. Monitor.
Help me observe real-time connections.
The metrics:
Connection metrics:
- Active connection count
- Connections per second (rate)
- Average connection duration
- Disconnect rate
- Reconnect rate per client
Message metrics:
- Messages sent per second (server → client)
- Messages received per second (client → server)
- Average message latency
- Message size distribution
Error metrics:
- Connection errors / failures
- Authentication failures
- Message parsing errors
- Send failures
Per-channel metrics:
- Subscribers per channel
- Messages per channel
Tools:
- Datadog real-time monitoring
- Pusher / Ably built-in dashboards
- PostHog event tracking
- Custom dashboard via Redis stats (sketched below)
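A minimal sketch of custom connection metrics using prom-client (metric names are illustrative; assumes the `wss` server from earlier):

```typescript
import client from 'prom-client';

// Gauge for currently-open connections; counter for message throughput
const activeConnections = new client.Gauge({
  name: 'ws_active_connections',
  help: 'Currently open WebSocket connections',
});
const messages = new client.Counter({
  name: 'ws_messages_total',
  help: 'Messages handled, labeled by direction',
  labelNames: ['direction'],
});

wss.on('connection', (ws) => {
  activeConnections.inc();
  ws.on('message', () => messages.inc({ direction: 'in' }));
  ws.on('close', () => activeConnections.dec());
});
```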
The "ping-pong heartbeat" log:
Log heartbeat success/failure (see the sketch after this list) to detect:
- Network issues
- Server overload
- Client browser tab backgrounded (browsers throttle)
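A sketch of measuring ping → pong round-trip time per connection (a timing-aware variant of the heartbeat loop shown earlier; logger fields illustrative):

```typescript
// Record when each ping went out; slow or missing pongs suggest server
// overload, network trouble, or a throttled background tab.
const pingSentAt = new WeakMap<WebSocket, number>();

setInterval(() => {
  wss.clients.forEach((ws) => {
    pingSentAt.set(ws, Date.now());
    ws.ping();
  });
}, 30_000);

wss.on('connection', (ws) => {
  ws.on('pong', () => {
    const sent = pingSentAt.get(ws);
    if (sent) console.log('ws.pong', { rtt_ms: Date.now() - sent });
  });
});
```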
The "disconnect reason" tracking:
```typescript
ws.on('close', (code, reason) => {
  log.info('ws.close', { user_id, code, reason: reason.toString() });
});
```
Standard codes:
- 1000: normal close
- 1001: going away (browser closed)
- 1006: abnormal close (network)
- 1008: policy violation (auth fail)
- 1011: server error
Per-connection-cost tracking:
If using hosted (Pusher / Ably):
- Per-message cost
- Per-channel cost
- Concurrent connection cost
Track to avoid surprise bills.
For my observability:
- Metrics tracked
- Tools used
- Alert thresholds
Output:
- The metrics dashboard
- The alerting
- The cost tracking
The biggest observability mistake: **no real-time-specific metrics.** Connection issues stay invisible; you end up debugging "why didn't the customer get the update?" with no data. The fix: connection / message / error metrics; per-channel breakdown.
## Avoid Common Pitfalls
Recognizable failure patterns.
The real-time mistake checklist.
Mistake 1: WebSocket for everything
- SSE simpler for server-push
- Fix: SSE for one-way; WS for bidirectional
Mistake 2: WebSocket on serverless
- Function timeout kills connections
- Fix: external provider or traditional server
Mistake 3: No reconnection logic
- Drop = silent failure
- Fix: auto-reconnect with backoff
Mistake 4: No proxy-buffering disable
- SSE messages batched
- Fix: `X-Accel-Buffering: no`
Mistake 5: No heartbeat
- Zombie connections
- Fix: ping/pong every 30s
Mistake 6: No tenant isolation
- Cross-tenant data leak
- Fix: tenant-scoped channels
Mistake 7: Token in URL logged
- Security risk
- Fix: cookie or subprotocol
Mistake 8: Single-server scaling
- Breaks at 10K connections
- Fix: Redis pub/sub + horizontal
Mistake 9: No catch-up on reconnect
- Lost messages
- Fix: resync from last-id
Mistake 10: No observability
- Bugs invisible
- Fix: metrics dashboard
The quality checklist:
- Pattern matches use case (SSE / WS / polling)
- Authentication mechanism appropriate
- Reconnection with backoff
- Heartbeat / dead-connection detection
- Tenant isolation in channels
- Token refresh strategy
- Pub/sub for horizontal scaling (or hosted)
- Catch-up on reconnect
- Connection-status UX
- Observability dashboard
For my system:
- Audit
- Top 3 fixes
Output:
- Audit results
- Top 3 fixes
- The "v2 real-time" plan
The single most-common mistake: **assuming real-time is the same as request-response.** Persistent connections have completely different operational characteristics: scaling, auth, reconnection, observability. The fix: treat real-time as its own discipline; learn the patterns; plan from day one.
---
## What "Done" Looks Like
A working real-time-connection system in 2026 has:
- Pattern matched to use case (SSE for one-way; WS for bidirectional)
- Authentication appropriate (cookie / subprotocol / first-message)
- Reconnection with exponential backoff + catch-up
- Heartbeat / dead-connection detection
- Tenant-isolated channels
- Token refresh during long-lived connections
- Horizontal scaling via pub/sub OR hosted provider
- Connection-status UX (online / offline / reconnecting)
- Observability metrics (connections / messages / errors)
- Cost discipline (per-message tracking if hosted)
The hidden cost of weak real-time: **silent failures that erode trust.** The customer doesn't see updates; assumes the feature is broken; refreshes; eventually stops trusting "real-time" in the product. Real-time done right is invisible; done wrong, it feels broken. Plan from day one; treat it as its own discipline; the magic of "live updates" pays off in user delight.
## See Also
- [Real-Time Collaboration](real-time-collaboration-chat.md) — CRDT-based multiplayer
- [In-App Notifications](in-app-notifications-chat.md) — common SSE use case
- [Outbound Webhooks](outbound-webhooks-chat.md) — adjacent push pattern
- [Inbound Webhooks](inbound-webhooks-chat.md) — adjacent
- [AI Features Implementation](ai-features-implementation-chat.md) — SSE for AI streaming
- [Caching Strategies](caching-strategies-chat.md) — adjacent
- [Database Connection Pooling](database-connection-pooling-chat.md) — adjacent connection-management
- [Service Level Agreements](service-level-agreements-chat.md) — uptime depends on real-time
- [Performance Optimization](performance-optimization-chat.md) — real-time perf
- [Multi-Tenancy](multi-tenancy-chat.md) — tenant-isolation in channels
- [Audit Logs](audit-logs-chat.md) — adjacent
- [VibeReference: Vercel Functions](https://www.vibereference.com/cloud-and-hosting/vercel-functions) — Vercel runtime
- [VibeReference: Vercel Queues](https://www.vibereference.com/cloud-and-hosting/vercel-queues) — adjacent eventing
- [VibeReference: Background Jobs Providers](https://www.vibereference.com/backend-and-data/background-jobs-providers) — adjacent
- [VibeReference: Database Providers](https://www.vibereference.com/backend-and-data/database-providers) — Redis pub/sub