Set Up Feature Flags for Safe Shipping

Feature Flags for Your New SaaS

Goal: Ship features behind runtime toggles so you can roll out gradually, kill instantly when something breaks, run A/B tests, and gate features by subscription tier — all without redeploying.

Process: Follow this chat pattern with an AI coding tool such as Claude or v0.app. Pay attention to the notes in [brackets] and replace the bracketed text with your own content.

Timeframe: 2 hours to first flag in production. From week 2 of launch onwards, add a flag to every feature you ship.

The Four Use Cases You Actually Need

A flag is a runtime toggle that controls feature visibility without a redeploy. There are four jobs flags do for a small SaaS — pick the right type for the right feature.

Type	What it does	When to use
Kill switch	Disable a feature in seconds when it breaks	Anything touching payments, third-party APIs, AI inference, or anything that can spike costs
Gradual rollout	Ship to 10%, then 25%, then 100%	Big changes to existing flows, anything risky to revert via deploy
A/B test	Show variant A to half, B to other half, measure	Pricing changes, headline tests, onboarding step order
Entitlement	Gate features by tier or user attribute	Pro vs Free features, beta access, enterprise-only flows

Rule of thumb: every feature that touches money, AI inference, or third-party APIs ships behind a kill switch from day one. Skip the kill switch on a payment feature once and you'll regret it twice.

1. Pick a Tool

For a one-person AI company on Next.js / Vercel, the right pick is PostHog feature flags or Vercel Flags. Both are free at startup volumes, Next.js-native, and configured in under an hour.

I'm building a SaaS at [your-domain.com]. My stack is [Next.js App Router / React 19 / TypeScript], deployed on Vercel.

I need to add feature flags. My current scale is [N] daily active users and I expect to grow to [N] in 12 months.

Help me decide between:
1. **PostHog feature flags** — open source, unlimited flags on free tier, integrated with PostHog product analytics (which I already use / plan to use for activation tracking).
2. **Vercel Flags** — native to my deploy platform, zero-config for Next.js, Edge Config for sub-1ms reads.
3. **Statsig** — strong for A/B testing, generous free tier, separate analytics layer.

For each option, tell me:
- Setup time on my exact stack
- Cost trajectory at 1k / 10k / 100k MAU
- Whether server-side evaluation is supported (required for kill switches and entitlements)
- Any integration friction with my existing analytics tool

Recommend ONE based on my scale and stack. Default to PostHog unless there's a strong reason against.

For most readers in 2026, PostHog wins for a one-person team because it bundles flags with the analytics you'll already need for the activation funnel. Vercel Flags wins if you don't run PostHog and want platform-native integration.

2. Install and Configure

Install [PostHog feature flags / Vercel Flags] in my Next.js App Router app.

Generate:
1. The exact `npm install` command and any peer dependencies
2. The environment variables I need to set (and which environments they belong in: development, preview, production)
3. The initial provider setup in `app/layout.tsx` — wrap my app in the flag provider for client-side reads
4. A `lib/flags.ts` module that exposes a typed `getFlag(name, fallback)` function for server-side reads (use the server SDK, with caching via React's `cache()` or the SDK's built-in cache)
5. A simple smoke test: a flag named `smoke-test` that I can toggle to verify the integration works end-to-end

Use TypeScript. Show me the full file contents, not snippets.

Set up dev / preview / production keys separately — never share a production key with preview deploys.
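
One way to shape the `getFlag(name, fallback)` function requested in the prompt is sketched below. It assumes the flag values have already been fetched from your provider into a plain object; `FlagSnapshot` and `createFlagReader` are hypothetical names for illustration, not part of the PostHog or Vercel SDKs. The behavior worth copying is the fallback: a missing or wrongly typed flag returns the caller's default instead of throwing.

```typescript
// Hypothetical shape: flag values already fetched from your provider
// into a plain object. This is not a real SDK type.
type FlagValue = boolean | string | number;
type FlagSnapshot = Readonly<Record<string, FlagValue | undefined>>;

// Returns a typed reader bound to one snapshot (for example, one per request).
function createFlagReader(snapshot: FlagSnapshot) {
  function getFlag(name: string, fallback: boolean): boolean;
  function getFlag(name: string, fallback: string): string;
  function getFlag(name: string, fallback: number): number;
  function getFlag(name: string, fallback: FlagValue): FlagValue {
    const value = snapshot[name];
    // Missing flag, or a value of the wrong type: use the caller's fallback
    // so a provider outage or a typo never throws inside a request path.
    if (value === undefined || typeof value !== typeof fallback) {
      return fallback;
    }
    return value;
  }
  return getFlag;
}

// Usage: the smoke-test flag from the prompt, plus a flag that does not exist.
const getFlag = createFlagReader({ "smoke-test": true, "max-suggestions": 5 });
const smokeTestOn = getFlag("smoke-test", false);
const missingFlag = getFlag("not-configured", false);
```

Choosing the fallback per call site matters: a kill switch would normally fall back to "off", while an established feature might fall back to "on".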

3. Wrap Your First Real Feature

Pick a feature you're about to ship. Wrap it before you ship it, not after.

I'm shipping a new feature: [describe the feature in one sentence — e.g., "AI suggestions panel on the dashboard"].

The relevant files are:
- [path/to/component.tsx] — the React component
- [path/to/api/route.ts] — the API route it calls
- [path/to/lib/whatever.ts] — the underlying logic

Help me wrap this feature behind a kill switch named `[feature-name]-enabled` (default: false in production until I'm ready to ramp).

Show me the exact edits to:
1. The React component — wrap the UI in a flag check, render fallback (or nothing) when off
2. The API route — return 404 or a feature-disabled response when off
3. The lib function — short-circuit early when off, so even if the route is hit, no expensive work runs
4. A new test that asserts the feature is hidden when the flag is off

After this is in place, I should be able to flip the flag in [PostHog / Vercel] and have the feature appear/disappear within seconds, with no redeploy.

The pattern: flag check at every layer (UI, route, lib). Don't trust a single client-side check — anyone in DevTools can flip it locally.
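
The layered check can be sketched in a framework-agnostic way as follows. The `FlagLookup` type, the function names, and the `ai-suggestions-enabled` flag name are hypothetical; in a real Next.js app the route function would be a route handler and the lookup would call your flag SDK.

```typescript
// Hypothetical flag lookup: in a real app this would call your flag SDK.
type FlagLookup = (flagName: string) => boolean;

// Example flag name following the `[feature-name]-enabled` pattern.
const SUGGESTIONS_FLAG = "ai-suggestions-enabled";

// Lib layer: short-circuit before any expensive work runs.
function generateSuggestions(isEnabled: FlagLookup, prompt: string): string[] {
  if (!isEnabled(SUGGESTIONS_FLAG)) return [];
  return [`Suggestion for: ${prompt}`]; // stand-in for the expensive AI call
}

// Route layer: answer 404 when the feature is off, as if it did not exist.
function handleSuggestionsRequest(
  isEnabled: FlagLookup,
  prompt: string
): { status: number; body: unknown } {
  if (!isEnabled(SUGGESTIONS_FLAG)) {
    return { status: 404, body: { error: "feature_disabled" } };
  }
  return { status: 200, body: { suggestions: generateSuggestions(isEnabled, prompt) } };
}

// UI layer: decide whether to render the panel at all.
function shouldRenderPanel(isEnabled: FlagLookup): boolean {
  return isEnabled(SUGGESTIONS_FLAG);
}
```

Because the lib function checks the flag itself, a caller that bypasses the route (a cron job, a second endpoint) still gets the short-circuit.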

4. Set Up the Gradual Rollout

I'm ready to ramp `[feature-name]-enabled` from 0% to 100% over 7 days.

Generate the rollout plan:
1. Day 1: 5% rollout, targeted at users in the highest-engagement cohort (e.g., users with >5 sessions in the last 7 days)
2. Day 2: 10%, broaden to all paying users
3. Day 4: 25% if no error spike or activation regression
4. Day 6: 50%
5. Day 7: 100%

For each ramp step, tell me:
- The exact targeting expression to put in [PostHog / Vercel] (e.g., "user.cohort = 'high_engagement' AND user.plan = 'pro'")
- The metrics I should watch on PostHog: error rate, activation rate for the affected feature, session length, any feature-specific event counts
- The kill criteria: "if metric X drops more than Y%, revert to 0% immediately"

Also generate a Slack alert config that automatically pings me if any of the kill criteria fire.

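If you don't have engagement cohorts yet, ramp by `hash(user.id) % 100 < N` instead: a deterministic bucket per user, so the same person stays in or out as N grows, and small steps like 5% are expressible. Most flag tools, PostHog included, offer percentage rollouts in the dashboard, so you rarely have to write this yourself.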
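
For cases where you do need to compute the bucket yourself, deterministic percentage bucketing might look like the sketch below. The hash choice (an FNV-1a style string hash) and the function names are assumptions for illustration; hosted flag tools use their own hashing.

```typescript
// FNV-1a style 32-bit string hash. Any stable hash works here.
function fnv1a(input: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < input.length; i++) {
    hash ^= input.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // force an unsigned 32-bit result
}

// Maps a user to a stable bucket in [0, 100) for a given flag.
// Salting with the flag name keeps the same users from landing in the
// first 5% of every rollout.
function rolloutBucket(flagName: string, userId: string): number {
  return fnv1a(`${flagName}:${userId}`) % 100;
}

// True when the user falls inside the current rollout percentage.
function isInRollout(flagName: string, userId: string, percent: number): boolean {
  return rolloutBucket(flagName, userId) < percent;
}

// Usage: the same user gets the same answer on every request.
const inFivePercent = isInRollout("ai-suggestions-enabled", "user-42", 5);
```

Because each user's bucket is fixed, raising the percentage only ever adds users; nobody who already has the feature loses it mid-ramp.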

5. Run an A/B Test

A/B testing uses the same flag system, but with measurement attached.

I want to A/B test [describe the change — e.g., "the new pricing page layout vs the current one"].

Set up an A/B test in [PostHog / Statsig] with:
1. Flag name: `[experiment-name]-variant` with values `control` and `treatment`
2. 50/50 split, deterministic by user_id (so the same user always sees the same variant)
3. Targeting: only authenticated users on the [pricing / signup / dashboard] page
4. Primary metric: [conversion rate / signup rate / activation rate]
5. Secondary metrics: [revenue per user, churn at 7 days, support ticket volume]

Generate:
- The flag config (JSON or UI screenshot description)
- The React component change to render the correct variant based on the flag value
- The exposure tracking event so the analytics tool knows which variant the user actually saw
- The minimum sample size per variant needed for a 95% confidence call, given the smallest lift I care about detecting ([X]%) and my current weekly traffic of [N] users on this page
- The expected duration of the test based on that sample size

Also tell me when I should call the test and which variant to ship. Don't peek at the results before the sample size is hit.
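
Whatever sample size the AI tool returns can be sanity-checked by hand. The sketch below uses the common two-proportion approximation; the 80% power setting, the 5% to 6% conversion rates, and the 2,000 weekly visitors are illustrative assumptions, not values from this guide.

```typescript
// z-scores for a two-sided 95% confidence level and 80% power (assumed).
const Z_ALPHA = 1.96;
const Z_BETA = 0.84;

// Users needed PER VARIANT to detect a change from baselineRate to
// targetRate, using the standard two-proportion approximation.
function sampleSizePerVariant(baselineRate: number, targetRate: number): number {
  const variance =
    baselineRate * (1 - baselineRate) + targetRate * (1 - targetRate);
  const effect = targetRate - baselineRate;
  return Math.ceil(((Z_ALPHA + Z_BETA) ** 2 * variance) / effect ** 2);
}

// Weeks of traffic needed to fill every variant.
function weeksToRun(perVariant: number, weeklyTraffic: number, variants = 2): number {
  return Math.ceil((perVariant * variants) / weeklyTraffic);
}

// Example: detect a lift from 5% to 6% conversion with 2,000 weekly visitors.
const neededPerVariant = sampleSizePerVariant(0.05, 0.06);
const weeksNeeded = weeksToRun(neededPerVariant, 2000);
```

With these example inputs the answer is on the order of 8,000 users per variant, or roughly two months of traffic, which is a useful reality check before committing to a test on a low-traffic page.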

The "exposure tracking" line is critical — it's how the tool knows which users actually saw which variant, not just which got assigned.
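
A minimal sketch of exposure tracking follows. The `experiment_exposure` event name, the `createExposureTracker` helper, and the `send` callback are hypothetical stand-ins for your analytics client; the idea is to fire the event where the variant is rendered, and only once per user per experiment.

```typescript
type Variant = "control" | "treatment";

interface ExposureEvent {
  event: "experiment_exposure";
  userId: string;
  experiment: string;
  variant: Variant;
}

// Hypothetical tracker: records one exposure per user per experiment,
// at the moment the variant is actually rendered.
function createExposureTracker(send: (e: ExposureEvent) => void) {
  const seen = new Set<string>();
  return function trackExposure(userId: string, experiment: string, variant: Variant): void {
    const key = `${experiment}:${userId}`;
    if (seen.has(key)) return; // already counted for this experiment
    seen.add(key);
    send({ event: "experiment_exposure", userId, experiment, variant });
  };
}

// Usage: collect events in memory instead of calling a real analytics client.
const sentEvents: ExposureEvent[] = [];
const trackExposure = createExposureTracker((e) => sentEvents.push(e));
trackExposure("user-1", "pricing-page-variant", "treatment"); // rendered: counted
trackExposure("user-1", "pricing-page-variant", "treatment"); // re-render: ignored
```

A user who was assigned a variant but never opened the page produces no event, so they stay out of the analysis.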

6. Add Entitlement Flags for Pricing Tiers

I have three pricing tiers: [Free / Pro / Enterprise].

I need entitlement flags that gate features by tier, synced with my billing system [Stripe / Polar / Lemon Squeezy].

Help me set up:
1. A `tier_entitlements.ts` config that maps each feature name to the minimum tier required (e.g., `ai_advanced_suggestions: "pro"`, `team_seats: "enterprise"`)
2. A server-side `hasEntitlement(userId, featureName)` function that:
   - Reads the user's current tier from my user table or Stripe metadata
   - Returns true/false based on the entitlement config
   - Caches the result for the duration of the request
3. A React hook `useEntitlement(featureName)` that wraps the same logic for client components
4. A clear UI pattern for "feature locked, upgrade to access" that links to the pricing page with the feature name as a query param (e.g., `?feature=team_seats`), so I can see which locked feature drove the upgrade
5. A webhook handler for when a user upgrades or downgrades — invalidate the entitlement cache and refresh their session

Don't trust the client. Every API route that gates a feature must call `hasEntitlement` server-side.

This pattern lets you launch a new pro feature without touching the gating logic: add one line to the config, set the flag default, ship.
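
The config-plus-check shape can be sketched as below. The two config entries come from the prompt; the tier ranking, the `TierLookup` type, and the extra `getTier` parameter are assumptions added so the example runs on its own (in a real app the lookup would read your user table or billing metadata).

```typescript
type Tier = "free" | "pro" | "enterprise";

// Higher rank includes everything below it.
const TIER_RANK: Record<Tier, number> = { free: 0, pro: 1, enterprise: 2 };

// Minimum tier per feature; both entries are the examples from the prompt.
const tierEntitlements: Record<string, Tier> = {
  ai_advanced_suggestions: "pro",
  team_seats: "enterprise",
};

// Hypothetical lookup: reads the user's current tier, or undefined if unknown.
type TierLookup = (userId: string) => Tier | undefined;

function hasEntitlement(getTier: TierLookup, userId: string, featureName: string): boolean {
  const required = tierEntitlements[featureName];
  const current = getTier(userId);
  // Unknown feature or unknown user: deny by default.
  if (required === undefined || current === undefined) return false;
  return TIER_RANK[current] >= TIER_RANK[required];
}
```

Denying by default means a typo in a feature name locks the feature rather than silently giving it away.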

7. The Flag Hygiene Checklist

Flags accumulate. A six-month-old flag with no owner is worse than no flag — it adds branches that nobody understands.

Run this monthly:

- List all flags. For each, name an owner.
- Any flag pinned to a single state in production (always on or always off) gets removed within one sprint.
- No flag older than 90 days unless it's a permanent kill switch or entitlement.
- Every kill switch has a documented "trigger" (what condition flips it) and a documented "test" (how we last verified it works).
- No more than [10–15] active rollout/A-B flags at any time. Above that, they interact in ways nobody can reason about.
- Audit the flag SDK for stale evaluations: any flag that hasn't been read in 30 days is a candidate for removal.

Generate a monthly flag-hygiene report. Pull from my [PostHog / Vercel] account:
- Total flag count
- Flags older than 90 days
- Flags pinned at 100% or 0% in production (fully rolled out or fully rolled back)
- Flags with no read in 30 days

Format as a single Markdown table I can paste into a Linear ticket or Slack message.
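
If you export flag metadata yourself, the report can also be generated with a few lines of code. This is a sketch only: the `FlagRecord` fields are hypothetical, and the real field names depend on what your provider's export contains.

```typescript
// Hypothetical flag metadata; real field names depend on your provider.
interface FlagRecord {
  name: string;
  owner: string | null;
  createdAt: Date;
  lastReadAt: Date | null;
  permanent: boolean; // kill switches and entitlements skip the 90-day rule
}

const DAY_MS = 24 * 60 * 60 * 1000;

function daysBetween(from: Date, to: Date): number {
  return Math.floor((to.getTime() - from.getTime()) / DAY_MS);
}

// Builds one Markdown table row per flag, listing checklist violations.
function hygieneReport(flags: FlagRecord[], now: Date): string {
  const rows = flags.map((f) => {
    const ageDays = daysBetween(f.createdAt, now);
    const issues: string[] = [];
    if (f.owner === null) issues.push("no owner");
    if (ageDays > 90 && !f.permanent) issues.push("older than 90 days");
    if (f.lastReadAt === null || daysBetween(f.lastReadAt, now) > 30) {
      issues.push("no read in 30 days");
    }
    return `| ${f.name} | ${f.owner ?? "none"} | ${ageDays} | ${issues.join(", ") || "ok"} |`;
  });
  return ["| Flag | Owner | Age (days) | Issues |", "| --- | --- | --- | --- |", ...rows].join("\n");
}
```

The output is a plain Markdown table, so it pastes into a ticket or chat message without further formatting.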

Common Failure Modes

"We have 87 flags and nobody knows what they do." Flag debt. The fix is the hygiene checklist above. New flags from now on must have an owner, an expiration, and a kill criterion logged at creation time. Old flags get audited in batch.

"The kill switch didn't work because the API route wasn't checking the flag." Always check the flag at every layer: the UI, the route, and the lib function. Test the kill switch by flipping it in staging and confirming the feature actually disappears at all layers.

"The A/B test conflicted with another flag." When two flags target the same surface, results are uninterpretable. Run one experiment per surface at a time. If you have to run two, segment the audience explicitly so they're disjoint.

"Free-tier users can see the pro feature." You evaluated entitlements client-side only. Always check server-side too — clients can be tampered with or stale.

"The flag SDK takes 200ms on cold start." Use the SDK's local evaluation mode, or pre-warm by reading flags in `middleware.ts`. Fast reads matter for kill switches: with a slow lookup, users can hit the broken code path before the flag resolves.