# CAPTCHA & Bot Protection: Stop Bots Without Tormenting Real Users

If you're shipping a SaaS in 2026, your sign-up form, login, password reset, contact form, and anywhere else with public input are getting hit by bots. The volume is up — credential stuffing, AI-driven scrapers, fake-account farms, and reset-spam attacks have automated past basic defenses. Most indie SaaS skip protection entirely until 50,000 fake signups arrive in one weekend, then panic-add reCAPTCHA v2 (the "select all crosswalks" version) and watch conversion drop 15%. The right answer isn't more friction — it's invisible bot detection that lets humans through and stops bots silently.

A working bot-protection setup answers: where do you place defenses (every public form? auth endpoints? everywhere?), which provider (Turnstile / hCaptcha / reCAPTCHA / Vercel BotID), how do you avoid hurting conversion (invisible challenges; only escalate when suspicious), how do you handle bypass (human-CAPTCHA-solvers cost ~$1/1000), and how do you measure ("CAPTCHA solved" is not the same as "bot blocked").

This guide is the implementation playbook for CAPTCHA and bot detection — provider choice, placement, escalation, and measurement. Companion to Rate Limiting & Abuse, Password Reset & Magic Link, Two-Factor Auth, and API Keys.

## Why Bot Protection Matters

Get the threat model clear first.

Help me understand the threats.

The bot categories:

**1. Sign-up bots (account farms)**
Automated mass signup with disposable / generated emails.
Goal: free-tier abuse, spam, fraud staging.
Cost to you: trial slots filled with fake accounts; skewed metrics; sometimes referral-program abuse.

**2. Credential stuffing**
Tries leaked username:password pairs on your login.
Cost: account takeover; legal exposure; customer trust collapse.

**3. Reset spam**
Hits "forgot password" for victim email; victim mailbox floods.
Cost: customer support load; degraded deliverability.

**4. Scrapers**
Automated extraction of public pricing, content, customer lists.
Cost: competitor intel leakage; bandwidth cost.

**5. Form spam**
Contact / signup / demo-request forms with garbage submissions.
Cost: sales team time wasted; CRM polluted.

**6. AI-content scrapers (NEW in 2024-2026)**
LLM training scrapers, AI agent crawlers.
Cost: ambiguous — mostly bandwidth + content theft concerns.

**7. Click fraud / impression fraud**
Automated clicks on ads; fake engagement.
Cost: marketing spend wasted; analytics polluted.

**8. Inventory hoarding**
Bots holding limited resources (event tickets, beta slots).
Cost: revenue loss; legit-customer experience harm.

**9. Vote / poll manipulation**
Public polls, "upvote" features, leaderboards manipulated.
Cost: trust in the product feature.

**10. AI-driven probing**
LLM-controlled agents testing your endpoints for vulnerabilities.
Cost: ongoing pen-test surface from a hostile party.

For my app:
- Public-input surfaces
- Threat priorities
- Volume today

Output:
1. Top threats
2. Worst-case scenarios
3. Defense priorities

The biggest unforced error: assuming bots are someone else's problem until they're yours. By the time fake signups are noticeable in your dashboard, they've already polluted your activation metrics, eaten free-tier resources, and possibly driven a referral abuse spike. Defense should be in place before launch, not after the breach.

## The 2026 Provider Landscape

Help me pick a provider.

The providers in 2026:

**Cloudflare Turnstile** (free)
- Mostly invisible challenges
- Free; unlimited
- Privacy-respecting (unlike reCAPTCHA, no cookie-based tracking)
- Your domain does not need to be proxied through Cloudflare; a free Cloudflare account plus the widget and the siteverify API is enough
- DEFAULT for most SaaS in 2026

**hCaptcha** (free / paid)
- Privacy-focused alternative to reCAPTCHA
- Free tier with their puzzle challenges
- Pro: revenue-share for showing harder puzzles
- Strong EU privacy story

**Google reCAPTCHA**
- v2 ("select all crosswalks") — DON'T USE in 2026; user-hostile
- v3 (invisible scoring) — fine; tracks via Google
- Enterprise tier exists with more controls
- Privacy concerns in EU contexts

**Vercel BotID** (GA June 2025)
- Native to Vercel deployments
- Verifies human via behavioral signals
- Integrates with Vercel Functions / middleware
- Free tier; pro pricing usage-based

**AWS WAF Bot Control**
- AWS-stack-native
- Mid-market; pricing per request
- Strong if AWS-locked

**Datadome / PerimeterX (HUMAN) / Imperva**
- Enterprise bot management
- Custom; $20K-$500K/yr
- For high-volume / high-risk targets (banks, ticketing)

**Friendly Captcha**
- Privacy-focused; EU-based
- Proof-of-work (no user interaction)
- Mid-market price

**Arkose Labs (formerly FunCaptcha)**
- Enterprise; gaming / financial focus
- 3D puzzles for high-stakes
- Custom pricing

**Behavioral-only options**:
- **Castle**, **SiftScience** — fraud-detection without CAPTCHA
- Score-based; integrate with risk thresholds
- Stronger for fraud signals beyond bot detection

For my stack:
- Cloud platform
- Privacy / compliance constraints
- Volume

Output:
1. Top 2 candidates
2. Why
3. Migration path

The 2026 default for indie/mid-market SaaS: Cloudflare Turnstile. Free, mostly invisible, no privacy issues, integrates everywhere. Switch to Vercel BotID if you're Vercel-locked and want platform-native; switch to Datadome if you're at $50M+ ARR and bots are a real ongoing battle.

## Where to Place Defenses

Help me decide where to place CAPTCHA / bot detection.

The placement strategy:

**ALWAYS protect**:
- Sign-up endpoint (account farms target this)
- Login endpoint (credential stuffing targets this)
- Password reset request (reset-spam DoS)
- Contact / demo / sales form (form spam)
- Public comment / review submission (review spam)

**OFTEN protect**:
- API key generation
- Email-change confirmation request
- Bulk operations (bulk download, bulk export)
- Coupon / promo code redemption (abuse target)

**RARELY protect**:
- Logged-in actions (already authenticated; rate-limit instead)
- Internal-admin endpoints (auth + IP-allowlist instead)
- Webhooks (signature validation instead)

**The escalation pattern**:

Not "CAPTCHA on every action" — that's user-hostile.
Tier the defense:

- Tier 1 (always): bot-detection signal in background (Turnstile invisible)
- Tier 2 (suspicious): explicit challenge if signal is low
- Tier 3 (high-risk): puzzle or 2FA if signal is very low or behavior anomalous

Examples:
- Sign-up: Turnstile invisible. Rarely escalates.
- Login: Turnstile invisible. Escalates after 3 failures from same IP.
- Reset: Turnstile invisible. Escalates after 2 reset requests in 5 min.
- High-stakes (account-takeover-prone like billing change): always require 2FA, not CAPTCHA

**Don't double-up uselessly**:
Rate-limit + CAPTCHA on same endpoint is fine.
Two CAPTCHA providers on same form is wasteful.
2FA + CAPTCHA on logged-in form is overkill — pick one.

For my app: [public surfaces]

Output:
1. Per-endpoint placement
2. Escalation rules
3. User-impact estimate (% of legit users who'll see CAPTCHA)

The mistake to avoid: slapping reCAPTCHA v2 on every form. Conversion drop is real (10-30% on cold sign-up forms). Use invisible challenges for the 95% baseline; only escalate when signals warrant.
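
The escalation rules from the examples above can be sketched as a small decision function. This is a minimal sketch, not a provider API: `botScore` is an assumed 0-to-1 signal from whatever background check you run (Turnstile itself returns pass/fail rather than a score), and the score thresholds are placeholders to tune against your own traffic.

```typescript
// Hypothetical types and names for illustration; none come from a provider SDK.
type Tier = "background" | "explicit-challenge" | "puzzle-or-2fa";

interface RequestContext {
  endpoint: "signup" | "login" | "reset";
  botScore: number; // assumed scale: 0 = certainly a bot, 1 = certainly human
  failedLoginsFromIp: number; // recent login failures from this IP
  resetRequestsLast5Min: number; // reset requests for this account in 5 minutes
}

// Assumed thresholds; tune against real traffic.
const LOW_SCORE = 0.5;
const VERY_LOW_SCORE = 0.2;

function chooseTier(ctx: RequestContext): Tier {
  // Tier 3: signal is very low
  if (ctx.botScore < VERY_LOW_SCORE) return "puzzle-or-2fa";
  // Tier 2: signal is low, or the per-endpoint rule fires
  if (ctx.botScore < LOW_SCORE) return "explicit-challenge";
  if (ctx.endpoint === "login" && ctx.failedLoginsFromIp >= 3) {
    return "explicit-challenge";
  }
  if (ctx.endpoint === "reset" && ctx.resetRequestsLast5Min >= 2) {
    return "explicit-challenge";
  }
  // Tier 1: background check only
  return "background";
}
```

Keeping the rules in one pure function makes it easy to log which tier each request received, which feeds the user-impact estimate (the share of legit users who ever see a challenge).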

## Implementing Cloudflare Turnstile (the Default)

Help me implement Turnstile.

Step 1: Sign up for Cloudflare; create a Turnstile site key.
Get sitekey + secret.

Step 2: Frontend (insert in form):

```html
<form action="/signup" method="POST">
  <input type="email" name="email" required />

  <!-- Turnstile widget (mode is set per widget in the Cloudflare dashboard) -->
  <div class="cf-turnstile" data-sitekey="YOUR_SITEKEY"></div>

  <button type="submit">Sign up</button>
</form>

<script src="https://challenges.cloudflare.com/turnstile/v0/api.js" async defer></script>
```

Once the challenge completes, the widget adds a hidden input named `cf-turnstile-response` to the form; it is submitted along with the other fields.

Step 3: Backend verification:

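```ts
// Next.js App Router route handler (e.g. app/signup/route.ts)
import { NextRequest } from "next/server";

export async function POST(req: NextRequest) {
  const formData = await req.formData();
  const token = formData.get("cf-turnstile-response");

  // Reject early if the widget never produced a token
  if (typeof token !== "string" || token.length === 0) {
    return new Response("Missing challenge token", { status: 400 });
  }

  // Behind Cloudflare, CF-Connecting-IP is the client IP; otherwise take the
  // first (client) entry of x-forwarded-for, which may be a comma-separated list
  const ip =
    req.headers.get("CF-Connecting-IP") ??
    req.headers.get("x-forwarded-for")?.split(",")[0]?.trim();

  const body = new URLSearchParams({
    secret: process.env.TURNSTILE_SECRET!,
    response: token,
  });
  if (ip) body.set("remoteip", ip);

  // Verify with Cloudflare
  const verifyResp = await fetch(
    "https://challenges.cloudflare.com/turnstile/v0/siteverify",
    {
      method: "POST",
      headers: { "Content-Type": "application/x-www-form-urlencoded" },
      body,
    }
  );

  const verify = await verifyResp.json();
  if (!verify.success) {
    return new Response("Bot detected", { status: 403 });
  }

  // Proceed with signup logic
  // ...
}
```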

Step 4: Mode selection:

  • Managed (default): Cloudflare decides from request signals whether to show an interactive checkbox
  • Non-interactive: the widget is visible but never asks the user to interact (less friction; weaker protection than Managed)
  • Invisible: the widget is hidden entirely; the challenge runs in the background

For most SaaS: Managed.

Step 5: Test:

  • Test the happy path (your IP / browser): should pass invisibly
  • Test with curl (no browser context): should fail verification
  • Test with cf-turnstile-response=undefined: should fail

For my framework: [Next.js / Express / FastAPI / etc.]

Output:

  1. Frontend snippet
  2. Backend verification
  3. Tests

The Turnstile detail most teams miss: **always pass `remoteip` in the verification call**. Without it, Cloudflare can't flag a token that was solved on one IP and redeemed from another. Also: each token is single-use and expires after 300 seconds, so verify it once, right away, and never cache it.
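
Beyond `success`, the siteverify response carries a `hostname` and an `error-codes` array that are worth inspecting. The sketch below shows one way to turn that response into a decision. The `Verdict` type, the function name, and the `expectedHostname` parameter are illustrative assumptions; the field names and the `timeout-or-duplicate` / `internal-error` codes follow Cloudflare's documented response format, which you should confirm against the current docs.

```typescript
// Subset of the siteverify JSON response; everything except `success`
// may be absent.
interface SiteverifyResponse {
  success: boolean;
  hostname?: string;
  action?: string;
  "error-codes"?: string[];
}

// Illustrative verdict type, not part of any Cloudflare SDK.
type Verdict = "accept" | "retry" | "reject";

function interpretSiteverify(
  resp: SiteverifyResponse,
  expectedHostname: string // assumption: the domain your widget is served on
): Verdict {
  const codes = resp["error-codes"] ?? [];

  if (resp.success) {
    // A valid token minted on some other site should not be accepted here.
    return resp.hostname === expectedHostname ? "accept" : "reject";
  }

  // Token expired or was already redeemed: let the user re-run the challenge.
  if (codes.includes("timeout-or-duplicate")) return "retry";

  // Verification service failure: retrying is kinder than blocking a human.
  if (codes.includes("internal-error")) return "retry";

  return "reject";
}
```

Mapping "retry" to a re-rendered widget rather than a hard 403 keeps slow form-fillers (whose token expired) from being treated as bots.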

## Vercel BotID (Vercel-Native)

Help me set up Vercel BotID.

Vercel BotID is platform-native; available since June 2025.

Configure in vercel.ts:

```ts
import { type VercelConfig } from '@vercel/config/v1';

export const config: VercelConfig = {
  framework: 'nextjs',
  botid: {
    enabled: true,
    routes: [
      { path: '/api/auth/signup', mode: 'enforce' },
      { path: '/api/auth/login', mode: 'enforce' },
      { path: '/api/auth/reset-request', mode: 'enforce' },
      { path: '/api/contact', mode: 'enforce' },
    ],
  },
};
```

Modes:

  • enforce: block bots; return 403
  • monitor: log but allow (good for measuring before enforcing)
  • off: disabled

In your handler, BotID adds headers:

```ts
export async function POST(req: NextRequest) {
  const isBot = req.headers.get('x-vercel-bot-id-decision') === 'bot';
  const score = req.headers.get('x-vercel-bot-id-score');

  if (isBot) {
    return new Response('Bot detected', { status: 403 });
  }

  // Proceed
}
```

Why BotID:

  • Zero JS injection (works on form posts and SSR)
  • Free tier on Vercel
  • No Cloudflare dependency
  • Single config in vercel.ts

Trade-offs:

  • Vercel-only
  • Newer; less battle-tested than Turnstile
  • Less granular than dedicated bot management

For Vercel users: BotID is often the simpler default.

For my app: [Vercel / not Vercel]

Output:

  1. vercel.ts config
  2. Handler integration
  3. Monitor → enforce migration plan

The Vercel-specific tip: **start in monitor mode for 1 week**. Watch the dashboard for false-positive rate. Switch to enforce only after you see clean signal.
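
The monitor-then-enforce rollout is a general shadow-mode pattern, and it can be sketched independently of any platform. The names below (`Mode`, `Decision`, `applyMode`) are illustrative and not part of a Vercel API; the bot verdict is assumed to come from whatever detector you run.

```typescript
// Illustrative names; this is the generic rollout pattern, not a Vercel API.
type Mode = "enforce" | "monitor" | "off";

interface Decision {
  allow: boolean; // should the request proceed?
  log: boolean; // should the verdict be recorded for later review?
}

function applyMode(mode: Mode, looksLikeBot: boolean): Decision {
  switch (mode) {
    case "off":
      // Detection disabled: nothing blocked, nothing recorded.
      return { allow: true, log: false };
    case "monitor":
      // Record what WOULD have been blocked, but let everything through.
      return { allow: true, log: looksLikeBot };
    case "enforce":
      // Block flagged requests and keep recording them.
      return { allow: !looksLikeBot, log: looksLikeBot };
  }
}
```

During the monitor week, the logged-but-allowed requests are the sample to review for false positives before switching the mode to enforce.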

## Behavioral Signals (The Cheap Layer)

Before you reach for CAPTCHA, layer in behavioral checks. They're free and catch a lot.

Help me set up behavioral checks.

Cheap signals to implement:

1. Honeypot field:

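```html
<!-- aria-hidden keeps screen-reader users from filling the trap by accident -->
<input type="text" name="website" tabindex="-1" autocomplete="off" aria-hidden="true"
       style="position:absolute;left:-9999px" />
```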

Hidden via CSS; humans never fill; bots that don't render CSS fill it.

Backend: if website is non-empty → bot.

Catch rate: 30-50% of dumb bots. Free.

2. Time-to-submit:

Bots fill forms in <1s; humans take 5-30s.

Frontend: store form-render-time on page load. Submit hidden field with elapsed time.

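```js
// Assumes the form contains <input type="hidden" name="elapsed_ms">
const form = document.querySelector('form');
const hiddenField = form.querySelector('input[name="elapsed_ms"]');
const renderedAt = Date.now();
form.addEventListener('submit', () => {
  hiddenField.value = (Date.now() - renderedAt).toString();
});
```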

Backend: reject if elapsed_ms < 2000.

Catch rate: 20-40% of mid-tier bots.

3. JS-required:

Bots without JS execution can't fill JS-set fields.

Set a hidden field via JS on form mount:

```js
hiddenField.value = 'js-rendered';
```

Backend: reject if missing.

Catch rate: 30-50% of basic bots.

4. Disposable-email check:

Block known disposable-email domains (mailinator, 10minutemail, tempmail). Use a list (https://github.com/disposable-email-domains/disposable-email-domains) or service (Kickbox, ZeroBounce).

Catch rate: 70-90% of trial-abuse bots.

5. Suspicious-user-agent:

Reject obvious bots: empty UA, "curl/", "python-requests", "Go-http-client". Don't be too aggressive — some legit headless / mobile UAs look weird.

6. Geo-anomaly:

Account from country incompatible with claimed locale / billing. Flag for review, don't auto-reject.

7. Velocity checks:

Same email signing up multiple times in 24h. Same IP signing up many accounts. Same payment method across accounts.
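
The velocity checks in item 7 can be sketched as a sliding-window counter keyed by email, IP, or payment fingerprint. This is an in-memory sketch under stated assumptions: `VelocityTracker` is a hypothetical name, the limits are placeholders, and a real deployment would likely back this with a shared store so the counts hold across server instances.

```typescript
// In-memory sliding-window counter (illustrative; not a library class).
class VelocityTracker {
  private events = new Map<string, number[]>();
  private windowMs: number;
  private maxEvents: number;

  constructor(windowMs: number, maxEvents: number) {
    this.windowMs = windowMs;
    this.maxEvents = maxEvents;
  }

  // Records one event for `key` and returns true when the key has exceeded
  // `maxEvents` within the trailing window.
  record(key: string, nowMs: number): boolean {
    const cutoff = nowMs - this.windowMs;
    const recent = (this.events.get(key) ?? []).filter((t) => t > cutoff);
    recent.push(nowMs);
    this.events.set(key, recent);
    return recent.length > this.maxEvents;
  }
}

// Example wiring: flag an IP that creates more than 5 accounts in 24 hours.
// The limit of 5 is an assumed placeholder.
const signupsPerIp = new VelocityTracker(24 * 60 * 60 * 1000, 5);
```

Calling `signupsPerIp.record(ip, Date.now())` on each signup returns `true` once the IP crosses the limit; treat that as a flag for escalation or review rather than an automatic ban, since shared offices and carrier NAT put many real users behind one address.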

Stack these layers BEFORE CAPTCHA. Bot defense is layered:

Layer 1: Honeypot + time check + JS-required (free; 50%+ catch)
Layer 2: Disposable-email + UA check (free; +20%)
Layer 3: Turnstile / BotID (free; +25%)
Layer 4: Behavioral fingerprinting (Castle / Sift; paid; +5%)
Layer 5: Manual review / block list (humans; <5%)

For my app: [stack pick]

Output:

  1. Layer-by-layer setup
  2. Per-layer code snippets
  3. Combined config

The single highest-ROI addition: **honeypot field**. 5 lines of HTML, 5 lines of backend, catches 30-50% of dumb bots, zero impact on humans, zero ongoing cost. Add it before you do anything else.
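
On the backend, the first two layers reduce to a handful of string checks. A minimal sketch, assuming the honeypot field is named `website` as in the HTML above; the `elapsedMs` and `jsToken` field names, the two-entry domain list, and the function name are hypothetical stand-ins.

```typescript
interface FormSubmission {
  website?: string; // honeypot: humans leave it empty
  elapsedMs?: string; // time-to-submit hidden field, set by frontend JS
  jsToken?: string; // hidden field set by JS on form mount
  email: string;
  userAgent: string;
}

// Tiny sample list; in practice load a maintained disposable-domain list.
const DISPOSABLE_DOMAINS = new Set(["mailinator.com", "10minutemail.com"]);
const BOT_UA_PREFIXES = ["curl/", "python-requests", "Go-http-client"];
const MIN_ELAPSED_MS = 2000;

// Returns the names of every cheap check the submission failed.
function cheapSignalFailures(sub: FormSubmission): string[] {
  const failures: string[] = [];

  if ((sub.website ?? "").trim() !== "") failures.push("honeypot");

  // A missing or non-numeric value fails too: the field is always set by JS.
  const elapsed = Number(sub.elapsedMs);
  if (!Number.isFinite(elapsed) || elapsed < MIN_ELAPSED_MS) {
    failures.push("too-fast");
  }

  if (sub.jsToken !== "js-rendered") failures.push("no-js");

  const domain = sub.email.split("@").pop()?.toLowerCase() ?? "";
  if (DISPOSABLE_DOMAINS.has(domain)) failures.push("disposable-email");

  const ua = sub.userAgent.trim();
  if (ua === "" || BOT_UA_PREFIXES.some((p) => ua.startsWith(p))) {
    failures.push("suspicious-ua");
  }

  return failures;
}
```

Returning the list of failed checks, rather than a single boolean, lets you log which layer caught each request and decide per signal whether to reject outright or escalate to a challenge.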

## Email Verification: The Underrated Bot Defense

Help me wire email verification.

Email verification at signup is probably the single best bot defense.

Why: bots have to manage real email inboxes to pass. Most don't bother.

The flow:

  1. User signs up with email + password
  2. Backend creates user as verified=false
  3. Send email with verification link (token, 24h TTL)
  4. Block log-in / use until verified
  5. Re-send option after 60s; max 5 sends per email
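
The re-send rule in step 5 can be sketched as a pure function over past send times. This assumes the cap of 5 applies per email per day; `canResend` and the `ResendDecision` type are illustrative names.

```typescript
const RESEND_COOLDOWN_MS = 60 * 1000; // 60s between sends
const MAX_SENDS_PER_DAY = 5; // assumed to be a per-day cap
const DAY_MS = 24 * 60 * 60 * 1000;

type ResendDecision =
  | { ok: true }
  | { ok: false; reason: "cooldown" | "daily-limit" };

// `previousSendsMs` holds the timestamps (ms) of earlier verification emails
// sent to this address.
function canResend(previousSendsMs: number[], nowMs: number): ResendDecision {
  const lastDay = previousSendsMs.filter((t) => nowMs - t < DAY_MS);

  if (lastDay.length >= MAX_SENDS_PER_DAY) {
    return { ok: false, reason: "daily-limit" };
  }
  if (lastDay.length > 0) {
    const latest = Math.max(...lastDay);
    if (nowMs - latest < RESEND_COOLDOWN_MS) {
      return { ok: false, reason: "cooldown" };
    }
  }
  return { ok: true };
}
```

Exposing the `reason` lets the UI show a countdown for "cooldown" and a support link for "daily-limit" instead of a generic error.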

Schema:

```sql
ALTER TABLE users ADD COLUMN email_verified_at TIMESTAMPTZ;

CREATE TABLE email_verification_tokens (
  token_hash CHAR(64) PRIMARY KEY,
  user_id UUID NOT NULL REFERENCES users,
  expires_at TIMESTAMPTZ NOT NULL,
  used_at TIMESTAMPTZ,
  ip_created INET,
  created_at TIMESTAMPTZ DEFAULT NOW()
);
```

Verification endpoint:

GET /verify?token=XYZ
- Hash token
- Look up; check expiration; check not-used
- Mark user.email_verified_at = NOW()
- Mark token used
- Auto-login (set session)
- Redirect to /onboarding
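
The lookup and validity checks in the endpoint steps above can be sketched as a storage-agnostic function. Everything here is illustrative: `TokenRow` mirrors the schema columns, while `hash` (for example a SHA-256 hex digest, which fits the `CHAR(64)` column) and `findByHash` are injected, hypothetical helpers. Session creation and the redirect are left to the caller.

```typescript
interface TokenRow {
  tokenHash: string;
  userId: string;
  expiresAt: Date;
  usedAt: Date | null;
}

type RedeemResult =
  | { status: "verified"; userId: string }
  | { status: "invalid" | "expired" | "already-used" };

function redeemToken(
  rawToken: string,
  now: Date,
  hash: (raw: string) => string, // injected, e.g. SHA-256 hex
  findByHash: (tokenHash: string) => TokenRow | undefined // injected lookup
): RedeemResult {
  // Only the hash is stored, so hash the incoming token before lookup.
  const row = findByHash(hash(rawToken));
  if (!row) return { status: "invalid" };
  if (row.usedAt !== null) return { status: "already-used" };
  if (row.expiresAt.getTime() <= now.getTime()) return { status: "expired" };

  // Mark used. In a real handler this and setting users.email_verified_at
  // should happen in one database transaction.
  row.usedAt = now;
  return { status: "verified", userId: row.userId };
}
```

Distinguishing "expired" from "invalid" lets the page offer a one-click re-send for expired links while giving nothing away for unknown tokens.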

Anti-abuse:

  • Rate-limit signup (1 per email per hour)
  • Rate-limit verification email re-send (5 per email per day)
  • Block disposable email domains
  • Consider making the password optional: a magic link verifies the email AND logs the user in, combining two steps into one

For my app: [user model]

Output:

  1. Schema changes
  2. Endpoint flow
  3. Email content
  4. Anti-abuse

The principled framing: **email verification reduces bot signup by 70-90%** in most SaaS. The trade-off: small conversion drop (some legit users abandon). In 2026, the abandonment is small (5-10% per industry data) and the bot-defense gain is huge. Default to "yes, require verification."

## Measuring Bot Defense

Help me measure bot defense.

Metrics that matter:

**Block rate**: bots blocked / total requests
- Healthy: 0.5-5% on auth endpoints; 10-30% if heavily targeted

**False positive rate**: legit users blocked / legit users
- Target: < 0.5%
- Measure: support tickets ("I can't sign up"); conversion drop after deploying

**Bot bypass rate**: bots that got through / bots that tried
- Hard to measure directly; proxy via account-quality metrics

Account-quality metrics (proxy for bypass):

  • % of signups that activate
  • % of signups using disposable email (pre-block)
  • % of accounts deleted within 7 days (admin / cleanup)
  • Refund / chargeback rate
  • "Account banned" rate

**Conversion**: signup → activation rate before vs after defense rollout
- If drop > 2%: tune the defense; some legit users are being blocked

**Volume signals**: a sudden spike in blocked traffic means an attack is in progress
- Use as an alert; investigate

Tools:

  • Cloudflare Analytics (Turnstile dashboard)
  • Vercel Analytics (BotID dashboard)
  • Datadog / New Relic for custom metrics
  • Custom: log decisions; query weekly

The cadence:

  • Weekly: review block rate, false positives
  • Monthly: review account-quality metrics; tune thresholds
  • Quarterly: re-evaluate provider; check competitive landscape

For my stack: [observability tools]

Output:

  1. KPI dashboard fields
  2. Alert rules
  3. Quarterly review checklist

The metric most teams skip: **false-positive rate**. They deploy aggressive bot defense, conversion drops 10%, nobody connects the dots, and 6 months later they wonder why signup is slow. Measure conversion before/after; review support tickets for "I can't sign up" complaints.
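
The block-rate and false-positive definitions above translate directly into a small computation over logged decisions. A sketch under stated assumptions: the `DecisionLog` shape is hypothetical, `actuallyHuman` is assumed to be back-filled later (support tickets, manual review) because ground truth is rarely known at request time, and the 2% conversion threshold is read as percentage points.

```typescript
interface DecisionLog {
  blocked: boolean;
  actuallyHuman: boolean; // assumed label, back-filled after review
}

interface DefenseMetrics {
  blockRate: number; // bots blocked / total requests
  falsePositiveRate: number; // legit users blocked / legit users
}

function computeMetrics(logs: DecisionLog[]): DefenseMetrics {
  const total = logs.length;
  const botsBlocked = logs.filter((l) => l.blocked && !l.actuallyHuman).length;
  const humans = logs.filter((l) => l.actuallyHuman);
  const humansBlocked = humans.filter((l) => l.blocked).length;

  return {
    blockRate: total === 0 ? 0 : botsBlocked / total,
    falsePositiveRate: humans.length === 0 ? 0 : humansBlocked / humans.length,
  };
}

// Flags a rollout whose signup-to-activation rate fell by more than `maxDrop`
// (rates as fractions, so 0.02 means two percentage points).
function conversionDropExceeds(
  before: number,
  after: number,
  maxDrop: number = 0.02
): boolean {
  return before - after > maxDrop;
}
```

Running `computeMetrics` over each week's logs gives the two numbers for the weekly review; `conversionDropExceeds` is the check to run right after any threshold change.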

## When CAPTCHA Isn't the Right Answer

Help me think about non-CAPTCHA defenses.

When NOT to add CAPTCHA:

**1. Logged-in user actions**
The user is authenticated, so use rate limits, not CAPTCHA. CAPTCHA friction on logged-in actions is user-hostile.

**2. API endpoints with auth**
API key / OAuth + rate limit + audit log is enough. CAPTCHA breaks programmatic clients.

**3. Webhooks**
Verify the signature; CAPTCHA makes no sense (there is no human).

**4. Internal admin tools**
Auth + IP-allowlist; CAPTCHA is theater here.

**5. Power users / paid customers**
Annoying friction = churn. Treat a verified payment as a not-a-bot signal.

**6. High-stakes actions**
Use 2FA (more secure than CAPTCHA) for billing changes, role changes, and password changes.

**7. Mobile apps**
CAPTCHA UX on mobile is bad. Use device attestation (Apple App Attest / Android Play Integrity).

The pattern: CAPTCHA is for anonymous / first-touch / public-input. For everything else, use auth + rate limit + 2FA + audit log.

For my app: [endpoint inventory]

Output:

  1. Per-endpoint defense recommendation
  2. CAPTCHA vs alternatives
  3. Where you're over-CAPTCHA'd

The right framing in 2026: **CAPTCHA is one tool; not the tool**. Layer it with rate limiting, behavioral signals, email verification, 2FA. CAPTCHA alone is weak; CAPTCHA as part of layered defense is strong.
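
The per-endpoint guidance in this section can be sketched as a lookup that returns the primary defense for an endpoint. The profile fields, the defense labels, and the order of the checks are all assumptions made for illustration; real endpoints usually combine several of these layers.

```typescript
interface EndpointProfile {
  kind: "webhook" | "api" | "admin" | "web";
  authenticated: boolean; // is the caller logged in?
  highStakes: boolean; // billing, role, or password change
  mobileApp: boolean; // request comes from a native app
}

type Defense =
  | "signature-check"
  | "api-key-and-rate-limit"
  | "auth-and-ip-allowlist"
  | "2fa"
  | "device-attestation"
  | "rate-limit"
  | "captcha";

// Returns the primary defense only; layering is left to the caller.
function recommendDefense(e: EndpointProfile): Defense {
  if (e.kind === "webhook") return "signature-check"; // no human involved
  if (e.kind === "api") return "api-key-and-rate-limit"; // CAPTCHA breaks clients
  if (e.kind === "admin") return "auth-and-ip-allowlist";
  if (e.highStakes) return "2fa";
  if (e.mobileApp) return "device-attestation";
  if (e.authenticated) return "rate-limit";
  return "captcha"; // anonymous, first-touch, public input
}
```

Running an endpoint inventory through a function like this is a quick way to spot where a CAPTCHA sits on a route that would be better served by another control.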

## What Done Looks Like

A working bot-protection setup delivers:
- Sign-up, login, reset, contact protected (the universal four)
- Honeypot + time-check on every public form
- Email verification required for new accounts
- Cloudflare Turnstile or Vercel BotID on auth endpoints (invisible; managed mode)
- Rate limits per IP / per email on auth endpoints
- 2FA for high-stakes (billing, security) operations
- Disposable-email blocklist on signup
- Bot-block dashboard reviewed weekly
- False-positive rate < 0.5%
- Bot block rate visible (5-30% on targeted endpoints)
- Quarterly defense review

The proof you got it right: a quiet sign-up form. No "CAPTCHA solve" friction for legit users; bot signups don't surface in your dashboard; account-quality metrics (activation, retention, refund) match human-only baselines.

## See Also

- [Rate Limiting & Abuse](rate-limiting-abuse-chat.md) — companion abuse-protection layer
- [Password Reset & Magic Link](password-reset-magic-link-chat.md) — uses CAPTCHA on reset request
- [Two-Factor Auth](two-factor-auth-chat.md) — alternative to CAPTCHA for high-stakes
- [SSO & Enterprise Auth](sso-enterprise-auth-chat.md) — enterprise auth often skips CAPTCHA
- [Social Login & OAuth](social-login-oauth-chat.md) — OAuth-signup pre-verifies email
- [API Keys](api-keys-chat.md) — API auth without CAPTCHA
- [Audit Logs](audit-logs-chat.md) — log bot decisions for review
- [Email Deliverability](email-deliverability-chat.md) — verification emails must arrive
- [VibeReference: Bot Detection Providers](https://vibereference.dev/devops-and-tools/bot-detection-providers) — vendor landscape
- [VibeReference: Auth Providers](https://vibereference.dev/auth-and-payments/auth-providers) — Clerk / Auth0 ship CAPTCHA built in