Customer Reports & Scheduled Exports — Chat Prompts
If your B2B SaaS shows customers any data at all, sooner or later they'll ask: "Can I get this as a report?" Then: "Can it be emailed to me weekly?" Then: "Can I export to CSV / Excel for my own analysis?" Then: "Can my CFO get a monthly summary automatically?" The naive shape: "View" page in the app; customer screenshots it. Doesn't scale; loses formatting; not on the cadence they need. The right shape: a customer-facing reports + scheduled-export system that lets users define what data + format they want, schedule delivery (email / Slack / S3 / API), and trust the report will arrive consistently.
Customer reports are different from internal admin dashboards (you build dashboards FOR YOUR TEAM about customers; reports are for CUSTOMERS about THEIR own data). They're also different from embedded analytics (interactive in-product dashboards). Reports are generated artifacts: PDF, CSV, scheduled email digests with embedded charts, data feeds.
This chat walks through implementing real customer reports + scheduled exports: report definition + design, generation pipeline (CSV / PDF / email), scheduling, delivery channels, custom-formatted reports, edge cases (large data, time zones, branding), and the operational realities.
What you're building
- Report definition system (what data, what filters, what format)
- One-shot exports (customer clicks "Export"; gets file)
- Scheduled reports (weekly/monthly/custom recurring)
- Multiple output formats (CSV, Excel, PDF, JSON, email-embed)
- Multiple delivery channels (download / email / Slack / S3 / webhook)
- Template-based PDF reports with branding
- Customer-facing UI for report management
- Operational runbooks for large/slow reports
1. Decide the scope
Help me decide what shape of reporting to ship.
Three increasingly-deep shapes:
LEVEL 1: BASIC EXPORT (the simplest start)
- "Export to CSV" button on tables
- Customer downloads file
- One-shot; no scheduling
- Pros: ships in 1-2 weeks
- Cons: no automation; no PDF; no branding
LEVEL 2: SCHEDULED REPORTS + EMAIL DELIVERY
- Customer defines a report (data + filters)
- Schedule (daily / weekly / monthly)
- Delivered via email (CSV attached or PDF rendered or summary embedded)
- Pros: hands-off automation; high customer value
- Cons: 4-8 weeks; email-rendering challenges; PDF generation infra
LEVEL 3: BRANDED PDF REPORTS + MULTI-CHANNEL DELIVERY
- Pixel-perfect PDF reports with customer branding
- Delivery to Slack / S3 / webhook / custom destination
- Custom report templates per workspace
- Pros: enterprise-grade; high stickiness
- Cons: 12-24 weeks; branding system complex
LEVEL 4: SELF-SERVE REPORT BUILDER
- Customer drags fields, defines filters, picks chart types
- Visual report builder (Looker-Studio-style)
- Pros: ultimate flexibility
- Cons: 24+ weeks; build a query engine; high engineering investment
DEFAULT FOR MOST B2B SaaS:
- Year 1-2: Level 1 (CSV export)
- Year 2+: Level 2 (scheduled + email)
- Year 3+: Level 3 (branded PDF) when enterprise customers demand
- Skip Level 4 (use embedded analytics like Cube + Tremor; not a custom builder)
Output: scope decision; what's in v1.
2. Design the report definition data model
Schema for reports:
reports (
id uuid pk
workspace_id uuid not null
created_by uuid not null
name text not null
description text
type text -- 'tabular', 'summary', 'chart', 'composite'
data_source text -- 'projects', 'deals', 'tickets', etc.
filters jsonb -- query filters
fields jsonb -- which columns to include
group_by jsonb -- aggregation grouping
sort jsonb
output_format text -- 'csv', 'xlsx', 'pdf', 'json', 'html_email'
template_id uuid -- optional PDF template (Level 3)
schedule jsonb -- null (one-shot) | { cron, timezone, next_run_at }
delivery jsonb -- { type: 'email'|'slack'|'webhook'|'s3', config: {...} }
recipients uuid[] -- users to deliver to
status text -- 'draft', 'active', 'paused', 'archived'
last_run_at timestamptz
last_run_status text -- 'success', 'failed', 'partial'
next_run_at timestamptz
created_at timestamptz
updated_at timestamptz
)
Indexes:
- (next_run_at) partial WHERE status = 'active' for the run-due lookup
- (workspace_id, status) for listing a workspace's reports
- (workspace_id, name) for lookup by name
report_runs (
id uuid pk
report_id uuid not null
triggered_at timestamptz
triggered_by text -- 'schedule', 'manual', 'api'
triggered_by_user uuid
started_at timestamptz
completed_at timestamptz
status text -- 'pending', 'running', 'completed', 'failed'
output_url text -- S3 / blob URL with TTL
output_size_bytes bigint
rows_generated bigint
error text
delivery_status jsonb -- per-recipient: { user_id, channel, status, error }
)
INDEX (report_id, triggered_at DESC)
Partial INDEX WHERE status IN ('pending', 'running') for monitoring in-flight runs
Why store filters/fields/group-by as JSONB:
- Flexible; matches your data-model query patterns
- Easy to evolve schema without migration
- Validated at the app layer (JSON schema checks at write time)
Why separate report (definition) from report_runs (execution):
- Report definition is stable + versioned
- Runs are events (history, audit, debug)
- Scaling: trim run history; keep definitions
Implement:
1. Migration for reports + report_runs
2. Filter / field / group-by JSON schemas
3. Validation at write
4. RLS for multi-tenancy
5. Indexes for run-scheduling queries
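A minimal sketch of the write-time validation from items 2-3 above, using Zod; the operator names and limits are assumptions to adapt to your data model:
import { z } from 'zod'

// App-layer validation for the filters / fields / group_by JSONB columns.
// Operator names and limits here are illustrative, not a fixed contract.
const FilterSchema = z.object({
  field: z.string().min(1),
  op: z.enum(['eq', 'neq', 'gt', 'lt', 'in', 'contains', 'within_last_days']),
  value: z.union([z.string(), z.number(), z.boolean(), z.array(z.string())]),
})

const ReportDefinitionSchema = z.object({
  filters: z.array(FilterSchema).max(20),
  fields: z.array(z.string()).min(1).max(100),
  group_by: z.array(z.string()).max(3).optional(),
  sort: z.array(z.object({ field: z.string(), dir: z.enum(['asc', 'desc']) })).optional(),
})

// Reject bad definitions at write time, before a worker ever runs them.
export function validateDefinition(input: unknown) {
  return ReportDefinitionSchema.parse(input) // throws ZodError on invalid input
}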
Output: schema for definition + execution.
3. Build the report generation pipeline
Now build the actual report execution.
Architecture:
1. Trigger (manual / scheduled / API)
2. Worker fetches report definition
3. Worker executes query (against your DB / data warehouse)
4. Worker generates output file (CSV / Excel / PDF)
5. Worker uploads to storage (S3 / blob)
6. Worker delivers via configured channel
7. Worker updates report_runs
Use a durable workflow runtime: Inngest / Trigger.dev / Temporal / Vercel Workflow.
Pseudo-code:
async function executeReport(reportId: string, triggeredBy: TriggerInfo) {
  const report = await db.reports.findById(reportId)
  const run = await db.reportRuns.create({
    report_id: reportId,
    triggered_at: new Date(),
    triggered_by: triggeredBy.type,
    status: 'pending',
  })
  try {
    run.started_at = new Date()
    run.status = 'running'
    await db.reportRuns.update(run)
    // Step 1: Execute query
    const data = await queryData(report)
    // Step 2: Generate output
    const output = await generateOutput(data, report.output_format, report.template_id)
    // Step 3: Upload to storage
    const url = await uploadToBlob(output, {
      filename: `${report.name}-${formatDate(new Date())}.${ext(report.output_format)}`,
      ttl: 30 * DAY, // DAY in ms; signed URL expires after 30 days
      contentType: contentTypeFor(report.output_format),
    })
    run.output_url = url
    run.output_size_bytes = output.size
    run.rows_generated = data.length
    // Step 4: Deliver
    const deliveryResults = await deliverReport(report, run, url)
    run.delivery_status = deliveryResults
    // Step 5: Update status
    run.status = 'completed'
    run.completed_at = new Date()
    await db.reportRuns.update(run)
    // Step 6: Schedule next run (if recurring)
    if (report.schedule) {
      await db.reports.update({ id: reportId, next_run_at: computeNextRun(report.schedule) })
    }
  } catch (err) {
    run.status = 'failed'
    run.error = err instanceof Error ? err.message : String(err)
    await db.reportRuns.update(run)
    throw err // surface to the workflow runtime so its retry policy applies
  }
}
Key concerns:
A. Query timeout
- Reports may need to read millions of rows
- Set per-report timeout (default 5 min; configurable)
- Use streaming queries; don't load all into memory
- For very large: paginate the export across multiple files
B. Memory management
- Stream CSV / Excel / PDF generation
- Don't build entire output in memory for large reports
- Use libraries that support streaming (papaparse for CSV; exceljs for Excel; @react-pdf/renderer or puppeteer for PDF)
C. Worker isolation
- Reports run in background jobs (not request path)
- Per-worker timeout; failure-handling
- Scale workers separately from API
D. Storage + URL TTL
- Upload to S3/blob with signed URL (default 30 days)
- Store URL + run metadata
- Link expires; customer can request re-generation if needed
E. Idempotency
- Use deterministic file names per (report_id, triggered_at)
- If retry: use same filename (overwrite); avoid duplicates
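A sketch of a deterministic naming helper, assuming storage objects are keyed by report and trigger time:
// Deterministic storage key per (report, trigger time): a retried run
// computes the same key and overwrites rather than duplicating the object.
function outputKey(report: { id: string; name: string }, triggeredAt: Date, extension: string): string {
  const slug = report.name.toLowerCase().replace(/[^a-z0-9]+/g, '-')
  const stamp = triggeredAt.toISOString().replace(/[:.]/g, '-')
  return `reports/${report.id}/${stamp}-${slug}.${extension}`
}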
Implement:
1. Query function (streams data per report config)
2. Output generators per format (CSV, Excel, PDF)
3. Delivery dispatchers per channel
4. Workflow runtime integration
5. Error handling + retry
6. Memory + timeout safeguards
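A sketch of the computeNextRun helper used in the pipeline above, assuming the cron-parser package (v4 API) and the schedule JSONB shape from the data model:
import parser from 'cron-parser'

// schedule comes from reports.schedule: { cron, timezone, next_run_at }
function computeNextRun(schedule: { cron: string; timezone: string }): Date {
  const interval = parser.parseExpression(schedule.cron, {
    currentDate: new Date(),
    tz: schedule.timezone, // IANA name, e.g. 'America/New_York'; honors DST
  })
  return interval.next().toDate()
}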
Output: a generation pipeline that scales.
4. Implement output formats
Per-format generation:
CSV:
- Use streaming library (papaparse / fast-csv)
- Headers row from field definitions
- Encode special chars correctly
- Prepend a UTF-8 BOM so Excel detects the encoding
- Stream-write to file or blob
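A minimal streaming sketch with @fast-csv/format; fetchRows is a hypothetical async generator over your report query:
import { createWriteStream } from 'node:fs'
import { format } from '@fast-csv/format'

async function writeCsv(path: string, fetchRows: () => AsyncIterable<Record<string, unknown>>) {
  const file = createWriteStream(path)
  file.write('\ufeff') // UTF-8 BOM so Excel detects the encoding
  const csv = format({ headers: true }) // header row inferred from first object's keys
  csv.pipe(file)
  for await (const row of fetchRows()) {
    // Respect backpressure instead of buffering the whole dataset in memory
    if (!csv.write(row)) await new Promise((r) => csv.once('drain', r))
  }
  csv.end()
  await new Promise((resolve, reject) => file.on('finish', resolve).on('error', reject))
}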
Excel (XLSX):
- Use exceljs (Node) or similar
- Stream-write rows
- Multiple sheets if needed (one per data slice)
- Number formatting (currency, dates)
- Frozen header row
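A streaming sketch with exceljs's WorkbookWriter, which flushes committed rows to disk instead of holding the workbook in memory; the columns are illustrative:
import ExcelJS from 'exceljs'

async function writeXlsx(path: string, rows: AsyncIterable<{ name: string; value: number; created: Date }>) {
  const workbook = new ExcelJS.stream.xlsx.WorkbookWriter({ filename: path })
  const sheet = workbook.addWorksheet('Report')
  sheet.columns = [
    { header: 'Name', key: 'name', width: 30 },
    { header: 'Value', key: 'value', width: 12, style: { numFmt: '#,##0.00' } }, // currency-style
    { header: 'Created', key: 'created', width: 16, style: { numFmt: 'yyyy-mm-dd' } },
  ]
  sheet.views = [{ state: 'frozen', ySplit: 1 }] // frozen header row
  for await (const row of rows) sheet.addRow(row).commit() // flush each row immediately
  sheet.commit()
  await workbook.commit() // finalize the file
}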
PDF:
- Two approaches:
a. HTML → PDF (Puppeteer / @sparticuz/chromium / Browserless)
- Render React components to HTML; print to PDF
- Best for visually-rich reports with branding
- Slower (1-3 seconds per page)
b. Programmatic (PDFKit / @react-pdf/renderer)
- Build PDF instructions directly
- Faster; less visual flexibility
- Pick: HTML→PDF for branded reports; programmatic for table dumps
For HTML→PDF on Vercel:
- Use @sparticuz/chromium with puppeteer-core in Node.js function
- Or: an external rendering service (Browserless / DocRaptor / PDFShift / other hosted HTML-to-PDF APIs)
Sample PDF rendering with Puppeteer:
import puppeteer from 'puppeteer-core'
import chromium from '@sparticuz/chromium'

async function generatePDFReport(report, data) {
  const html = await renderReportHTML(report, data)
  const browser = await puppeteer.launch({
    args: chromium.args,
    executablePath: await chromium.executablePath(),
    headless: true,
  })
  try {
    const page = await browser.newPage()
    await page.setContent(html, { waitUntil: 'networkidle0' })
    return await page.pdf({
      format: 'A4',
      printBackground: true,
      margin: { top: '20mm', bottom: '20mm', left: '15mm', right: '15mm' },
    })
  } finally {
    await browser.close() // always release Chromium, even if rendering throws
  }
}
JSON:
- Stream-write JSON to file
- Use ndjson for streaming-friendly format
- Best for API consumers / data ingestion
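A tiny NDJSON sketch — one JSON object per line, so consumers can parse the feed record-by-record:
import { createWriteStream } from 'node:fs'

async function writeNdjson(path: string, rows: AsyncIterable<unknown>) {
  const file = createWriteStream(path)
  for await (const row of rows) {
    if (!file.write(JSON.stringify(row) + '\n')) {
      await new Promise((r) => file.once('drain', r)) // honor backpressure
    }
  }
  file.end()
}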
HTML email:
- Rendered template with charts embedded as <img> (hosted on Cloudinary or a chart-image service like QuickChart)
- Inline CSS (use juice or similar)
- Email-client-safe HTML (test in Litmus / Email on Acid)
Implement:
1. CSV generator (streaming)
2. Excel generator (streaming)
3. PDF generator (Puppeteer or programmatic)
4. JSON generator
5. HTML-email generator
6. Format validation (don't try to PDF a 100K-row dataset)
Output: format generators that handle real data.
5. Implement delivery channels
Per-delivery:
EMAIL DELIVERY:
- Use your email provider (Resend / Postmark / SendGrid)
- Subject: "{Report name} - {date}"
- Body: brief summary + download link OR inline content
- Attachments: CSV / PDF (size-limited; some providers cap 10-25MB)
- For large: link only; no attachment
- Per-recipient sending (each user gets their own email)
- Track: bounces, opens, link clicks
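A sketch of the attach-or-link decision, assuming Resend's Node SDK; the sender address and 10 MB cutoff are placeholders:
import { Resend } from 'resend'

const resend = new Resend(process.env.RESEND_API_KEY)
const MAX_ATTACHMENT_BYTES = 10 * 1024 * 1024 // stay below provider caps

async function deliverReportEmail(to: string, reportName: string, file: Buffer, downloadUrl: string) {
  const attachable = file.byteLength <= MAX_ATTACHMENT_BYTES
  await resend.emails.send({
    from: 'reports@example.com', // placeholder sender
    to,
    subject: `${reportName} - ${new Date().toISOString().slice(0, 10)}`,
    html: attachable
      ? `<p>Your report <strong>${reportName}</strong> is attached.</p>`
      : `<p>Your report <strong>${reportName}</strong> is ready. <a href="${downloadUrl}">Download</a> (link expires in 30 days).</p>`,
    // For large files: link only, no attachment
    attachments: attachable ? [{ filename: `${reportName}.csv`, content: file }] : undefined,
  })
}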
SLACK DELIVERY:
- Use Slack Web API (chat.postMessage with file upload)
- Customer connects Slack to their workspace via OAuth
- Channel selected per report
- Format: structured message with chart preview + download link
- Use Slack's "blocks" format for rich rendering
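A sketch using @slack/web-api with the workspace's stored OAuth token; Block Kit supplies the rich layout:
import { WebClient } from '@slack/web-api'

async function deliverToSlack(token: string, channelId: string, reportName: string, downloadUrl: string) {
  const slack = new WebClient(token) // token from the customer's OAuth install
  await slack.chat.postMessage({
    channel: channelId,
    text: `${reportName} is ready`, // plain-text fallback for notifications
    blocks: [
      { type: 'section', text: { type: 'mrkdwn', text: `*${reportName}* is ready.` } },
      {
        type: 'actions',
        elements: [
          {
            type: 'button',
            action_id: 'download_report',
            text: { type: 'plain_text', text: 'Download report' },
            url: downloadUrl,
          },
        ],
      },
    ],
  })
}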
S3 / OBJECT STORAGE DELIVERY:
- Customer provides S3 bucket + IAM role / access keys
- Worker uploads to their bucket
- Path-pattern: /reports/{date}/{report-name}.csv
- For enterprise customers running data pipelines
WEBHOOK DELIVERY:
- Customer provides webhook URL
- Worker POSTs report data + metadata to URL
- Format: JSON with download link OR full data
- Retry with backoff on failure
- Sign payloads (HMAC) so customer can verify
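A signing sketch with Node's crypto; the header names and timestamp-binding scheme are assumptions, not a standard:
import { createHmac } from 'node:crypto'

// Sign the serialized payload so the customer can verify origin and
// reject replays (they recompute the HMAC and check the timestamp skew).
function signWebhookPayload(body: string, secret: string): Record<string, string> {
  const timestamp = Math.floor(Date.now() / 1000).toString()
  const signature = createHmac('sha256', secret)
    .update(`${timestamp}.${body}`) // bind timestamp to body
    .digest('hex')
  return {
    'X-Report-Timestamp': timestamp, // hypothetical header names
    'X-Report-Signature': `sha256=${signature}`,
  }
}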
DOWNLOAD-ONLY:
- No automatic delivery
- Customer triggers + downloads via UI
- Generated URL valid 30 days
API DELIVERY:
- Customer requests via API (GET /api/reports/[id]/runs/[run_id]/data)
- Returns latest run data
- Useful for customer-side automation
DELIVERY CONFIGURATION (per channel):
Email:
{ "type": "email", "config": { "subject_template": "{report.name} - {date}" } }
Slack:
{ "type": "slack", "config": { "channel_id": "C123", "include_summary": true } }
S3:
{ "type": "s3", "config": { "bucket": "...", "path_pattern": "...", "credentials_secret_id": "..." } }
Webhook:
{ "type": "webhook", "config": { "url": "...", "auth_type": "hmac", "secret_id": "..." } }
Implement:
1. Per-channel dispatcher
2. Configuration UI per channel
3. Connection testing (verify Slack OAuth, S3 access, webhook URL)
4. Retry logic per channel
5. Failure notifications to customer
Output: multi-channel delivery.
6. Build the customer-facing UI
Customer-side UI for reports.
Page: /reports
Layout:
- List of reports (grouped: scheduled / one-shot)
- "Create report" button → wizard
- Per-report row: name, schedule, last-run status, next-run-time, actions
Create-report wizard:
Step 1: Choose data source
- Pick from available data: Projects, Deals, Tickets, Users, Custom (if custom fields are enabled)
Step 2: Define filters
- Visual filter builder (similar to saved-views)
- "Status = Open" "Created in last 30 days"
Step 3: Pick fields / columns
- Multi-select from available fields
- Reorder via drag-drop
Step 4: Group by + sort
- Optional aggregation
- "Group by Owner; Sum of Value"
Step 5: Pick output format
- CSV / Excel / PDF / JSON
- Preview if possible
Step 6: Schedule
- "Run once now" OR
- Recurring: Daily / Weekly / Monthly / Custom cron
- Time + timezone
- Day of week (for weekly)
Step 7: Delivery
- Email (pick recipients)
- Slack (pick channel)
- S3 (configure)
- Webhook (configure)
- Download only
Step 8: Confirm
- Summary
- "Run once now" button
- "Save + activate"
Run history per report:
- Timestamp + status
- Download link (if not expired)
- Re-run button
Edit report:
- Same wizard; pre-filled
- Pause / unpause / archive
Implement:
1. Reports list page
2. Create-report wizard
3. Per-step UX (filter builder, field picker, etc.)
4. Run history view
5. Edit / archive flow
6. Connection-setup flows (Slack OAuth, S3 creds, etc.)
Output: a UX customers actually use.
7. Operational + edge cases
Walk me through:
1. Report timeout (large dataset)
- Default timeout 5 min
- For larger: paginate output (multiple files)
- Or: ETL-style approach (export to S3 in chunks)
- Email customer with progress link
2. Email size limit (CSV > 25 MB)
- Don't attach; link only
- Customer downloads via signed URL
- TTL longer for large files
3. Customer's Slack token revoked
- Detection: API returns auth_error
- Notify customer via in-app + email
- Pause report; require re-auth
4. S3 credentials expired / invalid
- Detection: upload error
- Mark delivery failed
- Retry next scheduled run; if persistent: notify
5. Webhook returns non-2xx
- Retry with exponential backoff (3 attempts; 5min, 30min, 2hr)
- After 3 failures: mark failed; notify customer
6. Customer changes timezone
- Pick a rule: reinterpret existing schedules in the user's NEW timezone, or keep them pinned to the old one
- Either way it's confusing; document the behavior clearly
- Recommended: store schedules with an IANA timezone sourced from the user's profile
7. Customer adds 50 reports each running daily
- Quota: max active scheduled reports per workspace per plan tier
- Free: 5; Pro: 25; Enterprise: 100
- Plan-tier check at creation
8. Report fails repeatedly (e.g., bad query)
- After 3 consecutive failures: pause the report; notify the customer (see the sketch after this list)
- Diagnose: query error / data missing / permission
9. Workspace deleted; reports orphaned
- Cascade-delete all reports + runs
- Don't orphan delivery configs
10. Time-zone shifts (DST)
- Schedule in IANA tz; honor DST automatically
- Don't use UTC offsets directly
11. Customer wants historical data
- Default report scope: based on filter (e.g., last 30 days)
- Backfill: customer requests one-shot run for past period
12. Sensitive data in reports (PII)
- Same access controls as in-product
- Don't email PII to non-authorized recipients
- Encrypt at rest in S3 if applicable
13. Audit + compliance
- Log every report run + delivery
- Customer audit-export of report-run history
14. Multi-currency / multi-language
- Per-recipient localization
- Currency from workspace / user settings
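For case 8, a sketch of the auto-pause check run after each failed run; the db helpers and notification call are illustrative:
const MAX_CONSECUTIVE_FAILURES = 3

async function handleRunFailure(reportId: string) {
  // Hypothetical helper: newest-first slice of this report's run history
  const recent = await db.reportRuns.findLatest(reportId, MAX_CONSECUTIVE_FAILURES)
  const allFailed =
    recent.length === MAX_CONSECUTIVE_FAILURES && recent.every((r) => r.status === 'failed')
  if (allFailed) {
    await db.reports.update({ id: reportId, status: 'paused' })
    await notifyWorkspaceAdmins(reportId, 'report_paused_after_repeated_failures') // illustrative
  }
}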
For each: code change + UX impact + ops consideration.
Output: a system that handles real customer load.
8. Recap
What you've built:
- Report definition data model + schema
- Generation pipeline (durable workflow)
- Output formats: CSV, Excel, PDF, JSON, HTML email
- Delivery channels: email, Slack, S3, webhook, download
- Customer-facing wizard + reports list
- Per-tier quotas
- Audit + run-history
- Operational handling (timeouts, retries, failures)
What you're explicitly NOT shipping in v1:
- Visual drag-drop report builder (Level 4; defer; use embedded analytics)
- Custom SQL editor (defer; security concerns)
- Charts in-product (use embedded analytics; see Embedded Analytics Platforms (Reference))
- Customer-uploadable PDF templates (Level 3; defer)
- Reports shared across tenants/customers (anti-pattern)
Ship Level 1 (CSV export) in 1-2 weeks. Add Level 2 (scheduled + email) in 4-8 weeks. Add Level 3 (branded PDF) when enterprise customers ask + pay.
The biggest mistake teams make: building a custom visual report builder. Tons of engineering for narrow value; embedded analytics platforms (Cube + Tremor / Explo / Metabase Embedded) do this better. Use them for interactive analytics; use scheduled reports for delivered artifacts.
The second mistake: PDF generation in request path. PDFs take seconds to render; do in background.
The third mistake: not planning for large data. CSV export of 1M rows blows up memory if you load all at once. Stream-write from day one.
See Also
- CSV Import — sister discipline (import not export)
- PDF Generation In-App — pairs (PDF generation tech)
- Background Jobs & Queue Management — depended-upon
- Email Template Implementation — pairs for HTML email reports
- Email Deliverability — pairs
- Cron / Scheduled Tasks — depended-upon for scheduling
- Outbound Webhooks — pairs for webhook delivery
- Audit Logs — pairs for run auditing
- Customer Analytics Dashboards — adjacent (interactive vs delivered)
- Charts & Data Visualization — adjacent
- Internal Admin Tools — sister category (your team's reports)
- Bulk Operations — adjacent
- File Uploads — pairs for upload of templates
- Embedded Analytics Platforms (Reference) — alternative for interactive analytics
- Account Deletion & Data Export — different scope (GDPR full-account)
- Workspace Branding & Custom Domains — pairs for branded PDF templates
- Multi-Tenancy — workspace boundary
- Roles & Permissions — who can create/edit reports
- Quotas, Limits & Plan Enforcement — per-tier report quotas