Skip to content

Migration: R2-first event storage (reduce D1 load)

Goal

Reduce D1 operations by ~95% by moving raw event storage fully to R2. Events are no longer written to the D1 events table during ingest; session aggregates are maintained on the sessions table and replay loads events exclusively from R2.

Summary of changes

BeforeAfter
Every event: INSERT into events, UPDATE sessions, GET session, then R2 writeEvents written only to R2; session aggregate update throttled (20s)
Replay: events from R2, fallback to D1 eventsReplay: events only from R2
Session list: LEFT JOIN events for event_countSession list: sessions.event_count
Analytics: events table for totals, top pages, realtimeAnalytics: SUM(sessions.event_count), top pages empty (R2 later)
Client: no session_started_atClient: sends session_started_at (from session/start); batch 50–200, interval 20s

Migration steps

1. Apply schema migration

Run migration 020_sessions_event_aggregates.sql to add columns to sessions:

  • event_count (INTEGER, default 0)
  • page_views (INTEGER, default 0)
  • clicks (INTEGER, default 0)
  • max_scroll_percent (REAL, default 0)
bash
# Local
./scripts/apply-local-migrations.sh

# Production (via your deployment pipeline)
./scripts/apply-production-migrations.sh

2. Deploy Worker and tracker

  1. Deploy the refactored Worker (no event writes to D1; R2-only event writes; throttled session updates; replay R2-only).
  2. The tracker script now sends session_started_at and uses a 20s batch interval with 50–200 events per batch. Ensure clients load the new tracker (e.g. cache-bust or versioned /tracker.js).

3. Backfill event_count (optional)

For existing sessions that already have rows in events, you can backfill sessions.event_count (and optionally page_views, clicks, max_scroll_percent) so list and analytics match historical data:

sql
UPDATE sessions
SET
  event_count = (SELECT COUNT(*) FROM events e WHERE e.session_id = sessions.session_id),
  page_views = (SELECT COUNT(*) FROM events e WHERE e.session_id = sessions.session_id AND e.event_type = 'page_view'),
  clicks = (SELECT COUNT(*) FROM events e WHERE e.session_id = sessions.session_id AND e.event_type = 'click'),
  max_scroll_percent = (SELECT COALESCE(MAX(e.scroll_percent), 0) FROM events e WHERE e.session_id = sessions.session_id)
WHERE session_id IN (SELECT DISTINCT session_id FROM events);

Run this after migration 020 and before or after deploying the new Worker. Old sessions will then show correct counts; new sessions will get counts only from the new path (R2 + throttled session updates).

4. When to drop the events table

  • Do not drop until you are satisfied that:
    • All traffic uses the new tracker (with session_started_at).
    • Replay works for all sessions you care about from R2.
    • You have no remaining reliance on the events table for analytics or backfill.
  • After a suitable retention period (e.g. 90+ days) with no writes to events, you can:
    • Stop running any backfill that reads from events.
    • Optionally drop the table and indexes in a new migration (e.g. 021_drop_events_table.sql) if you want to reclaim space and simplify the schema. Keep the migration reversible (e.g. document only; actual DROP in a later migration once confirmed).

Rollback

  • If you need to roll back the Worker to a version that writes to D1 events again, do not drop the events table until you have permanently moved off it.
  • The new columns on sessions are additive; an old Worker will simply not set them (they remain 0 for new sessions). Replay would again use D1 fallback only if you reverted the replay code path; the refactored code has no D1 events fallback.

How the new design reduces D1 load

  1. No event inserts — Each event previously caused one INSERT into events. Now: zero D1 writes for the event payload (stored only in R2).
  2. No session read for R2 key — Previously each event or batch triggered getSession() to read started_at. Now: client sends session_started_at, so no D1 read during ingest.
  3. Throttled session updates — Previously every event (or batch) updated sessions (last_seen, exit_page). Now: one update per session at most every 20 seconds (or on session_end/navigation), plus aggregate deltas (event_count, page_views, clicks, max_scroll_percent).
  4. Session list — Previously LEFT JOIN events and COUNT(events). Now: single table read using sessions.event_count.
  5. Replay — No fallback to SELECT * FROM events; all event data comes from R2.

Net effect: for a session with 500 events, before you had on the order of 500+ D1 reads/writes (inserts + updates + getSession). After: a small number of throttled session updates (e.g. ~25 if 20s throttle over 500s) and zero event rows; replay and list use R2 and sessions.event_count.