Architecture

Source Layout

src/
├── index.ts                  # Main entry — request router + scheduled handler
├── types.ts                  # All shared TypeScript interfaces
├── lead.ts                   # Lead capture handler
├── lib/
│   ├── auth.ts               # API key authentication (Bearer + X-API-Key)
│   ├── logger.ts             # Structured JSON logger with request context
│   ├── rateLimiter.ts        # KV-backed fixed-window IP rate limiter
│   ├── validator.ts          # Input validation and XSS sanitization
│   ├── retention.ts          # R2 storage monitoring and retention cleanup
│   └── webhook.ts            # Webhook sender with retry + exponential backoff + dead letter
└── tracking/
    ├── index.ts              # Tracking router (POST /track/*)
    ├── trackerScript.ts      # Frontend JS snippet served at /tracker.js
    ├── routes/
    │   ├── trackSessionStart.ts
    │   ├── trackEvent.ts
    │   ├── trackEventsBatch.ts
    │   ├── trackSessionEnd.ts
    │   └── analytics.ts      # GET /analytics/* dashboard queries
    ├── services/
    │   ├── eventService.ts
    │   ├── sessionService.ts
    │   ├── fingerprintService.ts
    │   ├── identityService.ts
    │   ├── scoringService.ts
    │   ├── storageService.ts
    │   ├── vpnDetectionService.ts
    │   ├── analyticsService.ts      # D1 analytics queries for dashboards
    │   └── leadAnalyticsService.ts  # Lead listing/detail + linked session activity
    └── utils/
        ├── hash.ts
        ├── uuid.ts
        └── response.ts

Request Flow

Client Request

  ├─ /track/*       → rate limiter → tracking/index.ts → validator → route handler → services → D1/KV/R2
  ├─ /analytics/*   → auth.ts (API key) → analytics.ts → analyticsService/leadAnalyticsService → D1 / R2
  ├─ /tracker.js    → inline JS string response (trackerScript.ts)
  ├─ /form.js       → inline JS string response (formScript)
  ├─ /pulse.js      → combined tracker + form; PulseGate.init() defaults endpoint to script origin
  ├─ /health        → JSON health check
  └─ /lead | *      → lead.ts (legacy webhook forwarding to CRM)
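
The dispatch in the diagram above can be sketched as a pure function that maps a pathname to a handler label. The function name `routeFor` and the label strings are hypothetical, chosen for illustration; the actual router in src/index.ts may be structured differently.

```typescript
// Hypothetical handler labels; the real router in src/index.ts may differ.
type Route =
  | "tracking"
  | "analytics"
  | "tracker.js"
  | "form.js"
  | "pulse.js"
  | "health"
  | "lead";

// Map a request pathname to the handler named in the Request Flow diagram.
function routeFor(pathname: string): Route {
  if (pathname.startsWith("/track/")) return "tracking";
  if (pathname.startsWith("/analytics/")) return "analytics";
  if (pathname === "/tracker.js") return "tracker.js";
  if (pathname === "/form.js") return "form.js";
  if (pathname === "/pulse.js") return "pulse.js";
  if (pathname === "/health") return "health";
  // "/lead" and every unmatched path fall through to the legacy lead handler.
  return "lead";
}
```

Note that the prefix check uses "/track/" with a trailing slash, so "/tracker.js" is not mistaken for a tracking route.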

Scheduled (cron 0 3 * * *)
  └─ retention.ts   → scans R2 by date prefix → deletes objects older than 90 days

Authentication

Analytics endpoints (/analytics/*) require an API key via one of two headers:

  • Authorization: Bearer <key>
  • X-API-Key: <key>

Unauthenticated requests receive 401 with WWW-Authenticate: Bearer. CORS preflight (OPTIONS) passes without authentication.
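
A minimal sketch of the header check described above, assuming a plain string comparison against the configured key. The names `HeaderLookup`, `extractApiKey`, and `isAuthorized` are hypothetical and not taken from src/lib/auth.ts; a production check might also use a constant-time comparison.

```typescript
// Minimal structural type so the sketch works with the Fetch API's Headers
// object or any stand-in that exposes get().
interface HeaderLookup {
  get(name: string): string | null;
}

// Pull the API key from either supported header; returns null when absent.
function extractApiKey(headers: HeaderLookup): string | null {
  const auth = headers.get("Authorization");
  if (auth !== null) {
    const match = /^Bearer\s+(\S+)$/i.exec(auth.trim());
    const token = match ? match[1] : undefined;
    if (token) return token;
  }
  const key = headers.get("X-API-Key");
  return key !== null && key.trim() !== "" ? key.trim() : null;
}

// True when the request may proceed. OPTIONS (CORS preflight) passes
// without a key, matching the behavior described above.
function isAuthorized(
  method: string,
  headers: HeaderLookup,
  expectedKey: string
): boolean {
  if (method.toUpperCase() === "OPTIONS") return true;
  const supplied = extractApiKey(headers);
  return supplied !== null && supplied === expectedKey;
}
```

When `isAuthorized` returns false, the caller would respond with 401 and a WWW-Authenticate: Bearer header.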

The API key is configured per environment:

  • Development/test: ANALYTICS_API_KEY var in wrangler.toml
  • Production: wrangler secret put ANALYTICS_API_KEY

Cloudflare Bindings

Binding            Type    Purpose
DB                 D1      Persistent storage for leads + tracking data
TRACKING_KV        KV      Real-time counters (fingerprints, IPs, sessions)
TRACKING_R2        R2      Raw session replay data and event logs
ANALYTICS_API_KEY  Secret  API key for analytics endpoint authentication

Storage Strategy

Store  Data                                              Access Pattern
D1     Sessions, events, fingerprints, identity network  Persistent queries
KV     Rate counters, IP/ASN tracking, session velocity  Low-latency reads/writes with 24h TTL
R2     Raw event logs per session                        Append-only writes, prefix list for retrieval
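
The rate counters kept in KV follow a fixed-window scheme (see src/lib/rateLimiter.ts in the source layout). The sketch below illustrates that technique with an in-memory Map standing in for KV. The key format, window size, and limit are assumptions for illustration, and the names `windowKey` and `allowRequest` are hypothetical.

```typescript
// Minimal counter store; an in-memory Map stands in for KV here.
// Assumption: the real limiter uses the TRACKING_KV binding with expiring keys.
type CounterStore = Map<string, number>;

// All requests from one IP within the same window share one counter key.
function windowKey(ip: string, nowMs: number, windowMs: number): string {
  const windowStart = Math.floor(nowMs / windowMs) * windowMs;
  return `rate:${ip}:${windowStart}`;
}

// Increment the counter for the current window and report whether the
// request is still within the limit.
function allowRequest(
  store: CounterStore,
  ip: string,
  nowMs: number,
  windowMs: number,
  limit: number
): boolean {
  const key = windowKey(ip, nowMs, windowMs);
  const count = (store.get(key) ?? 0) + 1;
  store.set(key, count);
  return count <= limit;
}
```

A read followed by a write is not atomic in KV, so a counter built this way is approximate under concurrent requests.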

R2 Key Layout

sessions/
  YYYY/MM/DD/
    {session_id}/
      meta.json                          # Session metadata (written once)
      events/
        {timestamp}-{event_id}.json      # One file per event/batch (append-only)

Each event write creates a unique R2 key — no read-modify-write, no race conditions. For session replay, list all keys under the session prefix and merge chronologically.
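
The key layout above can be sketched as a pair of key builders plus a sort for replay. This assumes the date prefix is in UTC and that {timestamp} is epoch milliseconds, zero-padded so that lexicographic order matches chronological order; neither detail is confirmed by the layout itself. The function names are hypothetical.

```typescript
// Zero-pad a number to two digits for the YYYY/MM/DD prefix.
const pad2 = (n: number): string => (n < 10 ? "0" + n : String(n));

// Prefix shared by every object belonging to one session (UTC date assumed).
function sessionPrefix(sessionId: string, startedAt: Date): string {
  const y = startedAt.getUTCFullYear();
  const m = pad2(startedAt.getUTCMonth() + 1);
  const d = pad2(startedAt.getUTCDate());
  return `sessions/${y}/${m}/${d}/${sessionId}/`;
}

// Unique key per event write: {timestamp}-{event_id}.json.
// Assumption: timestamp is epoch milliseconds, left-padded to 13 digits.
function eventKey(prefix: string, timestampMs: number, eventId: string): string {
  const ts = ("0000000000000" + String(timestampMs)).slice(-13);
  return `${prefix}events/${ts}-${eventId}.json`;
}

// Replay order: with fixed-width timestamps, a plain string sort of the
// listed keys is also a chronological sort.
function replayOrder(keys: string[]): string[] {
  return [...keys].sort();
}
```

Because every write targets a fresh key, two concurrent event writes for the same session never touch the same object.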

D1 Schema

Four tables are created by migrations/004_tracking_tables.sql:

  • sessions — Visitor sessions with network intelligence (18 columns)
  • events — Granular behavior events: clicks, scrolls, moves, navigation (10 columns)
  • fingerprints — Aggregated fingerprint intelligence for VPN/fraud detection (7 columns)
  • identity_network — Cross-reference graph: fingerprints → IPs → ASNs → countries (6 columns)

Performance indexes cover session lookup by visitor/fingerprint/started_at/ip, event lookup by session/type/timestamp, and identity network lookup by fingerprint/ip/asn.

Background Processing

All heavy I/O (D1 writes, KV counters, R2 storage, VPN scoring) runs inside ctx.waitUntil() so the response returns immediately to the client. Background work is organized in independent try/catch blocks:

  1. Critical — D1 session/fingerprint writes
  2. Identity — Identity network graph
  3. Best-effort — KV counters, VPN scoring, fingerprint updates
  4. Archival — R2 raw event storage
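
The isolation between these blocks can be sketched as a runner that wraps each task in its own try/catch, so a failure in one stage does not prevent the later ones from running. The `BackgroundTask` shape and `runBackground` name are hypothetical, and running the stages sequentially is an assumption; the source does not say whether they run in sequence or in parallel.

```typescript
// One unit of background work; names are illustrative.
interface BackgroundTask {
  name: string;
  run: () => Promise<void>;
}

// Run each task in its own try/catch so one failure cannot abort the rest.
// Returns the names of the tasks that failed, e.g. for logging.
async function runBackground(tasks: BackgroundTask[]): Promise<string[]> {
  const failed: string[] = [];
  for (const task of tasks) {
    try {
      await task.run();
    } catch {
      failed.push(task.name);
    }
  }
  return failed;
}
```

In a Worker, the returned promise would be handed to ctx.waitUntil() so the work continues after the response has been sent.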

Scheduled Jobs

Cron       Job                   Description
0 3 * * *  R2 retention cleanup  Scans R2 by date prefix, deletes objects older than 90 days
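
The age check behind the retention job can be sketched from the date prefix alone, without reading object metadata. This assumes the prefix date is UTC and that "older than 90 days" means more than 90 full days between the prefix date and the current time; the function names are hypothetical and not taken from src/lib/retention.ts.

```typescript
const RETENTION_DAYS = 90;
const DAY_MS = 24 * 60 * 60 * 1000;

// Parse "sessions/YYYY/MM/DD/..." into a UTC midnight timestamp, or null
// if the key does not follow the date-prefixed layout.
function keyDateMs(key: string): number | null {
  const m = /^sessions\/(\d{4})\/(\d{2})\/(\d{2})\//.exec(key);
  if (!m) return null;
  return Date.UTC(Number(m[1]), Number(m[2]) - 1, Number(m[3]));
}

// True when the key's date prefix is more than 90 days before `now`.
function isExpired(key: string, now: Date): boolean {
  const dateMs = keyDateMs(key);
  if (dateMs === null) return false; // leave unrecognized keys alone
  return now.getTime() - dateMs > RETENTION_DAYS * DAY_MS;
}
```

Since the date is encoded in the key, the job can list by prefix and decide on whole days at a time instead of inspecting each object.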