
Workers Analytics Engine — lead submission analytics

Workers Analytics Engine streams lead submission data from the Worker so you can build dashboards via the SQL API. Each data point includes dimensions and metrics derived from LeadPayload (see src/lead.ts): project, source, referer, engagement, suspicion, session metrics, duplicate/bot flags, and UTM, plus form-level fields (CTA and PayloadJson-related). A data point is written on every form submit.

Retention: 3 months. For long-term or full payload detail, continue using D1 LeadSubmissions.PayloadJson (see PayloadJson analytics).

Binding and dataset

| Binding | Dataset | Written from |
| --- | --- | --- |
| LEAD_ANALYTICS | lead_submissions | src/lead.ts → writeLeadSubmission(), and on D1 persistence failure (so submission attempts are still counted) |

The dataset is created automatically on first write. Configured in wrangler.toml under [[analytics_engine_datasets]] (and per env).
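
As a sketch, the wrangler.toml entry looks like the following; the `production` environment name is illustrative, so mirror the block for each environment you deploy:

```toml
# Default environment binding for the Analytics Engine dataset.
[[analytics_engine_datasets]]
binding = "LEAD_ANALYTICS"
dataset = "lead_submissions"

# Repeat per environment (environment name is illustrative).
[[env.production.analytics_engine_datasets]]
binding = "LEAD_ANALYTICS"
dataset = "lead_submissions"
```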

Local development: Analytics Engine is not supported in wrangler dev — the LEAD_ANALYTICS binding is undefined locally, so no data is written. Deploy to a real Worker (e.g. wrangler deploy --env development) to verify Analytics Engine writes and query the SQL API. See cloudflare/workers-sdk#4258.

New leads not appearing in lead-stats: If the dashboard shows old data but new submissions do not appear in /lead-stats, the Worker that handles POST /lead.capture may not have the LEAD_ANALYTICS binding. The Worker logs a warning lead.analytics_engine_binding_missing when the binding is undefined. Ensure the same Worker that serves /lead.capture is deployed with wrangler.toml (or your deploy config) containing [[analytics_engine_datasets]] with binding = "LEAD_ANALYTICS" and dataset = "lead_submissions" for that environment. Redeploy with wrangler deploy --env production (or your env) so the binding is attached.

How to see logs when testing a lead: Worker console.log / console.warn output is not visible unless you stream or collect it:

  1. Real-time in terminal (deployed Worker): From the project directory run:

    bash
    npx wrangler tail

    Leave this running, then submit a test lead to POST /lead.capture. You should see request events and log lines (e.g. [LEAD] LEAD_ANALYTICS binding missing — ... if the binding is missing). Use --env production or --env development to match the deployment you are testing.

  2. Cloudflare Dashboard: Workers & Pages → your Worker (e.g. pulsegate) → Logs or Real-time Logs. Submit a test lead and watch for new entries.

  3. Local (wrangler dev): Logs appear in the terminal where you ran wrangler dev. Note: Analytics Engine is not supported in local dev, so LEAD_ANALYTICS will be undefined and you will see the binding-missing warning; deploy to a real Worker to verify writes.

Schema (data point)

Each submission writes one data point with this layout. Dimensions and metrics align with LeadPayload (src/lead.ts) so analytics can slice by engagement, geo, session, and duplicate/bot.

| Column | Type | Description | LeadPayload field |
| --- | --- | --- | --- |
| timestamp | DateTime | Set by Workers (event time). | |
| _sample_interval | integer | Sampling weight; use in aggregates. | |
| index1 | string | Sampling key: project_name. | project_name |
| blob1 | string | Project name. | project_name |
| blob2 | string | Source. | lead_source |
| blob3 | string | Referer (submission page; max 500 chars). | |
| blob4 | string | CTA name. | (form) |
| blob5 | string | "1" = primary, "0" = follow-up. | |
| blob6 | string | Contact country code. | country_code |
| blob7 | string | Visitor ID. | visitor_id |
| blob8 | string | Lead ID (string). | lead_id |
| blob9 | string | Environment. | dev_environment |
| blob10–14 | string | UTM source, medium, campaign, term, content. | (form) |
| blob15 | string | Engagement label (Cold / Warm / Hot / Very Hot). | engagement_label |
| blob16 | string | User country (geo from CF). | user_country |
| blob17 | string | "1" = duplicate, "0" = new. | is_duplicate |
| blob18 | string | Session ID. | session_id |
| blob19 | string | "1" = bot lead, "0" = not. | is_bot_lead |
| blob20 | string | Event type (Phase 11): primary, follow_up, test, blocked, or bot. Enables lead-stats breakdown by type without D1. | |
| blob21 | | Omitted: Cloudflare Analytics Engine supports a maximum of 20 blobs; user city is not sent for new data. | |
| blob22 | | Omitted: user region is not sent for new data. | |
| double1 | double | Always 1 (count). | |
| double2 | double | Suspicion score (0–100). | suspicion_score |
| double3 | double | Engagement score (0–100). | engagement_score |
| double4 | double | Time to submit (seconds). | time_to_submit |
| double5 | double | Pages visited in session. | pages_visited |
| double6 | double | Session duration (seconds). | session_duration_seconds |
| double7 | double | Return visits (fingerprint). | return_visits |
| double8 | double | VPN suspected score. | vpn_suspected |
| double9 | double | Form submit count (24h). | form_submit_count |

More form fields (e.g. gclid, fbclid, custom): Stored only in D1 LeadSubmissions.PayloadJson; query via SQLite json_extract(PayloadJson, '$.field_name'). See PayloadJson analytics.
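
As a sketch of such a PayloadJson query from a Worker, assuming a D1 binding named DB (not confirmed by this doc) and the LeadSubmissions table described above; payloadFieldCountSql is an illustrative helper, not project code:

```typescript
// Build the query string separately so it can be unit-tested without a D1 binding.
function payloadFieldCountSql(field: string): string {
  // Whitelist simple identifiers so the JSON path cannot be injected into.
  if (!/^[A-Za-z0-9_]+$/.test(field)) throw new Error(`invalid field: ${field}`);
  return (
    `SELECT json_extract(PayloadJson, '$.${field}') AS value, COUNT(*) AS n ` +
    `FROM LeadSubmissions GROUP BY value ORDER BY n DESC LIMIT 50`
  );
}

// In a Worker handler (hypothetical binding name `DB`):
// const { results } = await env.DB.prepare(payloadFieldCountSql("gclid")).all();
```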

Querying (SQL API)

Use the Analytics Engine SQL API with an API token that has Account → Account Analytics → Read.

Base URL:

http
POST https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql
Authorization: Bearer <API_TOKEN>
Content-Type: application/x-www-form-urlencoded

<query in body>

Example: submissions per day (last 7 days)

sql
SELECT
  toDate(timestamp) AS day,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '7' DAY
GROUP BY day
ORDER BY day DESC

Example: submissions by project (last 30 days)

sql
SELECT
  blob1 AS project_name,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND blob1 != ''
GROUP BY blob1
ORDER BY submissions DESC
LIMIT 50

Example: submissions by source and primary vs follow-up

sql
SELECT
  blob2 AS source,
  blob5 AS is_primary,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
GROUP BY blob2, blob5
ORDER BY submissions DESC

Example: average suspicion and engagement by project

sql
SELECT
  blob1 AS project_name,
  SUM(_sample_interval) AS submissions,
  SUM(_sample_interval * double2) / SUM(_sample_interval) AS avg_suspicion,
  SUM(_sample_interval * double3) / SUM(_sample_interval) AS avg_engagement,
  SUM(_sample_interval * double4) / SUM(_sample_interval) AS avg_time_to_submit_sec
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND blob1 != ''
GROUP BY blob1
ORDER BY submissions DESC

Example: time series (submissions per hour, last 24h)

sql
SELECT
  toStartOfHour(timestamp) AS hour,
  blob1 AS project_name,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '1' DAY
GROUP BY hour, blob1
ORDER BY hour DESC, submissions DESC

Example: submissions by UTM campaign (Google Ads)

sql
SELECT
  blob12 AS utm_campaign,
  blob10 AS utm_source,
  blob11 AS utm_medium,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND (blob12 != '' OR blob10 != '')
GROUP BY blob12, blob10, blob11
ORDER BY submissions DESC
LIMIT 50

Example: by engagement label (LeadPayload)

sql
SELECT
  blob15 AS engagement_label,
  SUM(_sample_interval) AS submissions,
  SUM(_sample_interval * double3) / SUM(_sample_interval) AS avg_engagement_score
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND blob15 != ''
GROUP BY blob15
ORDER BY submissions DESC

Example: by user country and duplicate vs new

sql
SELECT
  blob16 AS user_country,
  blob17 AS is_duplicate,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
GROUP BY blob16, blob17
ORDER BY submissions DESC

Example: by event type (lead-stats breakdown, Phase 11)

sql
SELECT
  blob20 AS event_type,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '7' DAY
  AND blob20 != ''
GROUP BY blob20
ORDER BY submissions DESC

Example: by city (Phase 11)

sql
SELECT
  blob21 AS user_city,
  blob16 AS user_country,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND (blob21 != '' OR blob16 != '')
GROUP BY blob21, blob16
ORDER BY submissions DESC
LIMIT 50

Building the dashboard

  1. Token: Create an API token with Account Analytics Read in the Cloudflare dashboard.
  2. Backend proxy (recommended): Call the SQL API from your backend (e.g. dashboard API or Worker) so the token is not exposed to the browser. Your backend stores the token (e.g. secret) and forwards queries to https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql.
  3. Grafana: Use the Workers Analytics Engine Grafana plugin and the same token to build dashboards.
  4. Sampling: In aggregates, always weight by _sample_interval (e.g. SUM(_sample_interval) for counts, SUM(_sample_interval * double2) / SUM(_sample_interval) for averages).
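
A minimal sketch of the backend proxy in step 2: buildSqlApiRequest is an illustrative helper (not part of this project) that assembles the SQL API call shown under Base URL, so only the server ever sees the token:

```typescript
// Assemble the Analytics Engine SQL API request; the SQL statement is the raw body.
function buildSqlApiRequest(
  accountId: string,
  apiToken: string,
  query: string
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiToken}` },
      body: query,
    },
  };
}

// Usage from your backend (env vars as configured for the Worker proxy):
// const { url, init } = buildSqlApiRequest(
//   env.CLOUDFLARE_ACCOUNT_ID,
//   env.CLOUDFLARE_ANALYTICS_ENGINE_API_TOKEN,
//   "SELECT count() FROM lead_submissions"
// );
// const res = await fetch(url, init);
```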

Worker proxy and interactive dashboard

The Worker can query the Analytics Engine SQL API on behalf of the dashboard so the API token never leaves the server. Configure:

  • CLOUDFLARE_ACCOUNT_ID — Your 32-character Cloudflare account ID (e.g. in wrangler.toml vars or as a secret).
  • CLOUDFLARE_ANALYTICS_ENGINE_API_TOKEN — API token with Account → Account Analytics → Read (set via wrangler secret put CLOUDFLARE_ANALYTICS_ENGINE_API_TOKEN).

When both are set, the following GET endpoints return lead submission analytics (query params: days, project, source, engagement_label):

| Endpoint | Description |
| --- | --- |
| /analytics/engine/submissions-daily | Submissions per day (time series) |
| /analytics/engine/by-project | Submissions by project name |
| /analytics/engine/by-source | Submissions by source |
| /analytics/engine/by-engagement | Submissions by engagement label + avg engagement score |
| /analytics/engine/overview | Total submissions, avg suspicion, avg engagement, avg time to submit |
| /analytics/engine/projects | Distinct projects (for filter dropdowns) |
| /analytics/engine/sources | Distinct sources (for filter dropdowns) |
| /analytics/engine/lead-stats | Lead stats (Phase 11.2): time-bucketed counts by type (primary, follow_up, bot, test, blocked); optional filters and group_by. See Lead stats API below. |
| /analytics/engine/lead-stats/summary | Lead stats summary: KPIs for the last 7 days (or ?days=N): total, by type, actual %, top projects, top countries. |

If the token or account ID is missing, these endpoints return 503 with { "error": "Analytics Engine not configured" }. The React dashboard Lead analytics page (/lead-analytics) uses these endpoints to show an interactive dashboard with filters (period, project, source, engagement) and charts (submissions over time, by project, by source, by engagement).

Lead stats API (Phase 11.2)

Reporting timezone: All time buckets (30 min, 60 min, day) are computed in IST (Asia/Kolkata) so marketing can compare day and hour slots in a single timezone.

| Query param | Description |
| --- | --- |
| days | Number of days to query (1–90; default 7). |
| bucket | Time bucket: 30min, 60min, or day (default day). |
| project | Filter by project name (blob1). |
| source | Filter by source (blob2). |
| country | Filter by user country (blob16). |
| city | Filter by user city (blob21). |
| group_by | none (time series), type, project, source, country, or city. |

Response (group_by=none): { "buckets": [ { "bucket": "<datetime>", "total", "primary", "follow_up", "test", "blocked", "bot" } ], "bucket": "day", "days", "timezone": "Asia/Kolkata", "filters"?: { ... } }.

Response (group_by=type|project|source|country|city): { "groups": [ { "key", "key_name", "total", "primary", "follow_up", "test", "blocked", "bot" } ], "group_by", "days", "timezone", "filters"?: { ... } }.

Summary response: { "total", "primary", "follow_up", "test", "blocked", "bot", "actual_percent", "top_projects": [ { "project_name", "total" } ], "top_countries": [ { "country", "total" } ] }.

Implementation details

  • Writer: src/lib/leadAnalyticsEngine.ts → writeLeadSubmissionToAnalyticsEngine(). Called from:
    • writeLeadSubmission() in src/lead.ts after each successful D1 insert (primary/follow-up; event_type = primary | follow_up; includes city/region from payload).
    • After insertBlockedIPLead for blocked-IP rejects (event_type = blocked).
    • After insertTestingLead for test leads (event_type = test).
    • After honeypot (bot) handling in handleLead (event_type = bot; written whether or not the lead is stored in BlockedIPLeads).
  • Optional binding: If LEAD_ANALYTICS is not bound (e.g. in tests), the write is a no-op.
  • Payload_json: The full request body remains in D1 LeadSubmissions.PayloadJson. Analytics Engine only receives the dimensions above (no full JSON). Use D1 for payload-level analytics and long retention.
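
The optional-binding behavior can be sketched like this. The writeDataPoint() call is the standard Analytics Engine binding API; the payload shape, helper names, and the truncated field list are illustrative (the real writer fills all blobs and doubles per the schema table):

```typescript
// Minimal shape of an Analytics Engine dataset binding.
interface AnalyticsEngineDataset {
  writeDataPoint(point: { indexes?: string[]; blobs?: string[]; doubles?: number[] }): void;
}

// Illustrative subset of LeadPayload.
interface LeadLike {
  project_name: string;
  lead_source: string;
  suspicion_score: number;
  engagement_score: number;
}

// Build the data point separately so it can be tested without a binding.
function buildLeadDataPoint(lead: LeadLike) {
  return {
    indexes: [lead.project_name],                      // index1: sampling key
    blobs: [lead.project_name, lead.lead_source],      // blob1, blob2 (remaining blobs elided)
    doubles: [1, lead.suspicion_score, lead.engagement_score], // double1..double3
  };
}

// No-op when LEAD_ANALYTICS is not bound (e.g. in tests or wrangler dev).
function writeLead(binding: AnalyticsEngineDataset | undefined, lead: LeadLike): boolean {
  if (!binding) return false;
  binding.writeDataPoint(buildLeadDataPoint(lead));
  return true;
}
```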

Troubleshooting: “No dataset created” / “I don’t see my data”

  1. No dataset list in the Cloudflare UI
    Workers Analytics Engine does not show a list of datasets in the dashboard. The dataset is created automatically on first write, and you verify data only via the SQL API (or Grafana). There is no “Analytics Engine” page that lists datasets.

  2. Local dev (wrangler dev) does not write to Cloudflare
    Analytics Engine bindings do not send data when you run wrangler dev locally; the binding is not connected to Cloudflare. To get data into the dataset you must either:

    • Deploy the Worker: wrangler deploy or wrangler deploy --env production, then submit a lead through the deployed worker URL, or
    • Use remote dev: wrangler dev --remote so the worker runs on Cloudflare and the binding can write.
  3. Check that data is written
    After deploying, submit a lead, then run a query with the SQL API (API token with Account Analytics Read):

    bash
    curl "https://api.cloudflare.com/client/v4/accounts/YOUR_ACCOUNT_ID/analytics_engine/sql" \
      -H "Authorization: Bearer YOUR_API_TOKEN" \
      -d "SELECT count() FROM lead_submissions"

    If this returns a count ≥ 1, the dataset exists and data is being written. Use the example queries in this doc for dashboards.

Traffic moderation and quality

A traffic moderation layer uses the same lead_submissions dataset to help marketing teams detect bad traffic, suspicious visitors/campaigns/referers, and block candidates. Endpoints under /analytics/moderation/* and aggregations by source, UTM campaign, referer, country, and visitor are available. See Traffic moderation and quality for:

  • Moderation queries (high bot rate, high suspicion, repeat submissions, suspicious referers, VPN-heavy, fast submits, high duplicate rate)
  • Visitor abuse detection and campaign/country risk
  • Moderation dashboard summary and suggested block candidates (rule: suspicion > 80 AND form_submit_24h > 5 AND vpn > 70)
  • Traffic quality metrics (bot rate, duplicate rate, real lead rate) and data aggregations
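
The suggested block-candidate rule above can be expressed as a small predicate; the aggregate field names here are illustrative (in the dataset they correspond to double2, double9, and double8):

```typescript
// Per-visitor aggregates computed from the lead_submissions dataset.
interface VisitorAggregates {
  avgSuspicion: number;   // weighted average of double2 (suspicion score)
  formSubmits24h: number; // double9 (form submit count, 24h)
  vpnScore: number;       // weighted average of double8 (VPN suspected score)
}

// Rule from the moderation docs: suspicion > 80 AND form_submit_24h > 5 AND vpn > 70.
function isBlockCandidate(v: VisitorAggregates): boolean {
  return v.avgSuspicion > 80 && v.formSubmits24h > 5 && v.vpnScore > 70;
}
```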

See also