
Workers Analytics Engine — lead submission analytics

Workers Analytics Engine streams lead submission data from the Worker so you can build dashboards via the SQL API. Each data point includes dimensions and metrics derived from LeadPayload (see src/lead.ts): project, source, referer, engagement, suspicion, session metrics, duplicate/bot flags, and UTM, plus form-level fields (CTA and PayloadJson-related). A data point is written on every form submit.

Retention: 3 months. For long-term or full payload detail, continue using D1 LeadSubmissions.PayloadJson (see PayloadJson analytics).

Binding and dataset

| Binding | Dataset | Written from |
| --- | --- | --- |
| LEAD_ANALYTICS | lead_submissions | src/lead.ts → writeLeadSubmission(), and on D1 persistence failure (so submission attempts are still counted) |

The dataset is created automatically on first write. Configured in wrangler.toml under [[analytics_engine_datasets]] (and per env).
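
As a sketch, the wrangler.toml entry looks like the following; the `production` environment name is illustrative, so mirror the block for each environment you deploy:

```toml
# Default environment binding for the Analytics Engine dataset.
[[analytics_engine_datasets]]
binding = "LEAD_ANALYTICS"
dataset = "lead_submissions"

# Repeat per environment (environment name is illustrative).
[[env.production.analytics_engine_datasets]]
binding = "LEAD_ANALYTICS"
dataset = "lead_submissions"
```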

Local development: Analytics Engine is not supported in wrangler dev — the LEAD_ANALYTICS binding is undefined locally, so no data is written. Deploy to a real Worker (e.g. wrangler deploy --env development) to verify Analytics Engine writes and query the SQL API. See cloudflare/workers-sdk#4258.

New leads not appearing in lead-stats: If the dashboard shows old data but new submissions do not appear in /lead-stats, the Worker that handles POST /lead.capture may not have the LEAD_ANALYTICS binding. The Worker logs a warning lead.analytics_engine_binding_missing when the binding is undefined. Ensure the same Worker that serves /lead.capture is deployed with wrangler.toml (or your deploy config) containing [[analytics_engine_datasets]] with binding = "LEAD_ANALYTICS" and dataset = "lead_submissions" for that environment. Redeploy with wrangler deploy --env production (or your env) so the binding is attached.

How to see logs when testing a lead: Worker console.log / console.warn output is not visible unless you stream or collect it:

  1. Real-time in terminal (deployed Worker): From the project directory run:

    bash
    npx wrangler tail

    Leave this running, then submit a test lead to POST /lead.capture. You should see request events and log lines (e.g. [LEAD] LEAD_ANALYTICS binding missing — ... if the binding is missing). Use --env production or --env development to match the deployment you are testing.

  2. Cloudflare Dashboard: Workers & Pages → your Worker (e.g. pulsegate) → Logs or Real-time Logs. Submit a test lead and watch for new entries.

  3. Local (wrangler dev): Logs appear in the terminal where you ran wrangler dev. Note: Analytics Engine is not supported in local dev, so LEAD_ANALYTICS will be undefined and you will see the binding-missing warning; deploy to a real Worker to verify writes.

Schema (data point)

Each submission writes one data point with this layout. Dimensions and metrics align with LeadPayload (src/lead.ts) so analytics can slice by engagement, geo, session, and duplicate/bot.

| Column | Type | Description | LeadPayload field |
| --- | --- | --- | --- |
| timestamp | DateTime | Set by Workers (event time). | |
| _sample_interval | integer | Sampling weight; use in aggregates. | |
| index1 | string | Sampling key: project_name. | project_name |
| blob1 | string | Project name. | project_name |
| blob2 | string | Source. | lead_source |
| blob3 | string | Referer (submission page; max 500 chars). | |
| blob4 | string | CTA name. | (form) |
| blob5 | string | "1" = primary, "0" = follow-up. | |
| blob6 | string | Contact country code. | country_code |
| blob7 | string | Visitor ID. | visitor_id |
| blob8 | string | Lead ID (string). | lead_id |
| blob9 | string | Environment. | dev_environment |
| blob10–14 | string | UTM source, medium, campaign, term, content. | (form) |
| blob15 | string | Engagement label (Cold / Warm / Hot / Very Hot). | engagement_label |
| blob16 | string | User country (geo from CF). | user_country |
| blob17 | string | "1" = duplicate, "0" = new. | is_duplicate |
| blob18 | string | Session ID. | session_id |
| blob19 | string | "1" = bot lead, "0" = not. | is_bot_lead |
| blob20 | string | Event type (Phase 11): primary, follow_up, test, blocked, or bot. Enables lead-stats breakdown by type without D1. | |
| blob21 | | Omitted: Cloudflare Analytics Engine supports a maximum of 20 blobs; user city is not sent for new data. | |
| blob22 | | Omitted: user region is not sent for new data. | |
| double1 | double | Always 1 (count). | |
| double2 | double | Suspicion score (0–100). | suspicion_score |
| double3 | double | Engagement score (0–100). | engagement_score |
| double4 | double | Time to submit (seconds). | time_to_submit |
| double5 | double | Pages visited in session. | pages_visited |
| double6 | double | Session duration (seconds). | session_duration_seconds |
| double7 | double | Return visits (fingerprint). | return_visits |
| double8 | double | VPN suspected score. | vpn_suspected |
| double9 | double | Form submit count (24h). | form_submit_count |

More form fields (e.g. gclid, fbclid, custom): Stored only in D1 LeadSubmissions.PayloadJson; query via SQLite json_extract(PayloadJson, '$.field_name'). See PayloadJson analytics.
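
As a sketch of such a PayloadJson query from a Worker, assuming a D1 binding named DB (not confirmed by this doc) and the LeadSubmissions table described above; payloadFieldCountSql is an illustrative helper, not project code:

```typescript
// Build the query string separately so it can be unit-tested without a D1 binding.
function payloadFieldCountSql(field: string): string {
  // Whitelist simple identifiers so the JSON path cannot be injected into.
  if (!/^[A-Za-z0-9_]+$/.test(field)) throw new Error(`invalid field: ${field}`);
  return (
    `SELECT json_extract(PayloadJson, '$.${field}') AS value, COUNT(*) AS n ` +
    `FROM LeadSubmissions GROUP BY value ORDER BY n DESC LIMIT 50`
  );
}

// In a Worker handler (hypothetical binding name `DB`):
// const { results } = await env.DB.prepare(payloadFieldCountSql("gclid")).all();
```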

Querying (SQL API)

Use the Analytics Engine SQL API with an API token that has Account → Account Analytics → Read.

Base URL:

http
POST https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql
Authorization: Bearer <API_TOKEN>
Content-Type: application/x-www-form-urlencoded

<query in body>

Example: submissions per day (last 7 days)

sql
SELECT
  toDate(timestamp) AS day,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '7' DAY
GROUP BY day
ORDER BY day DESC

Example: submissions by project (last 30 days)

sql
SELECT
  blob1 AS project_name,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND blob1 != ''
GROUP BY blob1
ORDER BY submissions DESC
LIMIT 50

Example: submissions by source and primary vs follow-up

sql
SELECT
  blob2 AS source,
  blob5 AS is_primary,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
GROUP BY blob2, blob5
ORDER BY submissions DESC

Example: average suspicion and engagement by project

sql
SELECT
  blob1 AS project_name,
  SUM(_sample_interval) AS submissions,
  SUM(_sample_interval * double2) / SUM(_sample_interval) AS avg_suspicion,
  SUM(_sample_interval * double3) / SUM(_sample_interval) AS avg_engagement,
  SUM(_sample_interval * double4) / SUM(_sample_interval) AS avg_time_to_submit_sec
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND blob1 != ''
GROUP BY blob1
ORDER BY submissions DESC

Example: time series (submissions per hour, last 24h)

sql
SELECT
  toStartOfHour(timestamp) AS hour,
  blob1 AS project_name,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '1' DAY
GROUP BY hour, blob1
ORDER BY hour DESC, submissions DESC

Example: submissions by UTM campaign (Google Ads)

sql
SELECT
  blob12 AS utm_campaign,
  blob10 AS utm_source,
  blob11 AS utm_medium,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND (blob12 != '' OR blob10 != '')
GROUP BY blob12, blob10, blob11
ORDER BY submissions DESC
LIMIT 50

Example: by engagement label (LeadPayload)

sql
SELECT
  blob15 AS engagement_label,
  SUM(_sample_interval) AS submissions,
  SUM(_sample_interval * double3) / SUM(_sample_interval) AS avg_engagement_score
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND blob15 != ''
GROUP BY blob15
ORDER BY submissions DESC

Example: by user country and duplicate vs new

sql
SELECT
  blob16 AS user_country,
  blob17 AS is_duplicate,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
GROUP BY blob16, blob17
ORDER BY submissions DESC

Example: by event type (lead-stats breakdown, Phase 11)

sql
SELECT
  blob20 AS event_type,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '7' DAY
  AND blob20 != ''
GROUP BY blob20
ORDER BY submissions DESC

Example: by city (Phase 11)

sql
SELECT
  blob21 AS user_city,
  blob16 AS user_country,
  SUM(_sample_interval) AS submissions
FROM lead_submissions
WHERE timestamp >= NOW() - INTERVAL '30' DAY
  AND (blob21 != '' OR blob16 != '')
GROUP BY blob21, blob16
ORDER BY submissions DESC
LIMIT 50

Building the dashboard

  1. Token: Create an API token with Account Analytics Read in the Cloudflare dashboard.
  2. Backend proxy (recommended): Call the SQL API from your backend (e.g. dashboard API or Worker) so the token is not exposed to the browser. Your backend stores the token (e.g. secret) and forwards queries to https://api.cloudflare.com/client/v4/accounts/{account_id}/analytics_engine/sql.
  3. Grafana: Use the Workers Analytics Engine Grafana plugin and the same token to build dashboards.
  4. Sampling: In aggregates, always weight by _sample_interval (e.g. SUM(_sample_interval) for counts, SUM(_sample_interval * double2) / SUM(_sample_interval) for averages).
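
A minimal sketch of the backend proxy in step 2: buildSqlApiRequest is an illustrative helper (not part of this project) that assembles the SQL API call shown under Base URL, so only the server ever sees the token:

```typescript
// Assemble the Analytics Engine SQL API request; the SQL statement is the raw body.
function buildSqlApiRequest(
  accountId: string,
  apiToken: string,
  query: string
): { url: string; init: { method: string; headers: Record<string, string>; body: string } } {
  return {
    url: `https://api.cloudflare.com/client/v4/accounts/${accountId}/analytics_engine/sql`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${apiToken}` },
      body: query,
    },
  };
}

// Usage from your backend (env vars as configured for the Worker proxy):
// const { url, init } = buildSqlApiRequest(
//   env.CLOUDFLARE_ACCOUNT_ID,
//   env.CLOUDFLARE_ANALYTICS_ENGINE_API_TOKEN,
//   "SELECT count() FROM lead_submissions"
// );
// const res = await fetch(url, init);
```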

Worker proxy and interactive dashboard

The Worker can query the Analytics Engine SQL API on behalf of the dashboard so the API token never leaves the server. Configure:

  • CLOUDFLARE_ACCOUNT_ID — Your 32-character Cloudflare account ID (e.g. in wrangler.toml vars or as a secret).
  • CLOUDFLARE_ANALYTICS_ENGINE_API_TOKEN — API token with Account → Account Analytics → Read (set via wrangler secret put CLOUDFLARE_ANALYTICS_ENGINE_API_TOKEN).

When both are set, the following GET endpoints return lead submission analytics (query params: days, project, source, engagement_label):

| Endpoint | Description |
| --- | --- |
| /analytics/engine/submissions-daily | Submissions per day (time series) |
| /analytics/engine/by-project | Submissions by project name |
| /analytics/engine/by-source | Submissions by source |
| /analytics/engine/by-engagement | Submissions by engagement label + avg engagement score |
| /analytics/engine/overview | Total submissions, avg suspicion, avg engagement, avg time to submit |
| /analytics/engine/projects | Distinct projects (for filter dropdowns) |
| /analytics/engine/sources | Distinct sources (for filter dropdowns) |
| /analytics/engine/lead-stats | Lead stats (Phase 11.2): time-bucketed counts by type (primary, follow_up, bot, test, blocked); optional filters and group_by. See Lead stats API below. |
| /analytics/engine/lead-stats/summary | Lead stats summary: KPIs for the last 7 days (or ?days=N): total, by type, actual %, top projects, top countries. |

If the token or account ID is missing, these endpoints return 503 with { "error": "Analytics Engine not configured" }. The React dashboard Lead analytics page (/lead-analytics) uses these endpoints to show an interactive dashboard with filters (period, project, source, engagement) and charts (submissions over time, by project, by source, by engagement).

Lead stats API (Phase 11.2)

Reporting timezone: All time buckets (30 min, 60 min, day) are computed in IST (Asia/Kolkata) so marketing can compare day and hour slots in a single timezone.

| Query param | Description |
| --- | --- |
| days | Number of days to query (1–90; default 7). |
| bucket | Time bucket: 30min, 60min, or day (default day). |
| project | Filter by project name (blob1). |
| source | Filter by source (blob2). |
| country | Filter by user country (blob16). |
| city | Filter by user city (blob21). |
| group_by | none (time series), type, project, source, country, or city. |

Response (group_by=none): { "buckets": [ { "bucket": "<datetime>", "total", "primary", "follow_up", "test", "blocked", "bot" } ], "bucket": "day", "days", "timezone": "Asia/Kolkata", "filters"?: { ... } }.

Response (group_by=type|project|source|country|city): { "groups": [ { "key", "key_name", "total", "primary", "follow_up", "test", "blocked", "bot" } ], "group_by", "days", "timezone", "filters"?: { ... } }.

Summary response: { "total", "primary", "follow_up", "test", "blocked", "bot", "actual_percent", "top_projects": [ { "project_name", "total" } ], "top_countries": [ { "country", "total" } ] }.

Implementation details

  • Writer: src/lib/leadAnalyticsEngine.ts → writeLeadSubmissionToAnalyticsEngine(). Called from:
    • writeLeadSubmission() in src/lead.ts after each successful D1 insert (primary/follow-up; event_type = primary | follow_up; includes city/region from payload).
    • After insertBlockedIPLead for blocked-IP rejects (event_type = blocked).
    • After insertTestingLead for test leads (event_type = test).
    • After honeypot (bot) handling in handleLead (event_type = bot; written whether or not the lead is stored in BlockedIPLeads).
  • Optional binding: If LEAD_ANALYTICS is not bound (e.g. in tests), the write is a no-op.
  • Payload_json: The full request body remains in D1 LeadSubmissions.PayloadJson. Analytics Engine only receives the dimensions above (no full JSON). Use D1 for payload-level analytics and long retention.
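
The optional-binding behavior can be sketched like this. The writeDataPoint() call is the standard Analytics Engine binding API; the payload shape, helper names, and the truncated field list are illustrative (the real writer fills all blobs and doubles per the schema table):

```typescript
// Minimal shape of an Analytics Engine dataset binding.
interface AnalyticsEngineDataset {
  writeDataPoint(point: { indexes?: string[]; blobs?: string[]; doubles?: number[] }): void;
}

// Illustrative subset of LeadPayload.
interface LeadLike {
  project_name: string;
  lead_source: string;
  suspicion_score: number;
  engagement_score: number;
}

// Build the data point separately so it can be tested without a binding.
function buildLeadDataPoint(lead: LeadLike) {
  return {
    indexes: [lead.project_name],                      // index1: sampling key
    blobs: [lead.project_name, lead.lead_source],      // blob1, blob2 (remaining blobs elided)
    doubles: [1, lead.suspicion_score, lead.engagement_score], // double1..double3
  };
}

// No-op when LEAD_ANALYTICS is not bound (e.g. in tests or wrangler dev).
function writeLead(binding: AnalyticsEngineDataset | undefined, lead: LeadLike): boolean {
  if (!binding) return false;
  binding.writeDataPoint(buildLeadDataPoint(lead));
  return true;
}
```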

Troubleshooting: “No dataset created” / “I don’t see my data”

  1. No dataset list in the Cloudflare UI
    Workers Analytics Engine does not show a list of datasets in the dashboard. The dataset is created automatically on first write, and you verify data only via the SQL API (or Grafana). There is no “Analytics Engine” page that lists datasets.

  2. Local dev (wrangler dev) does not write to Cloudflare
    Analytics Engine bindings do not send data when you run wrangler dev locally; the binding is not connected to Cloudflare. To get data into the dataset you must either:

    • Deploy the Worker: wrangler deploy or wrangler deploy --env production, then submit a lead through the deployed worker URL, or
    • Use remote dev: wrangler dev --remote so the worker runs on Cloudflare and the binding can write.
  3. Check that data is written
    After deploying, submit a lead, then run a query with the SQL API (API token with Account Analytics Read):

    bash
    curl "https://api.cloudflare.com/client/v4/accounts/YOUR_ACCOUNT_ID/analytics_engine/sql" \
      -H "Authorization: Bearer YOUR_API_TOKEN" \
      -d "SELECT count() FROM lead_submissions"

    If this returns a count ≥ 1, the dataset exists and data is being written. Use the example queries in this doc for dashboards.

Traffic moderation and quality

A traffic moderation layer uses the same lead_submissions dataset to help marketing teams detect bad traffic, suspicious visitors/campaigns/referers, and block candidates. Endpoints under /analytics/moderation/* and aggregations by source, UTM campaign, referer, country, and visitor are available. See Traffic moderation and quality for:

  • Moderation queries (high bot rate, high suspicion, repeat submissions, suspicious referers, VPN-heavy, fast submits, high duplicate rate)
  • Visitor abuse detection and campaign/country risk
  • Moderation dashboard summary and suggested block candidates (rule: suspicion > 80 AND form_submit_24h > 5 AND vpn > 70)
  • Traffic quality metrics (bot rate, duplicate rate, real lead rate) and data aggregations
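
The suggested block-candidate rule above can be expressed as a small predicate; the aggregate field names here are illustrative (in the dataset they correspond to double2, double9, and double8):

```typescript
// Per-visitor aggregates computed from the lead_submissions dataset.
interface VisitorAggregates {
  avgSuspicion: number;   // weighted average of double2 (suspicion score)
  formSubmits24h: number; // double9 (form submit count, 24h)
  vpnScore: number;       // weighted average of double8 (VPN suspected score)
}

// Rule from the moderation docs: suspicion > 80 AND form_submit_24h > 5 AND vpn > 70.
function isBlockCandidate(v: VisitorAggregates): boolean {
  return v.avgSuspicion > 80 && v.formSubmits24h > 5 && v.vpnScore > 70;
}
```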

See also