Intent Feed
Capture → ID Match → Enrichment → Segmentation
Purpose
The Intent Feed ingests raw intent signals (cookies, pixels, SDK events, social interactions, commerce activity, and partner feeds), normalizes and matches them to an internal subject ID, enriches with models (interests, risk/fraud, value), and emits persona codes and playbook codes for instant segmentation and activation.
Architecture Overview
Capture: pixels, SDKs, server‑to‑server, partner uploads.
Normalize: schema harmonization, bot/fraud filtering, consent enforcement.
Identity: deterministic/probabilistic stitching → subject_id (internal ID).
Enrich: embeddings, category affinities, price sensitivity, churn/propensity, compliance checks.
Classify: compute persona probabilities → persona codes; map to playbook codes.
Serve: stream to feature store, segments, and decision API; batch sink to warehouse.
Feedback: outcomes loop into model refresh and playbook tuning.
Latency targets: stream p50 < 50ms, p95 < 120ms from capture → enriched record available in the feature store (deployment‑dependent).
Data Sources
Web: first‑party cookies, pageview/search/cart pixels, consent banner states, referrer UTM.
Mobile: app SDK events (screen views, taps, purchases); device class/network only (no raw PII in feed).
Commerce: checkout starts/completes, SKU lines, refunds/returns, subscription changes.
Social & Ads: engagement callbacks (view/click), campaign/line‑item IDs; partner event streams (server‑to‑server).
Partner/Offline: POS batches, loyalty systems, call‑center outcomes, cooperative membership.
Sensitive attributes and special‑category data are not ingested. All sources must carry consent/legitimate‑interest flags; region‑aware enforcement occurs at ingestion.
Identity & Consent
Identifiers Accepted
cookie_id (first‑party), session_id
hashed_email (SHA‑256, lower‑cased, trimmed) when consented
app_instance_id (random app GUID)
account_id / customer_id (first‑party)
partner_subject_key (for clean‑room mapped IDs)
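The hashed_email rule (trim, lower‑case, then SHA‑256) can be sketched as follows. The function name hash_email, the UTF‑8 encoding, and the hex‑digest output format are assumptions for illustration; the feed's actual encoding conventions are not stated here.

```python
import hashlib


def hash_email(raw_email: str) -> str:
    """Trim and lower-case the address, then return its SHA-256 hex digest.

    Hypothetical helper: UTF-8 encoding and hex output are assumptions.
    """
    normalized = raw_email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


# Differently formatted inputs for the same address produce the same digest.
same = hash_email("  Jane.Doe@Example.com ") == hash_email("jane.doe@example.com")
```

Normalizing before hashing matters because SHA‑256 is sensitive to every byte; without it, the same address captured with different casing or whitespace would not match.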
Stitching
Deterministic precedence: account_id > hashed_email > app_instance_id > cookie_id.
Probabilistic backfill: co‑occurrence + device/geo/time windows, capped with confidence thresholds.
Output: subject_id (opaque, internal), linkage graph with decay (edge TTL).
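As a minimal sketch of the deterministic precedence rule, the helper below picks the strongest identifier present on an event. The name pick_anchor and the (type, value) return shape are hypothetical; probabilistic backfill and the linkage graph are out of scope for this example.

```python
from typing import Optional, Tuple

# Precedence from the stitching rule:
# account_id > hashed_email > app_instance_id > cookie_id.
PRECEDENCE = ("account_id", "hashed_email", "app_instance_id", "cookie_id")


def pick_anchor(identifiers: dict) -> Optional[Tuple[str, str]]:
    """Return (identifier_type, value) for the strongest identifier present.

    Returns None when no deterministic identifier is available, in which
    case the event would fall through to probabilistic backfill.
    """
    for key in PRECEDENCE:
        value = identifiers.get(key)
        if value:  # skip missing and empty values
            return key, value
    return None
```

For example, an event carrying both a cookie_id and a hashed_email would be anchored on the hashed_email.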
Consent & Purpose Limiting
Required fields: consent.purpose = [analytics|personalization|ads], consent.region, consent.scope.
Policy engine hard‑blocks enrichment/activation when consent is missing or revoked.
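The hard‑block behavior could look roughly like the check below. This is a sketch under assumptions: the function name may_process is hypothetical, and the "revoked" flag is an invented representation of revocation, since the document does not say how a revoked consent is encoded. Region‑specific rules are not modeled.

```python
from typing import Optional

REQUIRED_CONSENT_FIELDS = ("purpose", "region", "scope")


def may_process(consent: Optional[dict], required_purpose: str) -> bool:
    """Return True only if consent is present, complete, not revoked,
    and covers the purpose the calling stage needs."""
    if not consent:
        return False  # missing consent: hard block
    if any(not consent.get(field) for field in REQUIRED_CONSENT_FIELDS):
        return False  # incomplete consent object: hard block
    if consent.get("revoked", False):  # hypothetical revocation flag
        return False
    return required_purpose in consent["purpose"]
```

The check fails closed: anything other than an explicit, complete grant for the requested purpose blocks the stage.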
Event Normalization
Canonical Intent Event
Required: event_id (ULID), occurred_at (ISO‑8601 UTC), source, event_type, consent, capture_context.
Optional: item (SKU or content), value, currency, metadata (kv), page_context, campaign_context.
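A lightweight pre‑check for the required fields might look like the sketch below. The function name validate_event and the problem strings are hypothetical, and the required list follows the prose above (including capture_context). ULID format checking is omitted.

```python
from datetime import datetime, timedelta

REQUIRED_FIELDS = (
    "event_id", "occurred_at", "source", "event_type", "consent", "capture_context",
)


def validate_event(event: dict) -> list:
    """Return a list of problems; an empty list means these checks passed."""
    problems = [f"missing field: {name}" for name in REQUIRED_FIELDS if name not in event]
    occurred_at = event.get("occurred_at")
    if isinstance(occurred_at, str):
        try:
            # Older Python versions do not parse a trailing "Z",
            # so map it to an explicit UTC offset first.
            parsed = datetime.fromisoformat(occurred_at.replace("Z", "+00:00"))
        except ValueError:
            problems.append("occurred_at is not an ISO-8601 timestamp")
        else:
            # Naive timestamps have no offset and are rejected as well.
            if parsed.utcoffset() != timedelta(0):
                problems.append("occurred_at is not in UTC")
    return problems
```

Returning a list of problems rather than raising on the first one makes it easier to report every defect in a partner batch at once.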
Deduplication: idempotency on (event_id), and near‑duplicate suppression via LSH on (subject_id, event_type, item, ±5s).
Bot/Fraud Filtering: signature/heuristics (JA3, header entropy, rapid‑fire patterns), partner IP allowlists, anomaly scores.
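The two deduplication layers can be illustrated with an in‑memory sketch. Note the substitution: instead of LSH, this example uses exact matching on (subject_id, event_type, item id) within a 5‑second window, which is simpler but catches only identical keys. The class name Deduplicator is hypothetical, and a real stream processor would need bounded state with eviction.

```python
from datetime import datetime, timedelta


class Deduplicator:
    """Exact-key stand-in for the idempotency + near-duplicate checks."""

    def __init__(self, window_seconds: float = 5.0):
        self.window = timedelta(seconds=window_seconds)
        self.seen_event_ids = set()
        self.last_accepted = {}  # (subject_id, event_type, item_id) -> datetime

    def is_duplicate(self, event: dict) -> bool:
        event_id = event["event_id"]
        if event_id in self.seen_event_ids:
            return True  # idempotency on event_id
        self.seen_event_ids.add(event_id)

        item_id = (event.get("item") or {}).get("id")
        key = (event.get("subject_id"), event["event_type"], item_id)
        ts = datetime.fromisoformat(event["occurred_at"].replace("Z", "+00:00"))

        previous = self.last_accepted.get(key)
        if previous is not None and abs(ts - previous) <= self.window:
            return True  # same key within the window: near-duplicate
        # Only accepted events move the window, so a steady trickle of
        # events is not suppressed indefinitely.
        self.last_accepted[key] = ts
        return False
```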
Enrichment Models
Interest Embeddings: session‑aware text/item2vec to derive topical vectors and category affinities.
Propensities: conversion, churn, upgrade, repeat purchase, price sensitivity.
Quality Signals: fraud likelihood, bot probability, complaint risk.
Context Features: time‑bucket responsiveness, device/network cohort performance.
Value Signals: short‑term value (STV) and predicted LTV bands.
Outputs are calibrated and written to the Feature Store keyed by subject_id.
Persona & Playbook Classification
Persona Codes: map $P(\text{persona}_k \mid x)$ to canonical codes (e.g., PER-EXP-dealfirst-1.0).
Assignment strategies: top‑1 with threshold, top‑N mixture, or abstain if uncertainty high.
Playbook Codes: deterministic/ML rules translating enriched features + persona to activation strategies, e.g., PB-TRIAL-NUDGE, PB-REORDER, PB-COMPARE-GRID, PB-LOYALTY-PERK.
Guardrails: fatigue caps, exclusion lists (sensitive categories), fairness constraints.
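The persona assignment strategies can be sketched as two small functions over a probability map. The function names and the threshold values (0.5, 0.1) are illustrative assumptions, not documented defaults. An empty result represents abstention.

```python
from typing import Dict, List


def assign_top1(probs: Dict[str, float], threshold: float = 0.5) -> List[str]:
    """Top-1 with threshold: return the best code, or [] to abstain."""
    if not probs:
        return []
    code, prob = max(probs.items(), key=lambda kv: kv[1])
    return [code] if prob >= threshold else []


def assign_top_n(probs: Dict[str, float], n: int = 2, min_prob: float = 0.1) -> List[str]:
    """Top-N mixture: return up to n codes, dropping very unlikely ones."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    return [code for code, prob in ranked[:n] if prob >= min_prob]


probs = {"PER-EXP-dealfirst-1.0": 0.72, "PER-QLT-craft-1.0": 0.18}
top1 = assign_top1(probs)
mixture = assign_top_n(probs)
```

With these probabilities, top‑1 at a 0.5 threshold yields the single dealfirst code, while raising the threshold above 0.72 would cause the function to abstain.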
Stream Output Record (Post‑Match)
{
"record_id": "01JC6Z8B9Q9K4AS3H7X5S3N1PZ",
"subject_id": "SUB-7f91b1c6",
"occurred_at": "2025-09-21T14:05:37Z",
"source": "web_pixel",
"event_type": "add_to_cart",
"item": {"id": "SKU-1234", "category": "brushes/digital"},
"value": 19.99,
"currency": "USD",
"consent": {"region": "US", "purpose": ["analytics","personalization"], "scope": "first_party"},
"enrichments": {
"affinity": {"categories": [{"id": "digital_art", "score": 0.86}]},
"propensity": {"convert_7d": 0.34, "repeat_30d": 0.21},
"value": {"ltv_band": "M"},
"quality": {"bot_prob": 0.01}
},
"persona": {
"codes": ["PER-EXP-dealfirst-1.0"],
"probs": {"PER-EXP-dealfirst-1.0": 0.72, "PER-QLT-craft-1.0": 0.18}
},
"playbooks": ["PB-TRIAL-NUDGE", "PB-COMPARE-GRID"],
"routing": {"eligible": true, "reasons": ["fatigue_ok","policy_pass"]},
"trace_id": "TRC-5b2a"
}
Schemas
Intent Event (Capture Layer)
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "IntentEvent",
"type": "object",
"required": ["event_id","occurred_at","source","event_type","consent"],
"additionalProperties": true,
"properties": {
"event_id": {"type": "string"},
"occurred_at": {"type": "string", "format": "date-time"},
"source": {"type": "string", "enum": ["web_pixel","mobile_sdk","server","partner_batch"]},
"event_type": {"type": "string"},
"consent": {
"type": "object",
"required": ["region","purpose","scope"],
"properties": {
"region": {"type": "string"},
"purpose": {"type": "array", "items": {"type": "string"}},
"scope": {"type": "string", "enum": ["first_party","clean_room","partner"]}
}
},
"identifiers": {
"type": "object",
"properties": {
"cookie_id": {"type": "string"},
"session_id": {"type": "string"},
"hashed_email": {"type": "string"},
"app_instance_id": {"type": "string"},
"account_id": {"type": "string"},
"partner_subject_key": {"type": "string"}
}
},
"page_context": {"type": "object"},
"campaign_context": {"type": "object"},
"item": {"type": "object"},
"value": {"type": "number"},
"currency": {"type": "string"},
"metadata": {"type": "object"}
}
}
Enriched Intent Record (Post‑Match)
{
"$schema": "https://json-schema.org/draft/2020-12/schema",
"title": "EnrichedIntentRecord",
"type": "object",
"required": ["record_id","subject_id","occurred_at","source","event_type","consent","enrichments"],
"additionalProperties": true,
"properties": {
"record_id": {"type": "string"},
"subject_id": {"type": "string"},
"occurred_at": {"type": "string", "format": "date-time"},
"source": {"type": "string"},
"event_type": {"type": "string"},
"item": {"type": "object"},
"consent": {"type": "object"},
"enrichments": {
"type": "object",
"required": ["affinity","propensity","value","quality"],
"properties": {
"affinity": {"type": "object"},
"propensity": {"type": "object"},
"value": {"type": "object"},
"quality": {"type": "object"}
}
},
"persona": {
"type": "object",
"required": ["codes"],
"properties": {
"codes": {"type": "array", "items": {"type": "string"}},
"probs": {"type": "object"}
}
},
"playbooks": {"type": "array", "items": {"type": "string"}},
"routing": {"type": "object"},
"trace_id": {"type": "string"}
}
}
Playbooks
Definition: executable strategies with guardrails (eligibility, exclusions, exposure caps) and message/offer templates.
Examples:
PB-TRIAL-NUDGE: trigger a low‑friction trial prompt with comparison grid.
PB-REORDER: reorder reminder with last‑purchase context.
PB-LOYALTY-PERK: perk/early access for loyal cohorts.
Playbooks are selected via rules or a policy model that consumes persona probabilities + enrichments.
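A rules‑based selector with guardrails might be sketched as below. Everything specific here is hypothetical: the eligibility rules, the propensity thresholds, the exposure caps, and the shape of the exposures input are assumptions chosen to illustrate the mechanism, not the production rule set. A policy model would replace the matches callables.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Set


@dataclass
class Playbook:
    code: str
    matches: Callable[[dict], bool]  # eligibility rule over the enriched record
    exposure_cap: int                # recent exposures allowed before fatigue applies


def _propensity(record: dict, name: str) -> float:
    return record.get("enrichments", {}).get("propensity", {}).get(name, 0.0)


# Hypothetical rules and caps, for illustration only.
PLAYBOOKS = [
    Playbook("PB-TRIAL-NUDGE", lambda r: _propensity(r, "convert_7d") >= 0.3, 3),
    Playbook("PB-REORDER", lambda r: _propensity(r, "repeat_30d") >= 0.5, 2),
]


def select_playbooks(record: dict, exposures: Dict[str, int],
                     excluded_categories: Set[str]) -> List[str]:
    """Apply the exclusion list first, then eligibility and fatigue caps."""
    category = (record.get("item") or {}).get("category")
    if category in excluded_categories:
        return []
    return [
        pb.code for pb in PLAYBOOKS
        if pb.matches(record) and exposures.get(pb.code, 0) < pb.exposure_cap
    ]
```

Checking exclusions before any rule runs keeps excluded categories from reaching the eligibility logic at all.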
Serving & Interfaces
Streaming: POST /v1/events (capture), GET /v1/stream/subjects/{id} (tail enriched records).
Batch: daily S3/Blob exports (Parquet) partitioned by dt=YYYY‑MM‑DD/region.
Segments: POST /v1/segments/query supports filters such as persona.codes ANY IN [...] and playbooks ANY IN [...].
Decisioning: POST /v1/decide accepts current context and returns ranked actions + playbook recommendations.
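To make the ANY IN filter semantics concrete, the sketch below evaluates them locally against an enriched record. It does not show the request format of /v1/segments/query, which is not specified here. The function names are hypothetical, and combining multiple filters with AND is an assumption.

```python
from typing import Iterable, Optional


def any_in(values: Iterable[str], wanted: Iterable[str]) -> bool:
    """True when at least one element of values appears in wanted."""
    return not set(values).isdisjoint(wanted)


def matches_segment(record: dict,
                    persona_codes: Optional[list] = None,
                    playbooks: Optional[list] = None) -> bool:
    """Evaluate optional ANY IN filters; filters combine with AND (assumed)."""
    if persona_codes is not None:
        if not any_in(record.get("persona", {}).get("codes", []), persona_codes):
            return False
    if playbooks is not None:
        if not any_in(record.get("playbooks", []), playbooks):
            return False
    return True
```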
Quality, Safety & Governance
Compliance: consent gating, purpose limiting, DSAR/erasure; clean‑room join paths for partner data.
Bias/Fairness: exclude protected categories; monitor lift and error parity across non‑sensitive cohorts.
Security: PII vaulting; transport encryption; signed pixel/SDK payloads; replay protection.
Observability: dedupe rate, bot rate, stitch confidence, enrichment latency, model drift.
Retention & TTLs
Raw capture: 30–90 days (configurable) with access controls.
Enriched features: 6–24 months (downsampled); linkage edges decay with inactivity.
Consent logs: retained per regulatory requirements.
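Linkage‑edge decay can be approximated with a simple TTL prune, sketched below. The edge representation (a pair of identifiers mapped to a last_seen timestamp), the function name prune_edges, and the 90‑day TTL in the usage lines are assumptions; the actual decay function is not documented here and may be gradual rather than a hard cutoff.

```python
from datetime import datetime, timedelta, timezone


def prune_edges(edges: dict, now: datetime, ttl: timedelta) -> dict:
    """Keep only edges observed within the TTL.

    edges maps (identifier_a, identifier_b) -> last_seen (timezone-aware).
    """
    return {pair: seen for pair, seen in edges.items() if now - seen <= ttl}


now = datetime(2025, 9, 21, tzinfo=timezone.utc)
edges = {
    ("cookie:c1", "account:a9"): now - timedelta(days=10),
    ("cookie:c2", "account:a9"): now - timedelta(days=200),
}
live = prune_edges(edges, now, ttl=timedelta(days=90))  # hypothetical TTL
```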
Error Handling & Edge Cases
Unmatched events → subject_id = null → queued for backfill (bounded retry) or aggregated at the cohort level.
Low consent scope → enrichment/playbooks omitted; analytics only.
High bot/fraud score → drop or quarantine with routing.eligible=false.
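The three edge cases can be combined into one routing decision, sketched below. The function name route, the returned action labels, the 0.9 bot threshold, the order of the checks, and the reading of "low consent scope" as "no personalization or ads purpose" are all assumptions made for illustration.

```python
def route(record: dict, bot_threshold: float = 0.9) -> str:
    """Return a hypothetical action label for an incoming record."""
    quality = record.get("enrichments", {}).get("quality", {})
    if quality.get("bot_prob", 0.0) >= bot_threshold:
        return "quarantine"  # would be emitted with routing.eligible=false

    if record.get("subject_id") is None:
        return "backfill"  # unmatched: queue for bounded retry

    purposes = set(record.get("consent", {}).get("purpose", []))
    if not purposes & {"personalization", "ads"}:
        return "analytics_only"  # omit enrichment and playbooks

    return "activate"
```

Checking the bot score first means suspected automated traffic is never queued for identity backfill.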
Versioning & Change Management
Schema evolution via MAJOR.MINOR.PATCH; producers/consumers validated in CI.
Backward‑compatible changes favored; deprecations announced with migration guidance.
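A CI check for schema versions could start from something like the sketch below. It assumes the usual semantic‑versioning convention that only a MAJOR bump signals a breaking change; the function names are hypothetical, and pre‑release or build suffixes are not handled.

```python
from typing import Tuple


def parse_version(version: str) -> Tuple[int, int, int]:
    """Parse 'MAJOR.MINOR.PATCH' into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split("."))
    return major, minor, patch


def is_breaking_change(old: str, new: str) -> bool:
    """Treat a change as breaking only when MAJOR increases (assumed policy)."""
    return parse_version(new)[0] > parse_version(old)[0]
```

A pipeline gate could then require migration guidance to be attached whenever is_breaking_change returns True.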
Limitations & Assumptions
Browser privacy controls (ITP/ETP) limit cookie longevity; first‑party capture and server events are recommended.
Identity stitching quality depends on consented identifiers and traffic mix.
Playbook efficacy is context‑dependent; continuous testing is required.

