Overview

An e-learning platform’s lead scoring fell apart because signals from product trials, webinars, and content engagement arrived late and incomplete. Marketing worked from out-of-date scores, SDRs chased poor fits, and duplicates masked real interest. We established a streaming pipeline with Fivetran into Snowflake, stitched identities in Segment to unify trial and marketing profiles, and rebuilt the score in dbt with documented logic and data quality checks. Scores were pushed back to the CRM and marketing automation, with monitoring and alerts. Marketers trusted the rankings, SDR lists improved, and rework on weak opportunities declined.

Client Profile

  • Industry: E-learning platform (B2B SaaS)
  • Company size (range): Growth-stage company with enterprise and self-serve motions
  • Stage: Scaling product-led growth and demand generation
  • Department owner: Marketing & Customer Engagement (Growth Marketing / Marketing Operations)
  • Other stakeholders: Sales Operations, SDR/BDR, RevOps, Product Analytics, Data Engineering, Customer Success, Legal & Privacy, IT/Integrations

The Challenge

Scoring relied on knowing who engaged and how recently. Trial activation and feature use lived in product analytics, webinar attendance came from the webinar platform, and content engagement came from the site and email tools. These signals were batched and landed on different cadences. The weekly scoring job rolled up incomplete data, and the CRM received stale scores. SDRs called into accounts that looked active but hadn’t touched the product in days, while real-time interest from webinars or high-intent content arrived after queues were built.

Identity was inconsistent. Trials used product user IDs and work emails that didn’t always match marketing records, webinar registrations included personal emails, and content engagement carried anonymous IDs that weren’t resolved at the time of the score. Duplicates proliferated when trial sign-ups and content forms both created records. Definitions drifted between spreadsheets, the CRM, and the warehouse, and there were no automated checks to flag missing fields or out-of-date feeds before scores were published.

Why It Was Happening

Batch imports, fragmented identifiers, and manual rules were the norm. Product events were appended in bulk, webinar lists were uploaded on a schedule, and content engagement arrived through exports. Scoring logic lived in workbook tabs with informal weights that changed over time. Without a unified identity graph and a governed, tested model, lead priority reflected what had finally arrived rather than what mattered now. Ownership for definitions and data quality came after the fact, when lists underperformed.

Governance was thin. No freshness or completeness checks guarded the pipeline, and there was no single description of what a score meant. SDRs lacked context on which factors drove the ranking, so feedback didn’t improve the model. Duplicate detection happened by exception, and model runs published even when key sources were missing.

The Solution

We implemented a governed scoring pipeline that ingested engagement signals continuously, unified identities, and calculated scores in code with tests and documentation. Fivetran moved data into Snowflake on a steady cadence; Segment resolved identities across trials, webinars, and marketing; dbt computed a transparent score with data quality checks; and the result synced back to the CRM and marketing automation. Monitoring and Slack alerts surfaced freshness issues before lists reached SDRs. Core tools stayed in place; the new layer standardized timing, identity, and definitions around them.

  • Streaming and incremental ingestion using Fivetran for product, webinar, and marketing sources into Snowflake
  • Identity stitching and trait unification in Segment to link trial users, webinar registrants, and marketing profiles
  • Scoring logic and transformations encoded in dbt with tests for freshness, completeness, accepted values, and duplicate suppression
  • Publishing scores and drivers to the CRM and marketing automation via existing connectors; traits and fields visible to reps and campaigns
  • Operational monitoring and score breakdowns surfaced in Looker with drill-through to evidence
  • SLA alerts to Slack for stale feeds, missing webinar lists, or identity stitching failures
  • Data dictionary and change control for score components, owned by Marketing Ops and RevOps
  • Human-in-the-loop review for low-confidence matches and edge-case duplicates before publishing assignments

Implementation

  • Discovery: Mapped every input to the current score: trial activation and feature events, webinar registrations and attendance, email and site engagement, and CRM fields. Cataloged identifiers and where they diverged, inventoried duplicates, and gathered SDR feedback on false positives and missed opportunities.
  • Design: Defined the score blueprint: factor categories (product intent, event intent, content intent, fit), decay rules, exclusions, and suppression for duplicates. Authored the identity stitching strategy in Segment, field mappings to the CRM and marketing automation, and the data quality checks and alert thresholds. Planned Looker pages for score breakdown and source freshness.
  • Build: Stood up Fivetran connectors for product, webinar, and marketing sources into Snowflake; configured Segment identity resolution and trait enrichment; implemented dbt models for staging, stitching, scoring, and tests; synced scores and driver fields back to the CRM and marketing automation; and added monitoring dashboards and Slack alerts.
  • Testing and QA: Back-tested historical cohorts to compare new scores to outcomes and SDR feedback; validated identity stitching on tricky cases (personal vs. work email, multiple trial users per account); deliberately triggered data quality failures to confirm that alerts fired; and verified score visibility and breakdowns in rep and marketer views.
  • Rollout: Ran the new model in read-only mode alongside the legacy score, sharing comparison lists with SDRs for several cycles. After alignment, switched routing and campaigns to the governed score. Kept a manual exception path for high-priority handoffs with post-publish review.
  • Training and hand-off: Delivered guides for SDRs on score meaning and factors, for Marketing Ops on definitions and change control, and for RevOps on monitoring and troubleshooting. Established a cadence for reviewing factor weights, decay, and new sources.
  • Human-in-the-loop review: Routed low-confidence identity matches and duplicate conflicts to an approver queue. Decisions were stored with rationale and fed back into stitching and dedupe rules.

Results

Lead and account scores reflected real engagement, not last week’s exports. SDRs opened records with clear drivers—trial activity, webinar attendance, or content signals—and could prioritize with confidence. Duplicates were suppressed at the source, and identity stitching linked trial users and marketing profiles, so outreach landed on the right contact with account context.

Operationally, marketing built campaigns on consistent definitions, and score breakdowns matched what the data showed. When a feed lagged, Slack alerts fired before lists were sent. Data quality tests caught stale webinars and missing fields, and the team paused or adjusted without guesswork. Rework on poor fits tapered as routing aligned to governed intent signals and a transparent model.

What Changed for the Team

  • Before: Weekly scoring ran on incomplete exports. After: Fivetran streamed signals into Snowflake and dbt rebuilt scores continuously with tests.
  • Before: Trial, webinar, and content identities didn’t match. After: Segment stitched profiles so scores reflected the whole picture.
  • Before: Weights and rules lived in spreadsheets. After: Scoring logic was encoded and documented in dbt with versioned changes.
  • Before: SDRs lacked context behind a score. After: Score drivers and last-seen signals were visible in the CRM and dashboards.
  • Before: Data issues surfaced after a bad send. After: Looker monitoring and Slack alerts flagged freshness gaps before lists were built.

Key Takeaways

  • Unify signals and identities first; scoring improves when trials, webinars, and content engagement describe the same person or account.
  • Put scoring logic in code with tests; definitions and decay rules should be versioned, monitored, and easy to explain.
  • Instrument freshness and completeness; alerts prevent publishing scores when inputs lag or fields go missing.
  • Publish score drivers, not just a rank; SDRs move faster when they see why a record is hot.
  • Keep your stack—Fivetran, Segment, Snowflake, CRM/MAP—and add governance, monitoring, and a transparent model around them.

FAQ

What tools did this integrate with?
Signals were ingested by Fivetran into Snowflake, identities were stitched and traits unified in Segment, scoring and tests ran in dbt, monitoring and breakdowns were delivered in Looker, and alerts posted to Slack. Scores and drivers synced back to the CRM and marketing automation via existing connectors.

How did you handle quality control and governance?
We encoded scoring logic and decay rules in dbt with tests for source freshness, completeness, accepted values, and duplicate suppression. Segment managed identity resolution with documented mappings and conflict handling. A data dictionary captured definitions and ownership, and change control governed updates to factors and thresholds. Monitoring dashboards and Slack alerts flagged issues before scores published.
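
As an illustration of the kind of gate described above, here is a minimal Python sketch of a freshness and completeness check that blocks publishing when a feed lags or a required field is missing. It is not the dbt test configuration used in the project. The source names, age limits, and required fields are hypothetical, chosen only to show the shape of the check.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical per-source freshness limits; the real thresholds are not stated in this case study.
MAX_AGE = {
    "product_events": timedelta(hours=6),
    "webinar_attendance": timedelta(hours=24),
    "marketing_engagement": timedelta(hours=6),
}

# Assumed required fields for a scoreable row.
REQUIRED_FIELDS = ("email", "account_id")


def stale_sources(last_loaded, now):
    """Return the names of sources that are missing or older than their limit."""
    stale = []
    for source, max_age in MAX_AGE.items():
        loaded_at = last_loaded.get(source)
        if loaded_at is None or now - loaded_at > max_age:
            stale.append(source)
    return sorted(stale)


def incomplete_rows(rows):
    """Return the rows that lack any required field (missing key or empty value)."""
    return [r for r in rows if any(not r.get(f) for f in REQUIRED_FIELDS)]


def can_publish(last_loaded, rows, now):
    """Allow publishing only when every feed is fresh and every row is complete."""
    return not stale_sources(last_loaded, now) and not incomplete_rows(rows)


# Example: the webinar feed is 30 hours old, so publishing is blocked.
now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
last_loaded = {
    "product_events": now - timedelta(hours=1),
    "webinar_attendance": now - timedelta(hours=30),
    "marketing_engagement": now - timedelta(hours=2),
}
rows = [{"email": "a@example.com", "account_id": "A1"}]
blocked_by = stale_sources(last_loaded, now)
```

In a setup like this, the list returned by stale_sources is what an alert message would name, so the team knows which feed to chase before the next list is built.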

How did you roll this out without disruption?
The new model ran in parallel with the legacy score. We compared lists, gathered SDR feedback, and tuned stitching and factors. After alignment, routing and campaigns switched to the governed score. A manual exception path remained for high-priority deals, with post-publish review to fold learnings back into the model.
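
One simple way to compare a legacy list with a new one during a parallel run is to look at how much the top of each ranking overlaps and which leads enter or leave. The Python sketch below is an assumption about how such a comparison could be done, not a record of the comparison actually used. The function names and the sample scores are hypothetical.

```python
def top_n(scores, n):
    """Return the n highest-scoring lead ids, breaking ties by id for stable output."""
    ranked = sorted(scores, key=lambda lead_id: (-scores[lead_id], lead_id))
    return ranked[:n]


def compare_lists(legacy, governed, n):
    """Summarize how the top-n list changes between two score sets (n must be positive)."""
    old = set(top_n(legacy, n))
    new = set(top_n(governed, n))
    return {
        "overlap": len(old & new) / n,   # share of the top-n list that stays the same
        "dropped": sorted(old - new),    # leads only the legacy score prioritized
        "added": sorted(new - old),      # leads only the new score prioritized
    }


# Hypothetical scores for four leads under each model.
legacy = {"a": 90, "b": 80, "c": 70, "d": 10}
governed = {"a": 85, "b": 20, "c": 75, "d": 60}
summary = compare_lists(legacy, governed, n=2)
```

The dropped and added lists are the useful part for SDR review: they are the specific records where the two models disagree and where feedback is most informative.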

How were scoring factors chosen and maintained?
Factors were grouped into product intent (trial activation and feature use), event intent (webinar registration and attendance), content intent (email and site behavior), and fit (firmographic traits). Weights and decay rules were set with Marketing Ops and Sales Ops, documented in the data dictionary, and reviewed on a regular cadence. Changes were versioned in dbt with notes and regression checks.
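
To make the idea of weights and decay concrete, here is a minimal Python sketch of a score that combines the intent categories with an exponential half-life decay and returns the per-factor drivers alongside the total. The project's scoring logic lived in dbt and its actual weights and decay rules are not given here. Every number and name below (WEIGHTS, HALF_LIFE_DAYS, FIT_POINTS, score_lead) is a hypothetical stand-in.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical weights and half-lives (in days) per intent category.
WEIGHTS = {"product": 40.0, "event": 25.0, "content": 15.0}
HALF_LIFE_DAYS = {"product": 7.0, "event": 14.0, "content": 10.0}
FIT_POINTS = 20.0  # assumed flat bonus when firmographic fit criteria are met


def decayed(weight, age_days, half_life_days):
    """Exponential decay: a signal loses half its weight every half-life."""
    return weight * 0.5 ** (max(age_days, 0.0) / half_life_days)


def score_lead(signals, fits, now):
    """Return (score, drivers).

    `signals` maps an intent category to the datetime it was last seen;
    `drivers` records how many points each factor contributed.
    """
    drivers = {}
    for category, seen_at in signals.items():
        age_days = (now - seen_at).total_seconds() / 86400
        points = decayed(WEIGHTS[category], age_days, HALF_LIFE_DAYS[category])
        drivers[category] = round(points, 2)
    if fits:
        drivers["fit"] = FIT_POINTS
    return round(sum(drivers.values()), 2), drivers


# Example: trial activity a week ago, webinar attendance today, good fit.
now = datetime(2024, 1, 15, tzinfo=timezone.utc)
score, drivers = score_lead(
    {"product": now - timedelta(days=7), "event": now},
    fits=True,
    now=now,
)
# score is 65.0; drivers is {"product": 20.0, "event": 25.0, "fit": 20.0}
```

Returning the drivers with the score is what lets a rep see why a record ranks where it does, and keeping the weights in one place makes each change easy to review and version.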

How did you handle duplicates and identity conflicts?
Segment stitched identities using deterministic signals (email, login, account), with conservative fallbacks for ambiguous cases. dbt models applied duplicate suppression before scoring, and low-confidence matches routed to a human reviewer. Decisions updated stitching rules or allowlists, which reduced repeat exceptions.
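
The general idea of deterministic stitching can be sketched in a few lines of Python: records that share a normalized email or a product user ID are merged into one profile, and anything without a shared identifier stays separate. This is not Segment's identity resolution algorithm, only an illustration of the concept. The record shape ("id", "email", "user_id") and the function names are hypothetical.

```python
from collections import defaultdict


def normalize_email(email):
    """Lowercase and trim so the same address always compares equal."""
    return email.strip().lower()


def stitch(records):
    """Group record ids that share a normalized email or a product user id.

    Each record is a dict with an "id" and optional "email" / "user_id" keys.
    Returns a sorted list of sorted id groups.
    """
    parent = {r["id"]: r["id"] for r in records}

    def find(x):
        # Union-find lookup with path halving.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    first_seen = {}  # identifier -> first record id that carried it
    for r in records:
        keys = []
        if r.get("email"):
            keys.append(("email", normalize_email(r["email"])))
        if r.get("user_id"):
            keys.append(("user_id", r["user_id"]))
        for key in keys:
            if key in first_seen:
                # Same identifier seen before: merge the two groups.
                parent[find(r["id"])] = find(first_seen[key])
            else:
                first_seen[key] = r["id"]

    groups = defaultdict(list)
    for r in records:
        groups[find(r["id"])].append(r["id"])
    return sorted(sorted(g) for g in groups.values())


# Example: a CRM record, a trial user, and two webinar registrations.
records = [
    {"id": "crm-1", "email": "Ana@Acme.com"},
    {"id": "trial-9", "email": "ana@acme.com ", "user_id": "u42"},
    {"id": "web-3", "user_id": "u42"},
    {"id": "web-4", "email": "ana@gmail.com"},  # personal email, no shared identifier
]
profiles = stitch(records)
# profiles is [["crm-1", "trial-9", "web-3"], ["web-4"]]
```

The personal-email registration stays unmerged because nothing deterministic links it to the other records. Cases like that are the ones a conservative setup would send to a human reviewer instead of guessing.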
