Overview

An HR technology product management team struggled with inconsistent definitions of success metrics across squads. Post-launch reviews stalled while teams reconciled SQL, event names, and time windows in different dashboards. Intelligex implemented a governed metric catalog in dbt, mapped feature exposure from LaunchDarkly to those definitions, and standardized Mode dashboards against the same layer. Reviews referenced the same definitions, analysis time shortened, and follow-up work became clearer, all without changing the team’s analytics stack or delivery cadence.

Client Profile

  • Industry: HR technology (talent, payroll, and workforce engagement)
  • Company size (range): Multi-product SaaS with cross-functional squads
  • Stage: Established analytics in a cloud warehouse; dashboards built in Mode; feature flags in LaunchDarkly
  • Department owner: Product Management & R&D
  • Other stakeholders: Data/Analytics, Design/UX, Engineering, Customer Success, RevOps, Security/Privacy, Legal/Compliance, Finance

The Challenge

Each team calculated adoption, activation, conversion, and retention differently. Similar measures appeared under different names, and time windows varied by squad. Feature exposure and holdouts were toggled in LaunchDarkly, yet PMs could not tie those flags to consistent cohorts in Mode. Analysts patched definitions per request, which meant two dashboards could disagree on the same concept. Post-launch reviews became debates about SQL and sampling rather than decisions about what to change.

Data existed but was hard to reuse credibly. Product telemetry flowed to the warehouse, Mode dashboards summarized patterns, and LaunchDarkly assigned exposure. Without a governed metric layer, dashboards embedded bespoke queries, event properties lacked a shared glossary, and segment logic drifted over time. Finance and RevOps requested consistent rollups for executive readouts, while squads used custom variants that fit local needs. The result was duplicated analysis and slow alignment.

Teams wanted clarity without replatforming. The organization preferred to keep Mode as the dashboard surface, LaunchDarkly for flags, and the existing warehouse pipelines. The missing piece was a catalog of validated metrics defined in code, linked to exposure, and visible in planning rituals. For the modeling layer, the approach aligned with practices documented by dbt; for feature flag exposure and cohorts, the integration followed patterns documented by LaunchDarkly.

Why It Was Happening

Root causes were fragmented telemetry and ungoverned definitions. Events and properties evolved by squad, and the same behavior was tagged differently across platforms. Mode reports included metric logic inline, which made definitions easy to fork and hard to compare. Exposure logs from LaunchDarkly were not consistently tied to the warehouse personas and segments used in analysis, so cohort filters drifted from test design.

Ownership was diffuse. Data managed pipelines, Product defined success, Engineering instrumented events, and PMMs assembled readouts. No shared workflow enforced metric definitions, effective dating, or change control before a metric appeared in a dashboard or review. When definitions changed, backfills and documentation lagged, and old dashboards continued to circulate.

The Solution

Intelligex implemented a metric catalog and governance flow anchored in dbt models. Metrics were defined in code with names, business logic, windows, segments, and effective dates. LaunchDarkly exposure and holdouts were mapped to the same identities and segments, so dashboards could attribute outcomes to flag states consistently. Mode dashboards pulled measures from curated views rather than embedding SQL, and a lightweight review process governed new or changed definitions. Human-in-the-loop approval ensured changes made sense before they influenced planning.
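
To make the idea concrete, here is a minimal sketch of what a cataloged metric could look like as a dbt model. All model, column, and event names (dim_users, fct_events, core_task_completed, weekly_activation) are hypothetical placeholders, not the client’s actual schema.

```sql
-- models/metrics/weekly_activation.sql
-- Hypothetical sketch: names and event labels are illustrative.
-- Cataloged definition: share of new users who complete the activation
-- event within 7 days of signup, grouped by signup week and segment.

with signups as (
    select
        user_id,
        segment,
        date_trunc('week', signup_at) as signup_week,
        signup_at
    from {{ ref('dim_users') }}
),

activations as (
    select
        user_id,
        min(event_at) as first_activation_at
    from {{ ref('fct_events') }}
    where event_name = 'core_task_completed'  -- glossary-mapped event
    group by user_id
)

select
    s.signup_week,
    s.segment,
    count(*) as signups,
    count(a.user_id) as activated_users,
    count(a.user_id)::float / count(*) as weekly_activation  -- cataloged name
from signups s
left join activations a
    on a.user_id = s.user_id
   and a.first_activation_at <= s.signup_at + interval '7 days'  -- metric window
group by 1, 2
```

The accompanying dbt YAML would carry the description, owner, and effective date, so the documentation travels with the code.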

  • Integrations: Warehouse models and metric catalog in dbt; feature exposure and cohorts from LaunchDarkly; dashboards in Mode; planning links in Jira and Confluence for context and decisions.
  • Canonical metric layer: Named metrics with definitions, time windows, filters, segments, and effective dating captured as code and documentation. Included aliases and legacy names for discoverability.
  • Event and identity crosswalks: Glossary mapping event names and properties to metric inputs; identity stitching aligned users, accounts, and segments for LaunchDarkly exposure and analysis.
  • Validation and data quality: Checks for missing properties, unexpected cardinality shifts, and incompatible windowing; alerts when changes might affect published dashboards.
  • Flag-to-metric mapping: Consistent attribution from LaunchDarkly flags and segments to metric cohorts, including holdouts and mutual exclusion groups where relevant (a sketch follows this list).
  • Mode standardization: Parameterized SQL and curated views in Mode to reference cataloged metrics; dashboard templates with uniform filters and guardrails.
  • Governance workflow: Review and approval steps for new metrics and edits; effective dates and change logs; deprecation notices for legacy variants; documented rationale and links to planning artifacts.
  • Security and access: Role-based permissions separated raw from curated data; lineage preserved from dashboard to definition and source tables.
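
As a concrete illustration of the flag-to-metric mapping above, the sketch below joins ingested LaunchDarkly exposure to a cataloged outcome. The table, flag, and event names (stg_launchdarkly_exposures, new-onboarding-flow, core_task_completed) are hypothetical, and the SQL is a simplified sketch rather than the client’s production logic.

```sql
-- Hypothetical flag-to-metric attribution sketch; names are illustrative.
-- Attributes a cataloged outcome to LaunchDarkly flag variations using
-- exposure already ingested to the warehouse and stitched to user ids.

with exposure as (
    select user_id, variation, first_exposed_at
    from {{ ref('stg_launchdarkly_exposures') }}
    where flag_key = 'new-onboarding-flow'
),

outcomes as (
    select user_id, min(event_at) as first_activation_at
    from {{ ref('fct_events') }}
    where event_name = 'core_task_completed'
    group by user_id
)

select
    e.variation,                      -- e.g. 'treatment' / 'control' / holdout
    count(*) as exposed_users,
    sum(case
            when o.first_activation_at
                 between e.first_exposed_at
                     and e.first_exposed_at + interval '7 days'
            then 1 else 0
        end) as activated_in_window   -- same window as the cataloged metric
from exposure e
left join outcomes o
    on o.user_id = e.user_id
group by 1
```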

Implementation

  • Discovery: Cataloged the most used metrics and their local variants, LaunchDarkly flag structures and exposure logs, and core Mode dashboards. Collected recurring disputes from post-launch reviews and executive readouts.
  • Design: Authored the metric naming convention, definition template, and effective-dating rules. Defined the event and identity crosswalks, validation checks, and flag-to-metric mapping. Designed Mode templates and filter standards. Agreed on approvers and change control.
  • Build: Implemented metric models and documentation in dbt; created identity and event crosswalk tables; ingested LaunchDarkly exposure to the warehouse (see the staging sketch after this list); built curated views for Mode; configured Mode templates and filters; wired notifications for validation alerts and definition changes.
  • Testing/QA: Ran in shadow mode by recreating key dashboards from the catalog while keeping existing reports. Reconciled differences, tuned definitions and crosswalks, and validated exposure attribution. Included a human review panel across Product, Analytics, and Engineering.
  • Rollout: Enabled catalog-backed dashboards for selected product areas. Kept legacy dashboards as a controlled fallback temporarily. Expanded coverage as teams adopted templates and definitions in planning.
  • Training/hand-off: Delivered short sessions for PMs, PMMs, and Analysts on the catalog, definitions, and templates. Updated SOPs for post-launch reviews and experiment readouts. Transferred ownership of definitions, crosswalks, and governance to Product Ops and Data under change control.
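
The exposure ingestion mentioned in the Build step could look like the staging sketch below: dedupe raw exposure logs to the first exposure per user and flag, and stitch LaunchDarkly context keys to warehouse user ids. Source, model, and column names are assumptions for illustration, not the client’s actual pipeline.

```sql
-- models/staging/stg_launchdarkly_exposures.sql
-- Hypothetical staging sketch; source and column names are illustrative.

with raw as (
    select context_key, flag_key, variation, exposed_at
    from {{ source('launchdarkly', 'flag_evaluations') }}
),

stitched as (
    select r.flag_key, r.variation, r.exposed_at, i.user_id
    from raw r
    join {{ ref('identity_crosswalk') }} i
        on i.launchdarkly_context_key = r.context_key
)

-- First exposure per user, flag, and variation; assumes a user's variation
-- is stable per flag (real pipelines handle reassignment explicitly).
select
    user_id,
    flag_key,
    variation,
    min(exposed_at) as first_exposed_at
from stitched
group by 1, 2, 3
```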

Results

Post-launch reviews started with a shared set of definitions rather than competing dashboards. PMs and PMMs filtered Mode templates by feature, segment, and exposure, confident that measures matched the catalog. Discussion time shifted from reconciling SQL to interpreting behavior and agreeing on next steps. Analysts focused on deeper questions because routine rollups came from governed views.

Follow?up work was clearer. When outcomes were under target, teams saw the same contribution analysis and exposure context, which helped prioritize design, onboarding, or performance improvements. Executive readouts and squad reviews referenced the same measures and history. Definitions evolved under change control with effective dating, so comparisons remained interpretable as the product changed.

What Changed for the Team

  • Before: Metrics were defined differently by squad. After: A catalog in dbt governed names, logic, and windows with documentation.
  • Before: Dashboards embedded bespoke SQL. After: Mode templates pulled from curated views aligned to cataloged metrics.
  • Before: Feature exposure was hard to attribute. After: LaunchDarkly flags and cohorts mapped cleanly to metric segments and holdouts.
  • Before: Post-launch reviews debated data prep. After: Reviews referenced shared definitions and focused on decisions.
  • Before: Metric edits broke reports silently. After: Validation, change logs, and effective dating kept dashboards trustworthy.
  • Before: Analysts rebuilt rollups for each meeting. After: Routine analysis came from standardized templates, freeing time for deeper work.

Key Takeaways

  • Define metrics as code; a governed catalog stabilizes planning and review rituals.
  • Map flags to cohorts; consistent attribution from LaunchDarkly exposure improves interpretability.
  • Standardize the surface; Mode templates that reference curated views stop definition drift.
  • Validate continuously; checks and alerts catch breaking changes before reviews do.
  • Version deliberately; effective dating and change logs keep comparisons meaningful as definitions evolve.
  • Integrate, don’t replace; keep the warehouse, LaunchDarkly, and Mode and add a governance layer around them.

FAQ

What tools did this integrate with? Metrics and documentation lived in dbt; feature exposure and segments came from LaunchDarkly; dashboards ran in Mode. The warehouse remained the system of record, and planning links in Jira and Confluence pointed to cataloged definitions and dashboards.

How did you handle quality control and governance? Metric definitions followed a template with owners, logic, filters, and effective dates. Validation checks flagged missing properties, unexpected cardinality shifts, and incompatible windows. Changes required review and approval with documented rationale. Deprecations were announced with fallbacks. Lineage connected dashboards to definitions and source models, and access respected role?based permissions.
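
In dbt, checks like these can be expressed as tests that fail the build before a bad definition reaches a dashboard. The sketch below is a hypothetical singular test; the model, event, and property names are illustrative.

```sql
-- tests/assert_events_have_required_properties.sql
-- Hypothetical dbt singular test: any rows returned fail the build.
-- Flags glossary-required properties that arrive null.

select
    event_name,
    count(*) as events_missing_segment
from {{ ref('fct_events') }}
where event_name in ('core_task_completed', 'signup_completed')
  and segment is null
group by event_name
having count(*) > 0
```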

How did you roll this out without disruption? The catalog ran in shadow mode while legacy dashboards remained. Key reviews compared outcomes and reconciled differences. Catalog-backed templates were introduced by product area, with legacy reports kept as a controlled fallback until teams were comfortable. LaunchDarkly exposure mapping was phased in as cohorts and segments stabilized.

How were feature flags mapped to metrics? LaunchDarkly exposure logs were ingested into the warehouse and joined to stitched identities and segments. The catalog defined which flags and segments applied to each metric and how holdouts were treated. Mode templates included filters that reflected those mappings, so attribution and comparisons were consistent.
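
A Mode template in this setup might look like the sketch below, assuming Mode’s Liquid-style report parameters; the view, flag, and parameter names are illustrative. Because the query reads only from a curated view, the metric logic stays in the catalog rather than in the dashboard.

```sql
-- Hypothetical Mode report query; names and parameters are illustrative.
select
    signup_week,
    variation,
    weekly_activation
from analytics.curated_weekly_activation_by_flag
where flag_key = '{{ flag_key }}'   -- report parameter
  and segment  = '{{ segment }}'    -- report parameter
order by signup_week
```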

How were legacy metrics and dashboards handled? The catalog included aliases and cross-references to legacy names. Deprecation notices pointed users to the new definitions and templates. Where backfills were required, effective dating preserved historical interpretations while aligning future calculations to the catalog.
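
Effective dating can be sketched as versioned definitions joined on date ranges, so each period is measured by the definition that was in force. The event names and dates below are hypothetical placeholders.

```sql
-- Hypothetical effective-dating sketch; events and dates are illustrative.
with definition_versions as (
    select 'legacy_activation_event' as activation_event,
           date '1900-01-01' as effective_from,
           date '2023-12-31' as effective_to
    union all
    select 'core_task_completed',
           date '2024-01-01',
           date '9999-12-31'
)

select
    e.user_id,
    e.event_at
from {{ ref('fct_events') }} e
join definition_versions d
    on e.event_name = d.activation_event
   and cast(e.event_at as date) between d.effective_from and d.effective_to
```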

What about privacy and access to detailed data? PM-facing templates exposed aggregates by default. Access to raw events and identity tables remained limited to data roles. Lineage and documentation clarified what each measure included, and sensitive identifiers were masked in curated views, aligning with existing privacy controls.

Need a similar solution?

Get a FREE Proof of Concept & Consultation

No Cost, No Commitment!