Intelligex | Protected Health Data Reports To De-Identified Analytics

Note: This example reflects the types of solutions Intelligex can deliver. Actual engagements are tailored to each client’s goals, systems, constraints, timeline, and resources, so implementation details and outcomes may vary.

Overview

Telemedicine expansion planning at a healthcare system stalled because reports were compiled manually and contained protected health information (PHI) that could not be shared broadly. Analysts redacted spreadsheets by hand, definitions varied by service line, and executive reviews slowed under privacy constraints. Intelligex implemented a de-identified analytics pipeline on Google Cloud with Data Loss Prevention, delivered cohort-level summaries in Looker, and gated access with Okta-backed roles. Strategy received safe, aggregated insights without ad hoc redactions, and reviews moved faster because every view drew from the same governed dataset and policy controls.

Client Profile

Industry: Integrated healthcare system (ambulatory, acute, and virtual care)
Company size (range): Multi-hospital, multi-clinic network
Stage: Scaling telemedicine capacity across regions and specialties
Department owner: Strategy, Analytics & Executive Leadership (Enterprise Strategy / Population Health)
Other stakeholders: Clinical Operations, Compliance & Privacy, Information Security, IT/Data Engineering, Finance, Service Line Leadership, Access & Scheduling

The Challenge

Planning required a single view of demand, capacity, patient access, and outcomes across specialties and regions. In practice, extracts came from the electronic health record, scheduling, claims, and contact center systems, then landed in spreadsheets with patient identifiers and free-text notes. Analysts manually redacted and aggregated data before sharing, and caveats followed every deck because definitions and redaction practices were inconsistent. Questions about modality mix, no-show risk, and network leakage were answered with differing methods by service line, complicating capacity and investment choices.

Compliance and security teams were cautious, and rightly so. Reports circulated via email, access controls varied by team, and there was no standardized approach to de-identification or audience-specific aggregation. Leaders did not want to replatform clinical systems or disrupt current BI tools; they needed a way to generate safe, policy-aligned summaries for strategic planning while preserving the ability for authorized reviewers to inspect underlying signals when necessary.

Why It Was Happening

PHI flowed into planning artifacts because upstream pipelines lacked de-identification and access tiering. Different teams defined cohorts differently and used separate cut-off dates. Manual redaction created version sprawl and invited mistakes, and there was no durable linkage from a planning chart back to the policy that governed its creation. Without a shared data contract for telemedicine metrics, debates over method overshadowed decisions.

Governance sat at the end of the process. Privacy reviews happened on final slides, not at ingestion. Access was granted via file shares instead of role-based policies. When exceptions arose, such as the need to drill into a subset for quality review, there was no structured human-in-the-loop path, so analysts rebuilt extracts, increasing risk and workload.

The Solution

We built a de-identified analytics pipeline on Google Cloud Platform (GCP) that ingests clinical and operational extracts, classifies and transforms sensitive fields using Google Cloud Data Loss Prevention (DLP), and publishes cohort-level models in BigQuery for use in Looker. Policies enforced aggregation thresholds and suppression rules aligned to HIPAA de-identification guidance, with a human-in-the-loop review for edge cases. Okta-backed roles controlled who could see which explores, and all drill-throughs remained at safe levels for planning use. Existing systems stayed in place; the new layer orchestrated de-identification, aggregation, and access controls around them.

Ingestion of clinical, scheduling, and contact center extracts into secure landing zones; curated models in BigQuery
Classification and de-identification using Google Cloud Data Loss Prevention (tokenization, masking, bucketing, and suppression thresholds)
Aggregation logic aligned to HIPAA de-identification methods and internal policy, referencing HHS guidance on de-identification
Encryption and key management for sensitive staging areas using Cloud Key Management Service
Cohort definitions and metric logic encoded in data transformations (visit modality, time-to-appointment, no-show propensity, leakage indicators)
Certified Looker explores and dashboards with aggregate views and row/column-level policies
Okta-backed group mapping to analytical roles for planners, clinicians, and compliance reviewers
Audit trail for de-identification actions, aggregation thresholds, and dashboard access
Human-in-the-loop privacy review for new fields, small cohorts, or exception drill requests
Data catalog tags and lineage to track metric definitions and refresh windows
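
The aggregation thresholds and suppression rules listed above can be illustrated with a minimal sketch. It is plain Python rather than the BigQuery or Looker logic used in the engagement, and the threshold value, field names, and sample rows are all assumptions made for illustration.

```python
from collections import Counter

MIN_CELL_SIZE = 11  # hypothetical threshold; the real value is set by policy


def cohort_counts(rows, keys):
    """Count rows per cohort, where a cohort is the tuple of values for `keys`."""
    return Counter(tuple(row[k] for k in keys) for row in rows)


def suppress_small_cells(counts, min_size=MIN_CELL_SIZE):
    """Replace any count below `min_size` with None so it is never published."""
    return {cohort: (n if n >= min_size else None) for cohort, n in counts.items()}


# Illustrative visit rows that have already been de-identified.
visits = (
    [{"region": "North", "modality": "video"}] * 40
    + [{"region": "North", "modality": "phone"}] * 3
    + [{"region": "South", "modality": "video"}] * 15
)

published = suppress_small_cells(cohort_counts(visits, ["region", "modality"]))
```

Here the North/phone cohort (3 visits) is withheld while the other two counts are published. If row or column totals are published alongside the cells, a single suppressed value can sometimes be recovered by subtraction, so a production rule set would likely also need complementary suppression.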

Implementation

Discovery: Mapped source systems, extract formats, and cadence; identified PHI/PII fields and common free-text patterns; cataloged current telemedicine metrics; and reviewed privacy policies and prior audit findings with Compliance and Security.
Design: Defined cohort and metric schemas with shared calendars; authored DLP profiles, transformation rules, and suppression thresholds for small cell sizes; designed role tiers and access policies; and outlined Looker explores for Strategy, service lines, and compliance reviewers.
Build: Stood up secure GCP landing zones and BigQuery datasets; implemented DLP inspection and de-identification templates; encoded cohort and metric logic in transformations; configured Looker explores and access filters; and integrated Okta groups to analytics roles.
Testing and QA: Ran historical data through the pipeline, validated de-identification and aggregation against policy, and performed re-identification risk checks on small cohorts. Reconciled metric outputs with legacy reports and tuned suppression rules to reduce noise while preserving utility.
Rollout: Launched read-only aggregate dashboards to Strategy and service line leaders while legacy processes continued. After validation, made certified Looker views the source for planning, with an exception path for privacy-reviewed drill-ins.
Training and hand-off: Delivered short guides for analysts on cohort definitions and safe filters, for Strategy on reading aggregates, and for Compliance on audit logs and exception review. Assigned stewardship for metric definitions, DLP profiles, and role mapping with a review cadence.
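
The re-identification risk checks mentioned under Testing and QA could take several forms. One common approach, shown here as a sketch and not necessarily the method used in this engagement, is a k-anonymity style check: group records by their quasi-identifiers and flag any combination shared by fewer than k records. The field names, sample rows, and the choice of k are assumptions.

```python
from collections import Counter


def risky_groups(rows, quasi_identifiers, k):
    """Return quasi-identifier combinations shared by fewer than k rows."""
    sizes = Counter(tuple(row[q] for q in quasi_identifiers) for row in rows)
    return {combo: n for combo, n in sizes.items() if n < k}


# Illustrative de-identified rows; all values are made up.
rows = [
    {"age_band": "30-39", "zip3": "021", "specialty": "cardiology"},
    {"age_band": "30-39", "zip3": "021", "specialty": "cardiology"},
    {"age_band": "30-39", "zip3": "021", "specialty": "dermatology"},
    {"age_band": "80-89", "zip3": "021", "specialty": "cardiology"},
]

flagged = risky_groups(rows, ["age_band", "zip3"], k=2)
```

In this sample only the ("80-89", "021") combination is flagged, because a single record carries it. Flagged combinations are candidates for coarser bucketing, suppression, or the human-in-the-loop review.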

Results

Planning moved from redacting spreadsheets to working from certified aggregate dashboards. Strategy, service lines, and Finance reviewed the same cohort definitions for access, demand, and utilization. Privacy concerns no longer stalled discussions because PHI was removed upstream and policies were enforced consistently. When a deeper look was warranted, a quick, documented review allowed a controlled drill without rebuilding extracts.

Leaders made decisions with clearer evidence: where to expand telemedicine slots, which specialties required hybrid staffing, and how access metrics changed by region and modality. Fewer handoffs and less email sprawl reduced rework, and Compliance had confidence that analytics stayed within policy thanks to audit trails and access controls.

What Changed for the Team

Before: Manual redactions in spreadsheets delayed sharing. After: De-identified, aggregated views were available in Looker by role.
Before: Cohort definitions varied by service line. After: Metrics and cohorts were encoded centrally and applied consistently.
Before: Privacy reviews happened on finished decks. After: DLP and suppression rules operated at ingestion with audit trails.
Before: Access was managed via ad hoc file shares. After: Okta groups governed access tiers with tracked permissions.
Before: Exceptions required bespoke extracts. After: A human-in-the-loop path enabled controlled drill-ins without exposing PHI.

Key Takeaways

De-identify at the source and enforce aggregation thresholds; privacy should be built into the pipeline, not applied to slides.
Standardize cohort definitions and calendars so service lines compare results on like terms.
Use role-based access via identity groups to tailor safe views for Strategy, clinical leaders, and compliance reviewers.
Keep clinical systems and BI tools; layer DLP, aggregation, and governance to make existing assets safe for planning.
Maintain auditability for de-identification actions, metric changes, and access; governance earns trust and speeds reviews.

FAQ

What tools did this integrate with?
We used Google Cloud Data Loss Prevention for PHI classification and de-identification, BigQuery for curated models, and Looker for certified, aggregate dashboards. Access was governed by Okta groups mapped to analytics roles. Clinical systems and scheduling sources remained unchanged; extracts flowed into the pipeline on a set cadence.

How did you handle quality control and governance?
We defined cohort and metric logic centrally and encoded them in transformations. DLP policies masked, tokenized, or suppressed sensitive fields, with small-cell suppression to prevent inadvertent re-identification. A human-in-the-loop review handled exceptions and new fields. Audit logs captured de-identification actions, dataset versions, and dashboard access, and policies aligned to HIPAA de-identification guidance.
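
As a sketch of what defining metric logic centrally can look like, the example below puts two metrics in one place so every service line computes them the same way. The definitions (median days from request to scheduled date, and no-shows as a share of all scheduled appointments), the field names, and the sample data are assumptions for illustration, not the client's actual definitions.

```python
from datetime import date
from statistics import median


def days_to_appointment(appt):
    """Days between the request date and the scheduled date."""
    return (appt["scheduled"] - appt["requested"]).days


def median_time_to_appointment(appointments):
    """Median wait in days across a cohort of appointments."""
    return median(days_to_appointment(a) for a in appointments)


def no_show_rate(appointments):
    """No-shows as a share of all scheduled appointments in the cohort."""
    no_shows = sum(1 for a in appointments if a["status"] == "no_show")
    return no_shows / len(appointments)


# Illustrative cohort; dates and statuses are made up.
appointments = [
    {"requested": date(2024, 1, 2), "scheduled": date(2024, 1, 9), "status": "completed"},
    {"requested": date(2024, 1, 3), "scheduled": date(2024, 1, 5), "status": "no_show"},
    {"requested": date(2024, 1, 4), "scheduled": date(2024, 1, 18), "status": "completed"},
    {"requested": date(2024, 1, 5), "scheduled": date(2024, 1, 8), "status": "completed"},
]

wait = median_time_to_appointment(appointments)
rate = no_show_rate(appointments)
```

Because the definitions live in one module, a change such as excluding cancelled visits from the denominator is made once and applies to every dashboard that uses the metric.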

How did you roll this out without disruption?
We ran dashboards in read-only mode alongside existing spreadsheets and validated that aggregates matched legacy outputs where policy allowed. After stakeholders were comfortable, certified Looker views became the planning source. Teams kept their tools; the pipeline added safe de-identification, aggregation, and access controls around them.

How did you ensure HIPAA compliance and safe sharing?
DLP inspection detected PHI and applied de-identification templates. Aggregation thresholds and suppression rules enforced safe cohort sizes, and identifiable fields never left secure staging zones. Access to dashboards was restricted by Okta-backed roles, and drill-throughs remained within approved aggregation levels. Compliance reviewed and approved policies and monitored audit logs.
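
To make the de-identification step more concrete, here is a minimal sketch of three transform types named in this case study: tokenization, masking, and bucketing. It uses only the Python standard library as a stand-in and does not call the Cloud DLP API. The key, field names, and sample record are hypothetical.

```python
import hashlib
import hmac

# Hypothetical demo key; a real key would be held in a key management service.
SECRET_KEY = b"demo-key-not-for-production"


def tokenize(value, key=SECRET_KEY):
    """Keyed, deterministic token: equal inputs give equal tokens for joins."""
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()[:16]


def mask_zip(zip_code):
    """Keep the first three digits and mask the rest."""
    return zip_code[:3] + "*" * (len(zip_code) - 3)


def age_band(age):
    """Bucket age into ten-year bands, top-coded at 90+."""
    if age >= 90:
        return "90+"
    low = (age // 10) * 10
    return f"{low}-{low + 9}"


def deidentify(record):
    """Build a new record holding only transformed or non-identifying fields."""
    return {
        "patient_token": tokenize(record["mrn"]),
        "zip_masked": mask_zip(record["zip"]),
        "age_band": age_band(record["age"]),
        "modality": record["modality"],
    }


safe = deidentify({"mrn": "12345", "zip": "02139", "age": 93, "modality": "video"})
```

The output record is built field by field rather than by deleting columns from the input, so a new source column stays out of the analytics layer until someone adds it explicitly. Whether three ZIP digits are acceptable depends on policy and on the population of the area, so that rule in particular should be treated as a placeholder.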

How were roles and permissions managed?
Okta groups mapped to analytics roles (Strategy, service line leadership, compliance reviewer). Looker enforced row and column policies tied to these groups, and BigQuery datasets used role-based permissions. Requests for elevated access followed a documented workflow with Compliance approval, and all grants were time-bounded and logged.
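
The group-to-role mapping and time-bounded grants described above might be modeled roughly as follows. This is a pure-Python sketch that does not call the Okta or Looker APIs. The group names, role names, users, and seven-day default are hypothetical.

```python
from datetime import datetime, timedelta, timezone

# Hypothetical group and role names; real ones come from the identity provider.
GROUP_TO_ROLE = {
    "strategy-planners": "aggregate_viewer",
    "service-line-leaders": "service_line_viewer",
    "compliance-reviewers": "compliance_reviewer",
}


def grant_elevated(user, role, approved_by, now, days=7):
    """Record a time-bounded grant; the approver is kept for the audit trail."""
    return {
        "user": user,
        "role": role,
        "approved_by": approved_by,
        "granted_at": now,
        "expires_at": now + timedelta(days=days),
    }


def effective_roles(user, groups, grants, now):
    """Roles from group membership plus any unexpired elevated grants."""
    roles = {GROUP_TO_ROLE[g] for g in groups if g in GROUP_TO_ROLE}
    roles |= {g["role"] for g in grants if g["user"] == user and now < g["expires_at"]}
    return roles


start = datetime(2024, 3, 1, tzinfo=timezone.utc)
grants = [grant_elevated("avery", "drill_in_reviewer", "compliance-lead", start)]
during = effective_roles("avery", ["strategy-planners"], grants, start + timedelta(days=3))
after = effective_roles("avery", ["strategy-planners"], grants, start + timedelta(days=8))
```

Three days into the grant the user holds both roles; on day eight the elevated role has lapsed without anyone having to revoke it. Expiry is evaluated at read time, which keeps the grant list append-only and easy to audit.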

Department/Function: Analytics & Executive Leadership; IT & Infrastructure; Legal & Compliance; Strategy

Capability: AI Security Privacy & Governance

