Active Development

Fraud Detection & Anomaly Analysis

Finding the needles in the haystack: Statistical, geographic, and network-based techniques to identify billing outliers, prescribing anomalies, and suspicious provider patterns in 100M+ Medicare records.

The Problem

Annual Fraud: $60B-100B
Of Total Spending: 3-10%
Claims Per Year: 1B+

Healthcare fraud costs taxpayers tens of billions of dollars annually, but detecting it is like finding needles in a haystack. Traditional approaches rely on tips, audits, and post-payment review, which makes them reactive, slow, and expensive.

The opportunity: CMS publishes detailed utilization data on every Medicare provider. While individual claims are protected, aggregated provider-level statistics reveal patterns. Statistical outliers, geographic anomalies, and suspicious networks can be detected using public data alone.

Common Fraud Schemes:

  • Pill mills: Prescribing opioids without medical justification
  • DME fraud: Billing for equipment never delivered
  • Upcoding: Billing for more expensive services than provided
  • Phantom billing: Billing for services never rendered
  • Kickback schemes: Illegal referral arrangements between providers
  • COVID testing scams: Excessive or fake testing during the pandemic

Five Detection Approaches

Multiple analytical techniques, each revealing different fraud patterns

📊

Statistical Outlier Detection

Providers billing significantly above peers

Calculate z-scores for key metrics (total payment, services per beneficiary, average charge) and flag providers more than 3 standard deviations above their specialty/geography peers.

Red Flags:

  • Top 1% Medicare payment by specialty (potential upcoding)
  • Services per beneficiary 3+ std dev above mean (volume fraud)
  • High beneficiary churn rate (doctor shopping enabler)
  • Unusual service mix (billing procedures outside specialty)

Data: Medicare Part B utilization by provider, specialty benchmarks
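The z-score rule described above can be sketched in a few lines. This is a minimal illustration, not the project's actual pipeline: the record fields (npi, specialty, state, total_payment), the choice of specialty plus state as the peer group, and the min_peers cutoff are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_outliers(providers, metric, threshold=3.0, min_peers=30):
    """Return (npi, z) pairs for providers whose `metric` sits more than
    `threshold` standard deviations above their peer-group mean."""
    # Peer group = same specialty and state (an assumed grouping).
    groups = defaultdict(list)
    for p in providers:
        groups[(p["specialty"], p["state"])].append(p)

    flagged = []
    for peers in groups.values():
        if len(peers) < min_peers:
            continue  # too few peers for a meaningful benchmark
        values = [p[metric] for p in peers]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # every peer is identical; nothing to flag
        for p in peers:
            z = (p[metric] - mu) / sigma
            if z > threshold:
                flagged.append((p["npi"], round(z, 2)))
    return flagged
```

Calling flag_outliers(rows, "total_payment") returns only providers above the threshold within a sufficiently large peer group. Payment data tend to be right-skewed, so a log transform or a median-based score may behave better in practice; the methodology above does not specify that choice.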

💊

Prescribing Pattern Analysis

Opioid mills and kickback indicators

Analyze Part D prescriber data to identify dangerous or suspicious prescribing behavior: excessive opioids, brand-name preference without justification, or single-drug specialists.

Red Flags:

  • Opioid prescribing rate > 95th percentile (pill mill indicator)
  • Brand-name preference > 90% (kickback suspicion)
  • Single-drug specialist (99% of claims for one medication)
  • Contraindicated drug combinations (dangerous or fraudulent)
  • High-dose oxycodone without cancer diagnosis

Data: Part D prescriber by drug, opioid flags, Open Payments pharma relationships
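Three of the red flags above (opioid rate above the 95th percentile, brand-name preference above 90%, and 99% of claims for one drug) reduce to simple per-prescriber shares. The sketch below assumes drug-level rows with hypothetical fields npi, drug, is_brand, is_opioid, and claims; it compares opioid rates across all prescribers passed in, whereas a real analysis would presumably compare within specialty.

```python
from collections import defaultdict
from statistics import quantiles


def prescriber_profiles(rows):
    """Aggregate drug-level rows into per-prescriber claim shares."""
    totals = defaultdict(
        lambda: {"claims": 0, "opioid": 0, "brand": 0, "by_drug": defaultdict(int)}
    )
    for r in rows:
        t = totals[r["npi"]]
        t["claims"] += r["claims"]
        t["opioid"] += r["claims"] if r["is_opioid"] else 0
        t["brand"] += r["claims"] if r["is_brand"] else 0
        t["by_drug"][r["drug"]] += r["claims"]

    profiles = {}
    for npi, t in totals.items():
        if t["claims"] == 0:
            continue
        profiles[npi] = {
            "opioid_rate": t["opioid"] / t["claims"],
            "brand_share": t["brand"] / t["claims"],
            "top_drug_share": max(t["by_drug"].values()) / t["claims"],
        }
    return profiles


def prescribing_flags(profiles, brand_cutoff=0.90, single_drug_cutoff=0.99):
    """Map each flagged NPI to the list of red flags it triggers."""
    rates = [p["opioid_rate"] for p in profiles.values()]
    # 95th percentile of opioid rates; needs at least two prescribers.
    p95 = quantiles(rates, n=20, method="inclusive")[-1] if len(rates) >= 2 else float("inf")

    flags = {}
    for npi, p in profiles.items():
        reasons = []
        if p["opioid_rate"] > p95:
            reasons.append("opioid_rate_above_p95")
        if p["brand_share"] > brand_cutoff:
            reasons.append("brand_preference")
        if p["top_drug_share"] >= single_drug_cutoff:
            reasons.append("single_drug")
        if reasons:
            flags[npi] = reasons
    return flags
```

The cutoffs mirror the thresholds listed above and are exposed as parameters so they can be tuned. Low-volume prescribers can hit the single-drug rule by chance, so a minimum claim count would be a sensible addition.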

🗺️

Geographic Clustering

Fraud hotspots by county/zip code

Map provider density, claim volume, and payment by geography. Identify zip codes with unusual patterns: too many providers for population, sudden spikes, or known fraud regions.

Red Flags:

  • Provider density far out of proportion to population (fraud factories)
  • Claim volume spikes (sudden increase in one area)
  • Known hotspots (South Florida, Southern California DME clusters)
  • Billing from residential addresses (phantom clinics)
  • Interstate patient flows (long-distance fraud rings)

Visualization: Heatmap of providers per capita, claim volume timeline by county
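The "too many providers for population" check can be illustrated as a providers-per-capita ratio compared against the typical ZIP code. In this sketch the inputs are two plain dictionaries keyed by ZIP code, and the 5x-median rule and the minimum population are assumed values rather than thresholds taken from the methodology.

```python
from statistics import median


def density_outliers(provider_counts, population, ratio=5.0, min_population=1000):
    """Return {zip: providers per 10k residents} for ZIP codes whose density
    exceeds `ratio` times the median density."""
    # Skip sparsely populated ZIPs, where per-capita rates are unstable.
    density = {
        z: 10_000 * provider_counts.get(z, 0) / pop
        for z, pop in population.items()
        if pop >= min_population
    }
    if not density:
        return {}
    typical = median(density.values())
    if typical == 0:
        return {}
    return {z: round(d, 1) for z, d in density.items() if d > ratio * typical}
```

Commercial districts and medical campuses legitimately concentrate providers, so a high ratio is a prompt for a closer look rather than a finding.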

🔗

Network Analysis

Referral rings and kickback schemes

Build provider referral networks from facility affiliations and shared patients. Detect circular referrals, exclusive relationships, and shell company patterns.

Red Flags:

  • Circular referrals (A → B → C → A)
  • Exclusive relationships (provider refers to only one facility and holds equity in it)
  • Shared addresses (multiple NPIs, same location = shell companies)
  • Open Payments connections (ownership + high referral volume)
  • Isolated clusters (tight networks disconnected from mainstream)

Data: DAC facility affiliations, Open Payments ownership interests, shared billing addresses
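Two of the network red flags above, circular referrals and shared addresses, map onto standard graph and grouping operations. The sketch below assumes a referral edge list of (source NPI, destination NPI) pairs has already been inferred; how those edges are derived from affiliations and shared patients is not specified here, and the address normalization is deliberately crude.

```python
from collections import defaultdict


def find_referral_cycles(edges, max_len=4):
    """Enumerate simple directed cycles of length 2..max_len.
    Each cycle is reported once, starting from its smallest node."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)

    cycles = []

    def walk(start, node, path):
        for nxt in sorted(graph.get(node, ())):
            if nxt == start and len(path) >= 2:
                cycles.append(tuple(path))  # closed the loop back to start
            elif nxt > start and nxt not in path and len(path) < max_len:
                walk(start, nxt, path + [nxt])

    for start in sorted(graph):
        walk(start, start, [start])
    return cycles


def shared_address_groups(npi_addresses, min_npis=3):
    """Group NPIs by a normalized address string and keep crowded addresses."""
    by_addr = defaultdict(set)
    for npi, addr in npi_addresses:
        key = " ".join(addr.upper().replace(",", " ").split())
        by_addr[key].add(npi)
    return {a: sorted(n) for a, n in by_addr.items() if len(n) >= min_npis}
```

For example, find_referral_cycles([("A", "B"), ("B", "C"), ("C", "A")]) yields the single cycle ("A", "B", "C"). Group practices and hospitals legitimately share addresses, so the address grouping is best read together with the other signals.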

📈

Time Series Anomalies

Sudden changes in billing patterns

Analyze year-over-year billing trends to detect sudden spikes, new procedure codes, or pre-retirement fraud patterns.

Red Flags:

  • Billing spike (3x normal volume in one month)
  • New specialty codes (provider suddenly billing unfamiliar procedures)
  • COVID testing scams (massive 2020-2021 spike, then disappears)
  • Pre-retirement cash-out (final year billing surge before leaving)
  • Post-disaster fraud (opportunistic schemes after events such as Hurricane Harvey or COVID)

Data: Medicare utilization 2013-2024 trend data, year-over-year comparisons
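The "3x normal volume" rule above can be expressed as a comparison against a trailing baseline. This sketch takes one provider's chronological (period, amount) pairs and uses the median of the preceding periods as "normal"; the window length and the use of a median are assumptions made for the example.

```python
from statistics import median


def billing_spikes(series, factor=3.0, window=3):
    """Return (period, multiple_of_baseline) for periods where billing is at
    least `factor` times the median of the preceding `window` periods."""
    spikes = []
    for i in range(window, len(series)):
        baseline = median(amount for _, amount in series[i - window:i])
        period, amount = series[i]
        if baseline > 0 and amount >= factor * baseline:
            spikes.append((period, amount / baseline))
    return spikes
```

Because the first few periods have no baseline, this check cannot flag a provider with no prior billing history; that pattern would need a separate rule.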

Example Patterns We're Looking For

These are real fraud scheme types documented in DOJ cases. Our analysis looks for similar patterns in public data:

The Pill Mill

High Priority

Pattern: Provider prescribing 10x peer opioid rate, high cash-pay patients, minimal documentation

Detection: Prescribing outlier analysis + geographic clustering

Real example: Appalachian "pain clinics" (2015-2017 opioid crisis epicenter)

DME Kickback Ring

Medium Priority

Pattern: Circular referrals (doc → DME supplier → doc's office), shared addresses, Open Payments ties

Detection: Network analysis + Open Payments cross-reference

Real example: South Florida DME fraud rings ($1.2B scheme, 2019)

COVID Testing Scam

High Priority

Pattern: Massive billing spike (March-Dec 2020), then disappears; no prior billing history; residential addresses

Detection: Time series anomaly + geographic analysis

Real example: Fake testing sites, excessive billing for asymptomatic patients

Pre-Retirement Cash-Out

Watch List

Pattern: Provider billing 3x normal volume in final year before retirement, unfamiliar procedure codes, age 65+

Detection: Time series analysis + age correlation

Note: Not always fraud (some physicians take on more work before retiring), but worth investigating

Data Sources & Validation

All analysis uses CMS public use files: no PHI, no restricted data. Findings are validated against known fraud cases and exclusion lists.

Medicare Part B Utilization

1.2M providers, total services, payments, beneficiaries by provider

Primary Source

Part D Prescriber Data

1.4M prescribers, 26M drug claims, opioid flags, brand vs. generic rates

Primary Source

Open Payments (Sunshine Act)

15.7M industry payments, ownership interests, conflict of interest detection

Primary Source

DAC Facility Affiliations

1.6M provider-hospital relationships for referral network construction

Primary Source

OIG LEIE Exclusion List

Known bad actors (excluded individuals and entities, including convicted fraudsters) for validation cross-check

Validation

DOJ Press Releases

Announced fraud cases (2018-2024) to validate detection techniques

Validation
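One way to read "validated against exclusion lists" is as a precision and recall check of flagged providers against excluded ones. The sketch below assumes both lists can be matched on NPI and that all_npis is the set of providers that were scored; the function name and the returned fields are illustrative only.

```python
def validate_flags(flagged_npis, excluded_npis, all_npis):
    """Compare flagged providers with an exclusion list, restricted to the
    providers that were actually scored."""
    universe = set(all_npis)
    flagged = set(flagged_npis) & universe
    excluded = set(excluded_npis) & universe
    hits = flagged & excluded

    precision = len(hits) / len(flagged) if flagged else 0.0
    recall = len(hits) / len(excluded) if excluded else 0.0
    # Lift: how much more often a flagged provider is excluded than a random one.
    base_rate = len(excluded) / len(universe) if universe else 0.0
    lift = precision / base_rate if base_rate else 0.0
    return {"hits": sorted(hits), "precision": precision, "recall": recall, "lift": lift}
```

Low precision against an exclusion list is expected, since most flagged providers will never have been excluded; lift over the base rate is often the more informative number.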

Project Status & Next Steps

Methodology Design

Complete

Statistical Analysis

In Progress

Running outlier detection, z-score calculations for 1.2M providers

Network Analysis

Planned Q2 2026

Case Studies & Writeup

Target Q2-Q3 2026

Interactive Tools

Target Q3 2026

Provider outlier explorer, geographic heatmap, network visualizer

Important Disclaimers

  • Statistical anomalies ≠ proof of fraud. Legitimate outliers exist (rural providers, specialized practices).
  • These are patterns worth investigating, not accusations. Establishing fraud requires a formal investigation.
  • False positives are expected. We aim to minimize them, but they're unavoidable with statistical methods.
  • Data is 1-2 years old. Some flagged providers may already be under investigation or convicted.
  • This is research, not law enforcement. Findings are for educational and policy purposes.

Want Fraud Detection for Your Claims Data?

This methodology can be applied to any claims dataset: commercial insurance, Medicaid, or health system internal data. If you need fraud detection, waste identification, or anomaly monitoring, we can build custom systems tailored to your data.

Contact Us