Fraud Detection & Anomaly Analysis
Finding the needles in the haystack: Statistical, geographic, and network-based techniques to identify billing outliers, prescribing anomalies, and suspicious provider patterns in 100M+ Medicare records.
The Problem
Healthcare fraud costs taxpayers tens of billions annually, but detecting it is like finding needles in haystacks. Traditional approaches rely on tips, audits, and post-payment review: reactive, slow, and expensive.
The opportunity: CMS publishes detailed utilization data on every Medicare provider. While individual claims are protected, aggregated provider-level statistics reveal patterns. Statistical outliers, geographic anomalies, and suspicious networks can be detected using public data alone.
Common Fraud Schemes:
- Pill mills: Prescribing opioids without medical justification
- DME fraud: Billing for equipment never delivered
- Upcoding: Billing for more expensive services than provided
- Phantom billing: Billing for services never rendered
- Kickback schemes: Illegal referral arrangements between providers
- COVID testing scams: Excessive or fake testing during the pandemic
Five Detection Approaches
Multiple analytical techniques, each revealing different fraud patterns
Statistical Outlier Detection
Providers billing significantly above peers
Calculate z-scores for key metrics (total payment, services per beneficiary, average charge) and flag providers more than 3 standard deviations above their specialty/geography peers.
Red Flags:
- Top 1% Medicare payment by specialty (potential upcoding)
- Services per beneficiary 3+ std dev above mean (volume fraud)
- High beneficiary churn rate (doctor shopping enabler)
- Unusual service mix (billing procedures outside specialty)
Data: Medicare Part B utilization by provider, specialty benchmarks
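The peer-group z-score check described above can be sketched as follows, using only the Python standard library. The record fields (`npi`, `specialty`, `state`), the choice of state as the geography unit, and the `min_peers` floor are illustrative assumptions, not the project's actual schema.

```python
from collections import defaultdict
from statistics import mean, pstdev

def flag_outliers(providers, metric, threshold=3.0, min_peers=30):
    """Return (npi, z) pairs whose metric is > threshold std devs above the peer mean."""
    # Group providers into peer sets by (specialty, state); hypothetical field names.
    groups = defaultdict(list)
    for p in providers:
        groups[(p["specialty"], p["state"])].append(p)

    flagged = []
    for peers in groups.values():
        if len(peers) < min_peers:
            continue  # too few peers for a meaningful benchmark
        values = [p[metric] for p in peers]
        mu, sigma = mean(values), pstdev(values)
        if sigma == 0:
            continue  # every peer has the same value, so nothing stands out
        for p in peers:
            z = (p[metric] - mu) / sigma
            if z > threshold:
                flagged.append((p["npi"], round(z, 2)))
    return flagged
```

One design note: with the population standard deviation, no value can sit more than sqrt(n - 1) deviations from the mean of n peers, so a group of 10 or fewer can never exceed a 3-sigma threshold. That is why small peer groups are skipped here rather than scored.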
Prescribing Pattern Analysis
Opioid mills and kickback indicators
Analyze Part D prescriber data to identify dangerous or suspicious prescribing behavior: excessive opioids, brand-name preference without justification, or single-drug specialists.
Red Flags:
- Opioid prescribing rate > 95th percentile (pill mill indicator)
- Brand-name preference > 90% (kickback suspicion)
- Single-drug specialist (99% claims for one medication)
- Contraindicated drug combinations (dangerous or fraudulent)
- High-dose oxycodone without cancer diagnosis
Data: Part D prescriber by drug, opioid flags, Open Payments pharma relationships
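As a rough illustration, the first three red flags above could be computed like this. It is a sketch under stated assumptions: the record fields (`total_claims`, `opioid_claims`, `brand_claims`, `claims_by_drug`) are hypothetical names, and the thresholds simply mirror the percentages listed above.

```python
from statistics import quantiles

def prescribing_flags(prescribers, opioid_pct=95, brand_cutoff=0.90,
                      single_drug_cutoff=0.99):
    """Map NPI -> list of red-flag labels for prescriber-level records."""
    active = [p for p in prescribers if p["total_claims"] > 0]
    if len(active) < 2:
        return {}  # quantiles() needs at least two data points

    opioid_rates = [p["opioid_claims"] / p["total_claims"] for p in active]
    # quantiles(n=100) returns 99 cut points; index opioid_pct - 1 is that percentile.
    opioid_cut = quantiles(opioid_rates, n=100, method="inclusive")[opioid_pct - 1]

    flags = {}
    for p, rate in zip(active, opioid_rates):
        found = []
        if rate > opioid_cut:
            found.append("opioid_rate_above_percentile")
        if p["brand_claims"] / p["total_claims"] > brand_cutoff:
            found.append("brand_preference")
        top_drug = max(p["claims_by_drug"].values(), default=0)
        if top_drug / p["total_claims"] >= single_drug_cutoff:
            found.append("single_drug")
        if found:
            flags[p["npi"]] = found
    return flags
```

In practice the percentile would probably be computed within a specialty rather than across all prescribers, since pain management and oncology practices legitimately prescribe more opioids than most peers.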
Geographic Clustering
Fraud hotspots by county/zip code
Map provider density, claim volume, and payment by geography. Identify zip codes with unusual patterns: too many providers for population, sudden spikes, or known fraud regions.
Red Flags:
- Provider density >> population (fraud factories)
- Claim volume spikes (sudden increase in one area)
- Known hotspots (South Florida, Southern California DME clusters)
- Billing from residential addresses (phantom clinics)
- Interstate patient flows (long-distance fraud rings)
Visualization: Heatmap of providers per capita, claim volume timeline by county
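The provider-density red flag can be sketched as a per-capita comparison against the typical ZIP code. The inputs (a list of provider ZIP codes and a population lookup) and the `ratio=5.0` cutoff are assumptions for illustration; the text above does not fix a specific threshold for "too many providers".

```python
from collections import Counter
from statistics import median

def density_hotspots(provider_zips, population_by_zip, ratio=5.0, min_population=1000):
    """Return {zip: providers per 10k residents} for ZIPs far above the median density."""
    counts = Counter(provider_zips)
    # Skip very small populations, where per-capita rates are unstable.
    per_10k = {
        z: counts[z] * 10_000 / pop
        for z, pop in population_by_zip.items()
        if pop >= min_population
    }
    if not per_10k:
        return {}
    baseline = median(per_10k.values())
    if baseline <= 0:
        return {}
    return {z: round(d, 1) for z, d in per_10k.items() if d > ratio * baseline}
```

The median is used as the baseline so that a handful of extreme ZIP codes do not raise the bar they are measured against.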
Network Analysis
Referral rings and kickback schemes
Build provider referral networks from facility affiliations and shared patients. Detect circular referrals, exclusive relationships, and shell company patterns.
Red Flags:
- Circular referrals (A → B → C → A)
- Exclusive relationships (doc only refers to one facility, owns equity)
- Shared addresses (multiple NPIs, same location = shell companies)
- Open Payments connections (ownership + high referral volume)
- Isolated clusters (tight networks disconnected from mainstream)
Data: DAC facility affiliations, Open Payments ownership interests, shared billing addresses
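Two of the network red flags above, circular referrals and shared addresses, can be sketched with a plain adjacency map and a depth-limited search. The edge list of `(referrer, recipient)` pairs and the `(npi, address)` pairs are hypothetical inputs; how referral edges would actually be inferred from affiliations and shared patients is left open here.

```python
from collections import defaultdict

def find_referral_cycles(edges, min_len=3, max_len=4):
    """Return directed referral cycles with min_len..max_len distinct providers."""
    graph = defaultdict(set)
    for src, dst in edges:
        graph[src].add(dst)

    cycles = set()

    def walk(start, node, path):
        for nxt in graph.get(node, ()):
            if nxt == start and len(path) >= min_len:
                cycles.add(tuple(path))  # path closes back on its starting provider
            elif nxt > start and nxt not in path and len(path) < max_len:
                # Only visit IDs greater than start, so each cycle is
                # reported once, beginning at its smallest ID.
                walk(start, nxt, path + [nxt])

    for start in list(graph):
        walk(start, start, [start])
    return sorted(cycles)

def shared_addresses(npi_addresses, min_npis=3):
    """Return {normalized address: sorted NPIs} where several NPIs share one address."""
    by_addr = defaultdict(set)
    for npi, addr in npi_addresses:
        by_addr[" ".join(addr.upper().split())].add(npi)
    return {a: sorted(n) for a, n in by_addr.items() if len(n) >= min_npis}
```

Shared addresses are only a weak signal on their own, since large medical office buildings and group practices legitimately house many NPIs.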
Time Series Anomalies
Sudden changes in billing patterns
Analyze year-over-year billing trends to detect sudden spikes, new procedure codes, or pre-retirement fraud patterns.
Red Flags:
- Billing spike (3x normal volume in one month)
- New specialty codes (provider suddenly billing unfamiliar procedures)
- COVID testing scams (massive 2020-2021 spike, then disappears)
- Pre-retirement cash-out (final year billing surge before leaving)
- Post-disaster fraud (opportunistic schemes after events such as Hurricane Harvey or COVID)
Data: Medicare utilization 2013-2024 trend data, year-over-year comparisons
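A minimal sketch of the billing-spike check, comparing each period with the median of the provider's own earlier periods. The `(period, amount)` input shape and the `min_history` requirement are assumptions; the 3x ratio follows the red flag listed above.

```python
from statistics import median

def billing_spikes(series, ratio=3.0, min_history=3):
    """series: chronological list of (period, amount). Return (period, multiple) spikes."""
    spikes = []
    for i in range(min_history, len(series)):
        # Baseline is the median of all earlier periods for this provider.
        baseline = median(amount for _, amount in series[:i])
        period, amount = series[i]
        if baseline > 0 and amount >= ratio * baseline:
            spikes.append((period, round(amount / baseline, 2)))
    return spikes
```

For example, `billing_spikes([(2017, 100), (2018, 110), (2019, 105), (2020, 400), (2021, 120)])` returns `[(2020, 3.81)]`. Providers with fewer than `min_history` prior periods are never scored by this function, so the "no prior billing history" pattern would need a separate check.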
Example Patterns We're Looking For
These are real fraud scheme types documented in DOJ cases. Our analysis aims to identify similar patterns in public data:
The Pill Mill
High Priority
Pattern: Provider prescribing 10x peer opioid rate, high cash-pay patients, minimal documentation
Detection: Prescribing outlier analysis + geographic clustering
Real example: Appalachian "pain clinics" (2015-2017 opioid crisis epicenter)
DME Kickback Ring
Medium Priority
Pattern: Circular referrals (doc → DME supplier → doc's office), shared addresses, Open Payments ties
Detection: Network analysis + Open Payments cross-reference
Real example: South Florida DME fraud rings ($1.2B scheme, 2019)
COVID Testing Scam
High Priority
Pattern: Massive billing spike (March-Dec 2020), then disappears; no prior billing history; residential addresses
Detection: Time series anomaly + geographic analysis
Real example: Fake testing sites, excessive billing for asymptomatic patients
Pre-Retirement Cash-Out
Watch List
Pattern: Provider billing 3x normal volume in final year before retirement, unfamiliar procedure codes, age 65+
Detection: Time series analysis + age correlation
Note: Not always fraud (some retiring doctors simply take on more work), but worth investigating
Data Sources & Validation
All analysis uses CMS public use files: no PHI, no restricted data. Findings are validated against known fraud cases and exclusion lists.
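The validation step can be sketched as a set comparison between flagged providers and a list of known excluded providers. This assumes both sides can be keyed by NPI, which may not hold for every exclusion record; the function and field names are hypothetical.

```python
def validate_flags(flagged_npis, excluded_npis):
    """Compare flagged NPIs with known exclusions and summarize the overlap."""
    flagged, excluded = set(flagged_npis), set(excluded_npis)
    hits = flagged & excluded
    return {
        "hits": sorted(hits),
        # Share of flags that match a known exclusion.
        "precision": len(hits) / len(flagged) if flagged else 0.0,
        # Share of known exclusions that the flags recovered.
        "recall": len(hits) / len(excluded) if excluded else 0.0,
    }
```

A low precision figure here does not by itself mean the flags are wrong, because an exclusion list only covers providers who have already been caught.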
- Medicare Part B Utilization [Primary Source]: 1.2M providers, total services, payments, beneficiaries by provider
- Part D Prescriber Data [Primary Source]: 1.4M prescribers, 26M drug claims, opioid flags, brand vs. generic rates
- Open Payments (Sunshine Act) [Primary Source]: 15.7M industry payments, ownership interests, conflict of interest detection
- DAC Facility Affiliations [Primary Source]: 1.6M provider-hospital relationships for referral network construction
- OIG LEIE Exclusion List [Validation]: Known bad actors (convicted fraudsters) for validation cross-check
- DOJ Press Releases [Validation]: Announced fraud cases (2018-2024) to validate detection techniques
Project Status & Next Steps
- Methodology Design [Complete]
- Statistical Analysis [In Progress]: Running outlier detection, z-score calculations for 1.2M providers
- Network Analysis [Planned Q2 2026]
- Case Studies & Writeup [Target Q2-Q3 2026]
- Interactive Tools [Target Q3 2026]: Provider outlier explorer, geographic heatmap, network visualizer
Important Disclaimers
- Statistical anomalies ≠ proof of fraud. Legitimate outliers exist (rural providers, specialized practices).
- These are patterns worth investigating, not accusations. Establishing fraud requires a formal investigation.
- False positives are expected. We aim to minimize them, but they're unavoidable with statistical methods.
- Data is 1-2 years old. Some flagged providers may already be under investigation or convicted.
- This is research, not law enforcement. Findings are for educational and policy purposes.
Want Fraud Detection for Your Claims Data?
This methodology can be applied to any claims dataset: commercial insurance, Medicaid, health system internal data. If you need fraud detection, waste identification, or anomaly monitoring, we can build custom systems tailored to your data.