Healthcare Data Projects

Real analysis on real data. Showcasing what's possible when you combine domain expertise, data engineering, and modern analytics on 100M+ Medicare records.

Featured Projects

Each project demonstrates a different analytical approach and business application using CMS public data.

Healthcare Cost Analysis

Research Phase

The Question: What does healthcare actually cost to deliver in the United States?

The United States spends about $4.5 trillion annually on healthcare, but how much of that is direct patient care versus administrative overhead, insurance complexity, and systemic waste? This multi-method research project uses hospital cost reports, Medicare utilization data, and national spending tables to triangulate the true cost of delivery.

Four Calculation Approaches:

  • Provider capacity model (physician count × volume × cost/encounter)
  • Facility cost model (hospital beds × occupancy × cost/bed-day)
  • Claims bottom-up (Medicare → age-adjust → extrapolate)
  • Top-down national accounting ($4.5T → decompose → net out waste)
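The first two models are plain products of three inputs, so they can be sketched in a few lines. The snippet below is a minimal illustration, not the project's actual code: the class and function names are hypothetical, and multiplying by 365 to annualize the bed-day figure is an assumption about how the facility model would be applied.

```python
from dataclasses import dataclass


@dataclass
class ProviderCapacityInputs:
    physician_count: float           # number of practicing physicians
    encounters_per_physician: float  # annual encounter volume per physician
    cost_per_encounter: float        # dollars per encounter


@dataclass
class FacilityCostInputs:
    hospital_beds: float     # number of hospital beds
    occupancy_rate: float    # fraction of beds occupied, between 0 and 1
    cost_per_bed_day: float  # dollars per occupied bed-day


def provider_capacity_cost(x: ProviderCapacityInputs) -> float:
    """Physician count x volume x cost/encounter."""
    return x.physician_count * x.encounters_per_physician * x.cost_per_encounter


def facility_cost(x: FacilityCostInputs, days_per_year: int = 365) -> float:
    """Hospital beds x occupancy x cost/bed-day, annualized (assumption)."""
    if not 0.0 <= x.occupancy_rate <= 1.0:
        raise ValueError("occupancy_rate must be between 0 and 1")
    occupied_bed_days = x.hospital_beds * x.occupancy_rate * days_per_year
    return occupied_bed_days * x.cost_per_bed_day
```

Real inputs would come from the hospital cost reports and Medicare utilization data named above. Keeping each model as a separate function makes it easy to compare the estimates side by side once all four exist.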

Status: Research design complete, data acquisition in progress, pilot calculations starting Q2 2026

View Project Details

Fraud Detection & Anomaly Analysis

Active

The Problem: An estimated $60-100 billion is lost to healthcare fraud annually. Can we find it in public data?

Medicare fraud costs taxpayers billions, but detecting it requires finding needles in haystacks of legitimate billing. This project applies statistical, geographic, and network-based techniques to identify billing outliers, prescribing anomalies, and suspicious provider patterns.

Five Detection Approaches:

  • Statistical outlier detection (billing 3+ standard deviations above peers)
  • Prescribing pattern analysis (pill mills, opioid red flags)
  • Geographic clustering (fraud hotspots by county/zip)
  • Network analysis (referral rings, kickback schemes)
  • Time series anomalies (COVID scams, pre-retirement fraud)
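The first approach can be sketched as a peer-group z-score. This is a minimal illustration under stated assumptions, not the project's pipeline: the function name and the `(provider_id, peer_group, billed_amount)` record layout are hypothetical, and using the population standard deviation within each peer group is one choice among several.

```python
from collections import defaultdict
from statistics import mean, pstdev


def flag_billing_outliers(records, threshold=3.0):
    """Return (provider_id, z_score) pairs for providers whose billing is
    at least `threshold` standard deviations above their peer-group mean.

    `records` is an iterable of (provider_id, peer_group, billed_amount).
    """
    by_peer_group = defaultdict(list)
    for provider_id, peer_group, billed in records:
        by_peer_group[peer_group].append((provider_id, billed))

    flagged = []
    for members in by_peer_group.values():
        amounts = [billed for _, billed in members]
        if len(amounts) < 2:
            continue  # no peers to compare against
        mu = mean(amounts)
        sigma = pstdev(amounts)
        if sigma == 0:
            continue  # every peer billed the same amount
        for provider_id, billed in members:
            z = (billed - mu) / sigma
            if z >= threshold:
                flagged.append((provider_id, z))
    return flagged
```

A flag here is a lead for review, not evidence of fraud. Very small peer groups cannot reach a z-score of 3 at all, and a single extreme biller inflates the group's standard deviation, so a median-based variant may be worth considering in practice.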

Status: Methodology complete, initial outlier analysis underway, case studies targeted for Q2 2026

View Project Details

What These Projects Showcase

Demonstrating end-to-end capabilities in healthcare data intelligence

🎯 Domain Expertise

Deep understanding of healthcare business models, provider networks, claims data structures, and regulatory frameworks (Medicare, CMS, HIPAA, Stark Law).

📊 Data Engineering

Building production data pipelines: acquiring, cleaning, transforming, and serving 100M+ records. The full stack: DuckDB, FastAPI, nginx, Docker.

🔬 Analytical Rigor

Multiple validation methods, transparent assumptions, peer-reviewable methodology. Academic-quality analysis, business-friendly presentation.

💼 Business Value

Every analysis answers a real business question: How can we reduce costs? Where's the fraud? What's the ROI? Data for decisions, not just dashboards.

Want Custom Analysis for Your Organization?

These projects are proofs of concept for what's possible with healthcare data. If you need similar analysis for your health system, claims data, or provider network, we can build it.

Contact Us

Coming Soon

Projects in the pipeline for 2026

📍 Provider Network Intelligence

Mapping referral patterns, identifying key opinion leaders, and quantifying provider influence within regional healthcare networks.

Q2 2026

💊 Prescribing Pattern Analysis

Deep-dive into Part D prescriber data: brand vs. generic preferences, specialty drug adoption patterns, and physician-pharma relationships.

Q3 2026

🏥 Hospital Market Analysis

Competitive dynamics in major metro markets: bed capacity, service lines, patient migration, and market share shifts over time.

Q3 2026

⚖️ Price Transparency Analysis

What hospital price transparency data reveals about negotiated rates, price variation, and the true cost of common procedures.

Q4 2026