---
title: "How Much Do Hospital Charges Vary for Similar Inpatient Services?"
subtitle: "An Analytical Brief Using 2023 CMS Inpatient Pricing Data"
description: "A state, regional, and service-level look at how 2023 CMS inpatient hospital charges compare with Medicare payments for similar services."
author: "Markuss Saule"
date: today
date-format: "MMMM D, YYYY"
categories: ["Healthcare Pricing", "Medicare", "Data Visualization"]
title-block-banner: "#0b3954"
format:
  html:
    theme:
      - litera
      - hospital_pricing_report.scss
    embed-resources: true
    toc: true
    toc-depth: 2
    toc-location: body
    toc-title: "Sections"
    toc-expand: 2
    number-sections: true
    df-print: paged
    code-fold: true
    code-summary: "Show analysis code"
    code-overflow: wrap
    code-tools:
      source: true
      toggle: true
      caption: "Code"
    page-layout: article
    smooth-scroll: true
    anchor-sections: true
    fig-width: 9
    fig-height: 5.5
execute:
  echo: true
  warning: false
  message: false
---
```{r setup}
library(tidyverse)
library(janitor)
library(sf)
library(scales)
library(knitr)
library(ggrepel)
options(
dplyr.summarise.inform = FALSE,
scipen = 999
)
theme_set(
theme_minimal(base_size = 12) +
theme(
plot.title.position = "plot",
plot.title = element_text(face = "bold", size = 15, color = "#102a43"),
plot.subtitle = element_text(size = 10.5, color = "#334e68"),
plot.caption = element_text(size = 9, color = "#52606d"),
axis.title = element_text(face = "bold", color = "#102a43"),
axis.text = element_text(color = "#243b53"),
legend.position = "top",
legend.title = element_text(face = "bold"),
panel.grid.minor = element_blank(),
panel.grid.major = element_line(color = "#d9e2ec")
)
)
region_colors <- c(
"West" = "#0f766e",
"Northeast" = "#a44a3f",
"South" = "#33658a",
"Midwest" = "#6a4c93"
)
amount_colors <- c(
"Submitted charge" = "#0b4f71",
"Medicare payment" = "#d17a22"
)
```
# Introduction
Hospital pricing is one of those issues that can stay abstract until the numbers are laid out side by side. Two hospitals can treat patients in the same broad clinical category and still report very different charges. This report uses CMS inpatient pricing data to ask a more specific version of that question:
**How much do hospital charges for similar inpatient services vary across hospitals and regions in the United States?**
The important detail is that this dataset is about **submitted covered charges** and Medicare-related payments, not negotiated private insurance prices and not a patient's final bill. That distinction matters. Even so, it is a strong dataset for testing whether hospitals report materially different charges for the same inpatient service category.
The analysis is built to go beyond a simple average-by-state summary. The goal is to examine variation at more than one level: across DRGs, across regions, and across states. That makes it easier to separate broad geographic patterns from hospital pricing behavior that remains uneven even after the service category is held constant.
# Data Import And Setup
The primary file is the local CMS inpatient pricing table, and I also pull a **remote geographic reference dataset** from the U.S. Census Bureau so the state-level map is tied to an external, documented boundary source.
```{r import-data}
# Main CMS inpatient pricing file
prices_raw <- read_csv(
"Medicare_IP_Hospitals_by_Provider_and_Service_2023.csv",
show_col_types = FALSE
) %>%
clean_names()
# Pull state boundaries from the Census site once, then cache them locally
census_cache <- "census_states_2023.rds"
if (file.exists(census_cache)) {
states_sf <- readRDS(census_cache)
} else {
census_states_url <- "https://www2.census.gov/geo/tiger/GENZ2023/shp/cb_2023_us_state_20m.zip"
census_zip <- tempfile(fileext = ".zip")
census_dir <- tempfile()
dir.create(census_dir)
download.file(census_states_url, census_zip, mode = "wb", quiet = TRUE)
unzip(census_zip, exdir = census_dir)
states_sf <- st_read(
list.files(census_dir, pattern = "\\.shp$", full.names = TRUE)[1],
quiet = TRUE
) %>%
clean_names() %>%
select(stusps, name, geometry)
saveRDS(states_sf, census_cache)
}
```
```{r initial-inspection}
dataset_overview <- tibble(
measure = c(
"Hospital-service rows",
"Unique hospitals",
"Unique DRGs",
"States and DC",
"Missing submitted charge rows",
"Missing Medicare payment rows"
),
value = c(
comma(nrow(prices_raw)),
comma(n_distinct(prices_raw$rndrng_prvdr_ccn)),
comma(n_distinct(prices_raw$drg_cd)),
n_distinct(prices_raw$rndrng_prvdr_state_abrvtn),
sum(is.na(prices_raw$avg_submtd_cvrd_chrg)),
sum(is.na(prices_raw$avg_mdcr_pymt_amt))
)
)
kable(dataset_overview, col.names = c("Dataset check", "Value"))
```
This file contains `r comma(nrow(prices_raw))` hospital-service rows across `r n_distinct(prices_raw$drg_cd)` DRGs. It also covers `r n_distinct(prices_raw$rndrng_prvdr_state_abrvtn)` jurisdictions, which gives the analysis enough geographic breadth to support both a regional comparison and a meaningful state map.
At a practical level, that makes it possible to study the same issue three ways at once: across services, across regions, and across states. That breadth matters because pricing inconsistency can look very different depending on which level of the system is being evaluated.
# Building The Comparison
The raw file is not yet an apples-to-apples comparison. Each row is one hospital reporting one DRG in 2023, so I trim the file to the pricing and geography fields the analysis needs, join in Census regions, and remove the small number of rows with missing price values.
```{r wrangle-data}
state_lookup <- tibble(
state_abbr = state.abb,
state_name = state.name,
census_region = state.region,
census_division = state.division
) %>%
bind_rows(
tibble(
state_abbr = "DC",
state_name = "District of Columbia",
census_region = "South",
census_division = "South Atlantic"
)
) %>%
mutate(census_region = recode(census_region, "North Central" = "Midwest"))
prices <- prices_raw %>%
select(
rndrng_prvdr_ccn,
rndrng_prvdr_org_name,
rndrng_prvdr_city,
rndrng_prvdr_state_abrvtn,
rndrng_prvdr_ruca_desc,
drg_cd,
drg_desc,
tot_dschrgs,
avg_submtd_cvrd_chrg,
avg_tot_pymt_amt,
avg_mdcr_pymt_amt
) %>%
filter(
!is.na(avg_submtd_cvrd_chrg),
!is.na(avg_mdcr_pymt_amt),
!is.na(drg_desc)
) %>%
mutate(
drg_desc = str_squish(drg_desc),
charge_to_medicare_ratio = avg_submtd_cvrd_chrg / avg_mdcr_pymt_amt
) %>%
left_join(
state_lookup,
by = c("rndrng_prvdr_state_abrvtn" = "state_abbr")
) %>%
arrange(desc(avg_submtd_cvrd_chrg))
```
With that cleaned table in place, the comparison is built in layers. First, the analysis measures how much charges spread within each DRG. Then it narrows to one high-volume DRG for a deeper look so the regional and state comparisons stay closer to apples-to-apples.
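As a minimal sketch of the first layer, the following computes a 90th-to-10th percentile charge ratio per DRG. The two DRG codes and all dollar values are invented for illustration and are not CMS figures; only the column names mirror the cleaned table.

```r
library(dplyr)

# Made-up charges for two hypothetical DRGs; these are not CMS values
toy_charges <- tibble(
  drg_cd = rep(c("A", "B"), each = 5),
  avg_submtd_cvrd_chrg = c(
    10000, 20000, 30000, 40000, 50000,  # DRG "A": widely spread charges
    30000, 32000, 34000, 36000, 38000   # DRG "B": tightly clustered charges
  )
)

# Percentile spread within each DRG, widest spread first
toy_spread <- toy_charges %>%
  group_by(drg_cd) %>%
  summarize(
    p10_charge = quantile(avg_submtd_cvrd_chrg, 0.1, names = FALSE),
    p90_charge = quantile(avg_submtd_cvrd_chrg, 0.9, names = FALSE),
    charge_ratio_90_10 = p90_charge / p10_charge,
    .groups = "drop"
  ) %>%
  arrange(desc(charge_ratio_90_10))

print(toy_spread)
```

In this toy data, DRG "A" comes out with a ratio of roughly 3.3 and DRG "B" with roughly 1.2, so "A" ranks first even though both DRGs have similar typical charges.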
# Findings
```{r analytical-objects}
service_spread <- prices %>%
group_by(drg_cd, drg_desc) %>%
summarize(
hospitals = n(),
median_charge = median(avg_submtd_cvrd_chrg, na.rm = TRUE),
median_payment = median(avg_mdcr_pymt_amt, na.rm = TRUE),
p90_charge = quantile(avg_submtd_cvrd_chrg, 0.9, na.rm = TRUE),
p10_charge = quantile(avg_submtd_cvrd_chrg, 0.1, na.rm = TRUE),
charge_ratio_90_10 = p90_charge / p10_charge,
charge_to_payment_ratio = median_charge / median_payment,
.groups = "drop"
) %>%
filter(hospitals >= 1000) %>%
arrange(desc(charge_ratio_90_10))
focus_service <- service_spread %>%
slice_head(n = 1)
focus_code <- focus_service$drg_cd[[1]]
focus_label <- focus_service$drg_desc[[1]]
focus_prices <- prices %>%
filter(drg_cd == focus_code)
focus_benchmark <- focus_prices %>%
summarize(
hospitals = n(),
median_charge = median(avg_submtd_cvrd_chrg, na.rm = TRUE),
median_payment = median(avg_mdcr_pymt_amt, na.rm = TRUE),
p10_charge = quantile(avg_submtd_cvrd_chrg, 0.1, na.rm = TRUE),
p90_charge = quantile(avg_submtd_cvrd_chrg, 0.9, na.rm = TRUE),
share_over_100k = mean(avg_submtd_cvrd_chrg > 100000, na.rm = TRUE)
)
focus_region_summary <- focus_prices %>%
group_by(census_region) %>%
summarize(
hospitals = n(),
median_charge = median(avg_submtd_cvrd_chrg, na.rm = TRUE),
median_payment = median(avg_mdcr_pymt_amt, na.rm = TRUE),
charge_to_payment_ratio = median_charge / median_payment,
.groups = "drop"
) %>%
arrange(desc(median_charge))
focus_region_long <- focus_region_summary %>%
select(census_region, median_charge, median_payment) %>%
pivot_longer(
cols = c(median_charge, median_payment),
names_to = "amount_type",
values_to = "amount"
) %>%
mutate(
amount_type = recode(
amount_type,
median_charge = "Submitted charge",
median_payment = "Medicare payment"
)
)
state_focus_summary <- focus_prices %>%
group_by(rndrng_prvdr_state_abrvtn, state_name, census_region) %>%
summarize(
hospitals = n(),
median_charge = median(avg_submtd_cvrd_chrg, na.rm = TRUE),
median_payment = median(avg_mdcr_pymt_amt, na.rm = TRUE),
charge_to_payment_ratio = median_charge / median_payment,
.groups = "drop"
) %>%
filter(hospitals >= 10) %>%
arrange(desc(median_charge))
top_state <- state_focus_summary %>%
slice_max(median_charge, n = 1)
bottom_state <- state_focus_summary %>%
slice_min(median_charge, n = 1)
top_bottom_states <- bind_rows(
state_focus_summary %>%
slice_max(median_charge, n = 5) %>%
mutate(group = "Highest median charges"),
state_focus_summary %>%
slice_min(median_charge, n = 5) %>%
mutate(group = "Lowest median charges")
) %>%
mutate(group = factor(group, levels = c("Highest median charges", "Lowest median charges"))) %>%
arrange(group, desc(median_charge)) %>%
select(group, state_name, census_region, hospitals, median_charge, median_payment, charge_to_payment_ratio)
state_label_data <- bind_rows(
state_focus_summary %>% slice_max(median_charge, n = 3),
state_focus_summary %>% slice_min(median_charge, n = 3),
state_focus_summary %>% slice_max(charge_to_payment_ratio, n = 2),
state_focus_summary %>% slice_min(charge_to_payment_ratio, n = 2)
) %>%
distinct(rndrng_prvdr_state_abrvtn, .keep_all = TRUE)
rural_focus_summary <- focus_prices %>%
mutate(
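rural_group = case_when(
# Keep hospitals with no RUCA description out of the rural bucket
is.na(rndrng_prvdr_ruca_desc) ~ "Unknown",
str_detect(rndrng_prvdr_ruca_desc, "Metropolitan") ~ "Metropolitan",
str_detect(rndrng_prvdr_ruca_desc, "Micropolitan") ~ "Micropolitan",
TRUE ~ "Small town / rural"
)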
) %>%
group_by(rural_group) %>%
summarize(
hospitals = n(),
median_charge = median(avg_submtd_cvrd_chrg, na.rm = TRUE),
median_payment = median(avg_mdcr_pymt_amt, na.rm = TRUE),
charge_to_payment_ratio = median_charge / median_payment,
.groups = "drop"
) %>%
arrange(desc(median_charge))
charge_payment_correlation <- cor(
state_focus_summary$median_charge,
state_focus_summary$median_payment
)
common_markup_services <- service_spread %>%
arrange(desc(charge_to_payment_ratio)) %>%
slice_head(n = 6)
map_data <- states_sf %>%
filter(stusps %in% unique(prices$rndrng_prvdr_state_abrvtn)) %>%
left_join(
state_focus_summary %>%
mutate(median_charge = if_else(hospitals >= 10, median_charge, NA_real_)),
by = c("stusps" = "rndrng_prvdr_state_abrvtn")
)
map_data_contiguous <- map_data %>%
filter(!stusps %in% c("AK", "HI"))
```
The deeper analysis centers on **DRG `r focus_code`**, `r focus_label`, because it has the widest 90th-to-10th percentile charge spread among DRGs reported by at least 1,000 hospitals. That volume floor makes it a stronger focal example than a rare or thinly reported DRG.
## Executive Summary
::: {.callout-important appearance="simple"}
## Bottom Line
- Within the focus DRG, the 90th percentile hospital reports a submitted charge of about `r dollar(focus_benchmark$p90_charge)`, compared with about `r dollar(focus_benchmark$p10_charge)` at the 10th percentile.
- In the West, the median submitted charge is about `r number(focus_region_summary$charge_to_payment_ratio[focus_region_summary$census_region == "West"], accuracy = 0.1)` times the median Medicare payment for the same DRG.
- Metropolitan hospitals report much higher median charges than small-town and rural hospitals for this DRG, but Medicare payments stay relatively close across those settings.
- At the state level, the median submitted charge ranges from about `r dollar(bottom_state$median_charge[[1]])` in `r bottom_state$state_name[[1]]` to about `r dollar(top_state$median_charge[[1]])` in `r top_state$state_name[[1]]`.
:::
## The Same DRG Can Sit In Very Different Price Tiers
```{r plot-service-spread, fig.height=6.6}
service_spread %>%
slice_head(n = 10) %>%
mutate(drg_desc = fct_reorder(str_wrap(drg_desc, width = 44), median_charge)) %>%
ggplot(aes(y = drg_desc)) +
geom_segment(
aes(x = p10_charge, xend = p90_charge, yend = drg_desc),
color = "#bcccdc",
linewidth = 2.3
) +
geom_point(aes(x = p10_charge), color = "#d17a22", size = 3.2) +
geom_point(
aes(x = median_charge),
shape = 21,
size = 3.1,
stroke = 1.1,
fill = "white",
color = "#102a43"
) +
geom_point(aes(x = p90_charge), color = "#0b4f71", size = 3.6) +
geom_text(
aes(x = p90_charge, label = paste0(number(charge_ratio_90_10, accuracy = 0.1), "x")),
nudge_x = 9000,
size = 3.2,
color = "#243b53"
) +
scale_x_continuous(
labels = label_dollar(scale_cut = cut_short_scale()),
expand = expansion(mult = c(0.02, 0.2))
) +
labs(
title = "Common inpatient services occupy very different low-end and high-end price tiers",
subtitle = "Each line runs from the 10th percentile hospital charge to the 90th percentile hospital charge within a DRG. The white point marks the median.",
x = "Average submitted covered charge",
y = NULL,
caption = "Numbers at the right show the 90th-to-10th percentile ratio within each DRG."
)
```
This is the clearest sign that the variation is not a side issue. For the focus DRG, the 10th percentile hospital reports a submitted charge of about `r dollar(focus_benchmark$p10_charge)`, while the 90th percentile hospital reports about `r dollar(focus_benchmark$p90_charge)`. That is a spread of about `r dollar(focus_benchmark$p90_charge - focus_benchmark$p10_charge)` inside the same service category. About `r percent(focus_benchmark$share_over_100k, accuracy = 0.1)` of hospitals in this DRG report charges above \$100,000, so the upper tail is not just one or two extreme hospitals.
One extra pattern is worth noting here: the services with the widest hospital-to-hospital spread are not exactly the same as the services with the biggest charge-to-payment multiples. Cardiovascular admissions show up repeatedly when the comparison shifts from spread to markup.
```{r markup-services-table}
common_markup_services %>%
transmute(
drg_cd,
drg_desc,
hospitals,
median_charge = dollar(median_charge),
median_payment = dollar(median_payment),
charge_to_payment_ratio = number(charge_to_payment_ratio, accuracy = 0.01)
) %>%
kable(
format = "html",
align = c("l", "l", "r", "r", "r", "r"),
col.names = c(
"DRG",
"Service description",
"Hospitals",
"Median charge",
"Median Medicare payment",
"Charge / payment ratio"
)
)
```
## The Regional Story Is Really A Gap Story
```{r plot-region-gap, fig.height=5.4}
region_order <- rev(focus_region_summary$census_region)
ggplot(
focus_region_summary,
aes(y = factor(census_region, levels = region_order))
) +
geom_segment(
aes(
x = median_payment,
xend = median_charge,
yend = factor(census_region, levels = region_order)
),
color = "#bcccdc",
linewidth = 2.6
) +
geom_point(
data = focus_region_long %>%
mutate(census_region = factor(census_region, levels = region_order)),
aes(x = amount, color = amount_type),
size = 4
) +
geom_text(
aes(
x = median_charge,
label = paste0(number(charge_to_payment_ratio, accuracy = 0.1), "x")
),
nudge_x = 4000,
size = 3.2,
color = "#243b53"
) +
scale_color_manual(values = amount_colors) +
scale_x_continuous(
labels = label_dollar(scale_cut = cut_short_scale()),
expand = expansion(mult = c(0.02, 0.18))
) +
labs(
title = paste0("For DRG ", focus_code, ", regional charges sit far above regional Medicare payments"),
subtitle = "The marker gap shows how much more hospitals list as submitted charges than Medicare pays for the same DRG.",
x = "Regional median amount",
y = NULL,
color = NULL,
caption = paste0("Focus DRG: ", focus_label)
)
```
This is where the interpretation becomes more useful. The West has the highest median submitted charge at about `r dollar(focus_region_summary$median_charge[focus_region_summary$census_region == "West"])`, but its median Medicare payment is only about `r dollar(focus_region_summary$median_payment[focus_region_summary$census_region == "West"])`. That is a `r number(focus_region_summary$charge_to_payment_ratio[focus_region_summary$census_region == "West"], accuracy = 0.1)`-to-1 ratio. The Midwest is lower at about `r number(focus_region_summary$charge_to_payment_ratio[focus_region_summary$census_region == "Midwest"], accuracy = 0.1)` to 1, but the same basic separation remains.
That gap matters because Medicare is functioning here as a rough payment anchor. If reimbursement differences were doing most of the work, the regional charge medians would sit much closer together than they do. Instead, the payment side moves modestly while the charge side varies widely. That suggests hospitals set list prices with far more discretion than the reimbursement benchmark would imply.
A rurality check points in the same direction. Metropolitan hospitals in this DRG have a median submitted charge of about `r dollar(rural_focus_summary$median_charge[rural_focus_summary$rural_group == "Metropolitan"])`, compared with about `r dollar(rural_focus_summary$median_charge[rural_focus_summary$rural_group == "Small town / rural"])` in small-town and rural hospitals. Median Medicare payments, though, stay much tighter: about `r dollar(rural_focus_summary$median_payment[rural_focus_summary$rural_group == "Metropolitan"])` in metropolitan hospitals and `r dollar(rural_focus_summary$median_payment[rural_focus_summary$rural_group == "Small town / rural"])` in small-town and rural hospitals. One plausible reading is that scale, market power, and internal charge-master strategy are widening the listed-price gap more than reimbursement policy is.
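The regional and rurality multiples above divide one median by another. A short sketch with invented numbers shows how that can differ from the median of hospital-level ratios, which is what the row-level `charge_to_medicare_ratio` column from the wrangling step would give. The region name and all dollar values here are hypothetical.

```r
library(dplyr)

# Invented charges and payments for three hospitals in one hypothetical region
toy_region <- tibble(
  census_region = "Example",
  avg_submtd_cvrd_chrg = c(20000, 60000, 90000),
  avg_mdcr_pymt_amt = c(10000, 12000, 30000)
)

toy_ratios <- toy_region %>%
  # Hospital-level ratio, computed the same way as in the wrangling step
  mutate(charge_to_medicare_ratio = avg_submtd_cvrd_chrg / avg_mdcr_pymt_amt) %>%
  group_by(census_region) %>%
  summarize(
    # Approach used in this report: median charge over median payment
    ratio_of_medians = median(avg_submtd_cvrd_chrg) / median(avg_mdcr_pymt_amt),
    # Alternative: the typical hospital's own ratio
    median_of_ratios = median(charge_to_medicare_ratio),
    .groups = "drop"
  )

print(toy_ratios)
```

Here the ratio of medians is 5 while the median of ratios is 3, because the median charge and the median payment come from the same hospital only by coincidence. Neither measure is wrong, but they answer slightly different questions, so it helps to state which one a reported multiple uses.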
## State Medians Still Spread Out Sharply
```{r plot-map, fig.width=10.5, fig.height=7.2, out.width="100%", fig.align="center"}
ggplot(map_data_contiguous) +
geom_sf(aes(fill = median_charge), color = "white", linewidth = 0.25) +
scale_fill_gradientn(
colors = c("#e8f1f2", "#b9d6d9", "#5b8e98", "#0b3954"),
breaks = breaks_pretty(n = 5),
labels = label_dollar(scale_cut = cut_short_scale()),
na.value = "#f1f5f9",
name = "Median charge",
guide = guide_colorbar(
title.position = "top",
title.hjust = 0.5,
barheight = grid::unit(90, "pt"),
barwidth = grid::unit(10, "pt"),
ticks.colour = "#486581",
frame.colour = "#bcccdc"
)
) +
coord_sf(datum = NA) +
labs(
title = "Even within one DRG, state medians are far from uniform",
subtitle = "Contiguous U.S. view; states with fewer than 10 reporting hospitals for this DRG are left blank.",
caption = "Alaska and Hawaii are left off the map but stay in the numeric summaries when they meet the 10-hospital threshold. Boundaries: 2023 U.S. Census Bureau."
) +
theme_void() +
theme(
legend.position = "right",
legend.justification = "center",
legend.title = element_text(size = 10, face = "bold"),
legend.text = element_text(size = 9),
plot.title.position = "plot",
plot.caption.position = "plot",
plot.margin = margin(10, 10, 24, 10)
)
```
```{r top-bottom-table}
top_bottom_states %>%
mutate(
median_charge = dollar(median_charge),
median_payment = dollar(median_payment),
charge_to_payment_ratio = number(charge_to_payment_ratio, accuracy = 0.01)
) %>%
kable(
col.names = c(
"Group",
"State",
"Region",
"Hospitals",
"Median submitted charge",
"Median Medicare payment",
"Charge / payment ratio"
)
)
```
The map shows that the regional story does not stop at the region boundary. Among states with at least 10 reporting hospitals for this DRG, `r top_state$state_name[[1]]` has a median submitted charge of about `r dollar(top_state$median_charge[[1]])`, while `r bottom_state$state_name[[1]]` is closer to `r dollar(bottom_state$median_charge[[1]])`. That is roughly a `r number(top_state$median_charge[[1]] / bottom_state$median_charge[[1]], accuracy = 0.01)`-to-1 gap for the same DRG, even after dropping tiny state samples.
The more important point is that these state gaps are not simply random noise around a regional average. They suggest that pricing governance is uneven inside the same broad reimbursement environment. For an executive audience, that changes the question from "which regions run hot?" to "which operating environments appear to permit much wider charge inflation than others?"
## Higher Charges Do Not Buy Proportionally Higher Medicare Payments
```{r plot-state-scatter, fig.height=6.2}
ggplot(
state_focus_summary,
aes(
x = median_payment,
y = median_charge,
color = census_region,
size = hospitals
)
) +
geom_point(alpha = 0.82) +
geom_smooth(
data = state_focus_summary,
aes(x = median_payment, y = median_charge),
inherit.aes = FALSE,
method = "lm",
se = FALSE,
linewidth = 0.9,
linetype = "dashed",
color = "#829ab1"
) +
geom_text_repel(
data = state_label_data,
aes(label = rndrng_prvdr_state_abrvtn),
size = 3.2,
box.padding = 0.35,
point.padding = 0.25,
segment.color = "#9fb3c8",
max.overlaps = Inf,
show.legend = FALSE
) +
scale_color_manual(values = region_colors) +
scale_size_continuous(range = c(2.5, 8)) +
scale_x_continuous(labels = label_dollar()) +
scale_y_continuous(labels = label_dollar(scale_cut = cut_short_scale())) +
labs(
title = "States with similar Medicare payments can still look very different on charges",
subtitle = paste0("Each point is a state median for DRG ", focus_code, ". Point size reflects the number of reporting hospitals."),
x = "State median Medicare payment",
y = "State median submitted charge",
color = "Region",
size = "Hospitals"
)
```
If Medicare reimbursement were the main thing driving the charge story, the points above would line up much more tightly. Instead, the relationship is only moderate (Pearson correlation of `r number(charge_payment_correlation, accuracy = 0.01)` across state medians), and states with relatively similar payment medians still separate sharply on the charge axis. That is why the charge variation reads as more than a reimbursement story. Charges do respond to geography and hospital setting, but they also appear to reflect materially different pricing policies layered on top of those fundamentals.
That distinction matters operationally. A payment benchmark that moves within a relatively narrow band can coexist with a much looser charge regime. When that happens, the charge master stops looking like a neutral pricing reference and starts looking more like an internal strategic instrument, one that may affect payer negotiation posture, patient financial counseling, and public transparency perception in different ways.
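The state-level correlation reported above is a Pearson coefficient, which is sensitive to a few extreme states. One way to check that reading, sketched here on invented state medians rather than the CMS values, is to compare it with a rank-based Spearman coefficient using the same base R `cor()` function.

```r
# Invented state medians for five hypothetical states; not CMS values
toy_states <- data.frame(
  median_payment = c(10000, 11000, 12000, 13000, 14000),
  median_charge = c(40000, 90000, 50000, 100000, 60000)
)

# Pearson measures linear association on the dollar scale
pearson_r <- cor(toy_states$median_payment, toy_states$median_charge)

# Spearman measures association between ranks, so outliers carry less weight
spearman_rho <- cor(
  toy_states$median_payment,
  toy_states$median_charge,
  method = "spearman"
)

round(c(pearson = pearson_r, spearman = spearman_rho), 2)
```

If the two coefficients tell a similar story on the real state summary, the "moderate relationship" reading is less likely to be driven by one or two unusual states.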
# What A Stakeholder Should Take From This
Three conclusions stand out from the data:
1. The inconsistency is already large before any cross-service averaging. Once I compare hospitals within the same DRG, the spread is still wide enough to matter.
2. Geography matters, but it is not a complete explanation. The West is generally higher in the focus DRG, yet state-level gaps inside and across regions are still too large to dismiss as a simple regional cost story.
3. The charge system looks less disciplined than the payment system. Medicare payments move, but not nearly as much as submitted charges do. That makes the data look more like a markup structure than a stable market price.
For a hospital system, insurer, or policy audience, that matters because the operational question changes. The issue is not just "which places are expensive?" It becomes "why are listed charges for the same DRG so flexible when reimbursement benchmarks are much tighter?" That is where cost accounting, charge-master governance, contracting strategy, and market structure start to matter.
The findings also point toward a practical prioritization logic. A system trying to review pricing discipline would not need to audit every DRG at once. It could start with high-volume DRGs that show both large hospital-to-hospital spread and high charge-to-payment multiples, especially where payment benchmarks remain relatively compressed. Those are the service lines where charge review is most likely to surface policy choices rather than unavoidable reimbursement differences.
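The prioritization logic above can be sketched as a simple ranking. The table below is invented and only borrows the column names of the DRG-level summary. The 1,000-hospital floor matches the filter used earlier in this report, while adding the two ranks together is just one illustrative way to combine them, not a method the analysis itself applies.

```r
library(dplyr)

# Invented DRG-level summary; codes and values are hypothetical
toy_service_spread <- tibble(
  drg_cd = c("A", "B", "C", "D"),
  hospitals = c(2500, 1800, 400, 1200),
  charge_ratio_90_10 = c(4.1, 2.2, 5.0, 3.8),
  charge_to_payment_ratio = c(5.5, 6.0, 7.2, 3.0)
)

review_queue <- toy_service_spread %>%
  # Keep only high-volume DRGs
  filter(hospitals >= 1000) %>%
  mutate(
    spread_rank = min_rank(desc(charge_ratio_90_10)),
    markup_rank = min_rank(desc(charge_to_payment_ratio)),
    # Lower combined rank means wide spread and high markup together
    combined_rank = spread_rank + markup_rank
  ) %>%
  arrange(combined_rank, desc(hospitals))

print(review_queue)
```

In the toy table, DRG "C" has the widest spread and the highest markup but is dropped for low volume, and DRG "A" moves to the top of the queue because it ranks well on both measures at once.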
# Limitations
This dataset is useful, but it has clear limits, and I do not want to overclaim what it proves.
- These are **submitted covered charges**, not negotiated commercial prices and not the actual out-of-pocket amount a patient pays.
- The data are at the hospital-by-DRG level, so they do not include individual patient severity beyond what is already built into the DRG categories.
- Differences across states could partly reflect cost of living, labor costs, teaching hospitals, case mix, or reporting behavior.
- The analysis is descriptive. It shows variation, but it does not identify a single cause of that variation.
- Some states have much smaller hospital counts for specific DRGs, which is why I used a 10-hospital threshold for the map comparison.
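Because the 10-hospital threshold is a judgment call, a sensitivity check is a natural companion to the last limitation. The sketch below uses an invented state summary and a hypothetical helper, `gap_at_threshold()`, to show how the highest-to-lowest state ratio can shift as the minimum hospital count changes.

```r
library(dplyr)

# Invented state-level summary; states and values are hypothetical
toy_state_summary <- tibble(
  state = c("S1", "S2", "S3", "S4", "S5"),
  hospitals = c(4, 12, 25, 60, 8),
  median_charge = c(150000, 90000, 60000, 45000, 30000)
)

# Hypothetical helper: highest-to-lowest median charge ratio among states
# that meet a minimum hospital count
gap_at_threshold <- function(data, min_hospitals) {
  kept <- filter(data, hospitals >= min_hospitals)
  max(kept$median_charge) / min(kept$median_charge)
}

thresholds <- c(1, 10, 20)
gaps <- sapply(thresholds, function(k) gap_at_threshold(toy_state_summary, k))

data.frame(min_hospitals = thresholds, top_to_bottom_ratio = round(gaps, 2))
```

In this toy example the ratio falls from 5 with no effective threshold to 2 at ten hospitals, which illustrates why small state samples can exaggerate the apparent gap.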
# Conclusion
My main conclusion is that hospital-reported inpatient charges are not behaving like a stable, service-specific market price. Even after narrowing the comparison to the same DRG, hospitals and states still separate into very different charge tiers. In the focus DRG, the spread is large at the national level, and it stays large across regions and across states.
The payment comparison is what makes that result more meaningful. Medicare payments do move across places, but not nearly enough to explain the full spread in submitted charges. The strongest reading is not that every patient faces these prices directly. It is that the charge structure itself is highly inconsistent, which is exactly the kind of pattern that makes transparency hard to interpret, weakens comparability across hospitals, and raises governance questions around how listed prices are being set.
# Next Questions
Two follow-up questions would deepen this analysis further. First, does the same charge-versus-payment pattern hold in a high-volume surgical DRG, or is the current result especially concentrated in medical admissions such as sepsis? Second, how much of the state-level spread survives after adding hospital characteristics such as teaching status, ownership, bed size, or system affiliation? Those are the next layers most likely to distinguish structural cost differences from discretionary pricing behavior.