RESEARCH PAPER

The GPU Rental Market: Price Dispersion and the Cost Curve of Compute

A four-year cross-provider panel of GPU rental prices across 38 cloud providers

Status: Preprint
Date: May 2026
Authors: gerra
Panel: 38 providers, 32 SKUs
01

Abstract

We assemble a normalized, cross-provider panel of GPU rental prices and use it to characterize the structure of the accelerator-rental market. The panel covers 6,216 price observations across 38 cloud providers and 32 GPU SKUs, spanning 2022 to 2026, with prices normalized to canonical SKUs on a per-hour basis.

Two findings stand out. First, price dispersion is extreme: the same accelerator rents for radically different prices across providers, with an 18x range on H100, 96x on A100, and 100x on RTX-4090. The market is fragmented and far from a single clearing price. Second, on the consumer-GPU segment, where we hold a continuous multi-year panel, rental rates declined steadily: between July 2024 and March 2026 the RTX 3090 fell 30% and the RTX 4090 fell 19%. We are deliberate about the data's limits: datacenter-SKU history is still thin, and we report a null result on whether GPU prices yet constitute a tradeable signal for AI-infrastructure equities.

02

Data & Coverage

Prices are collected by polling each provider's public pricing surface (hourly where available) and backfilled from archived snapshots. Heterogeneous instance names are mapped to a canonical SKU set (for example, the many marketplace spellings of an 80GB SXM H100 collapse to one identifier), so that prices are comparable across providers. The current panel:

Dimension            Value
Price observations   6,216 (price > 0)
Providers            38
Canonical SKUs       32
Coverage window      2022-05 to 2026-05
Price basis          USD per GPU-hour

Coverage depth is uneven. Consumer GPUs (RTX-class) have the longest continuous history; datacenter SKUs (H100/H200/B200) are broad across providers in the current cross-section but thin historically. We treat this asymmetry explicitly throughout.
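
The normalization step described in this section, collapsing marketplace spellings to one canonical SKU and expressing prices per GPU-hour, might look like the following minimal sketch. The alias patterns, the canonical identifiers, and the function names are hypothetical illustrations, not the mapping used to build the panel.

```python
import re

# Hypothetical alias patterns, checked in order. The panel's real mapping
# table is not reproduced in the paper, so these are illustrative only.
CANONICAL_PATTERNS = [
    (re.compile(r"h100.*sxm|sxm.*h100"), "H100-SXM-80GB"),
    (re.compile(r"a100.*sxm.*80|a100.*80.*sxm"), "A100-SXM-80GB"),
    (re.compile(r"4090"), "RTX-4090-24GB"),
]


def canonical_sku(raw_name):
    """Return the canonical SKU for a provider's instance name, or None if unmapped."""
    # Lowercase and drop separators so "H100_SXM5-80GB" and "h100 sxm 80gb" compare alike.
    key = re.sub(r"[\s_\-./]+", "", raw_name.lower())
    for pattern, sku in CANONICAL_PATTERNS:
        if pattern.search(key):
            return sku
    return None


def price_per_gpu_hour(instance_price_usd, gpu_count):
    """Convert a whole-instance hourly price to USD per GPU-hour."""
    if gpu_count <= 0:
        raise ValueError("gpu_count must be positive")
    return instance_price_usd / gpu_count


if __name__ == "__main__":
    for name in ["NVIDIA H100 SXM5 80GB", "8x A100-80G SXM4", "RTX 4090", "H100 PCIe"]:
        print(name, "->", canonical_sku(name))
```

Returning None for unmapped names, rather than guessing, keeps unknown variants (here the PCIe H100) out of a canonical SKU until someone reviews them.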

03

Cross-Provider Price Dispersion

Taking the latest observed price per provider for each SKU, we measure how widely the same accelerator is priced across the market. Dispersion is large and pervasive: among SKUs offered by at least three providers, the ratio of the priciest to the cheapest quote routinely exceeds an order of magnitude, and does so for five of the seven most-listed SKUs.

RTX-4090 price range: 100x. Cheapest vs priciest quote for the same RTX-4090, across 10 providers.

A100 SXM price range: 96x. $0.10 to $9.59 per hour across 15 providers for an 80GB SXM A100.

H100 SXM price range: 18x. $1.20 to $22.02 per hour across 21 providers for an 80GB SXM H100.

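SKU               Providers   Cheapest   Median   Priciest   Spread   CV
H100 SXM 80GB     21          $1.20      $2.05    $22.02     18.3x    1.25
B200 SXM 192GB    17          $2.50      $3.50    $62.60     25.0x    1.86
H200 SXM 141GB    16          $1.45      $2.30    $13.25     9.1x     0.88
A100 SXM 80GB     15          $0.10      $1.22    $9.59      96.5x    1.26
RTX 5090 32GB     13          $0.25      $0.50    $1.20      4.8x     0.49
RTX 4090 24GB     10          $0.16      $0.31    $16.00     100x     2.62
L40S 48GB         9           $0.60      $0.87    $12.31     20.5x    1.71

Prices are in USD per GPU-hour. Spread is the priciest quote divided by the cheapest; CV is the coefficient of variation.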

Figure: Cheapest-to-priciest spread, by SKU (multiple). RTX 4090 100x; A100 SXM 96x; B200 SXM 25x; L40S 20x; H100 SXM 18x; H200 SXM 9x; RTX 5090 5x.

The coefficient of variation (standard deviation of provider quotes divided by their mean) sits between 0.5 and 2.6 for the most-listed SKUs. Several forces drive this: there is no central exchange for compute, list prices mix on-demand and spot/marketplace terms, regional supply varies, and reliability and interconnect quality differ widely between providers. For a buyer, the practical implication is that naive single-provider procurement leaves large savings on the table; for an analyst, the dispersion itself is a measurable feature of an immature market.
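
The per-SKU dispersion statistics can be reproduced from one latest quote per provider. The sketch below is a minimal illustration: the function name and the provider labels are hypothetical, and the use of the sample standard deviation for the CV is an assumption, since the paper does not state which estimator it uses.

```python
import statistics


def dispersion_stats(latest_price_by_provider, min_providers=3):
    """Summarize cross-provider dispersion for one SKU.

    latest_price_by_provider maps provider name -> latest USD per GPU-hour.
    Returns None when fewer than min_providers positive quotes exist.
    """
    prices = [p for p in latest_price_by_provider.values() if p > 0]
    # stdev needs at least two points, whatever threshold the caller asks for.
    if len(prices) < max(min_providers, 2):
        return None
    return {
        "providers": len(prices),
        "cheapest": min(prices),
        "median": statistics.median(prices),
        "priciest": max(prices),
        # Spread: priciest quote as a multiple of the cheapest.
        "spread": max(prices) / min(prices),
        # Assumption: sample standard deviation divided by the mean.
        "cv": statistics.stdev(prices) / statistics.fmean(prices),
    }


if __name__ == "__main__":
    # Toy input: three quotes reusing the H100 SXM cheapest, median, and
    # priciest figures from the table. The real panel has 21 providers.
    quotes = {"provider_a": 1.20, "provider_b": 2.05, "provider_c": 22.02}
    print(dispersion_stats(quotes))
```

Because the spread depends only on the two extreme quotes, a single outlier listing can dominate it; the CV and the median give a fuller picture of where most providers sit.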

04

The Cost Curve of Compute

On the consumer-GPU segment, where we hold a continuous panel from a stable provider set, rental rates fell steadily over the observation window:

SKU             Jul 2024    Mar 2026    Change
RTX 3090 24GB   $0.150/hr   $0.105/hr   -30%
RTX 4090 24GB   $0.296/hr   $0.240/hr   -19%

We flag one artifact: in the most recent month the median ticks up, but that reflects a widening of the tracked provider set (more, and often pricier, providers enter the panel) rather than a true price increase. Read within a fixed provider set, the trend is a clean, gradual decline, consistent with hardware depreciation and growing supply on older accelerators.
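
The composition effect can be made concrete by comparing the median change over all providers with the change over only those present in both periods. This is a minimal sketch with invented quotes and hypothetical provider names; it is not the paper's data or its exact procedure.

```python
import statistics


def median_change_pct(prices_start, prices_end, fixed_set=True):
    """Percent change in the median price between two periods.

    Each argument maps provider -> USD per GPU-hour. With fixed_set=True,
    only providers quoted in both periods are compared.
    """
    if fixed_set:
        common = prices_start.keys() & prices_end.keys()
        start = [prices_start[p] for p in common]
        end = [prices_end[p] for p in common]
    else:
        start = list(prices_start.values())
        end = list(prices_end.values())
    if not start or not end:
        raise ValueError("no providers to compare")
    m0 = statistics.median(start)
    m1 = statistics.median(end)
    return (m1 - m0) / m0 * 100.0


if __name__ == "__main__":
    # Invented quotes: three incumbents get slightly cheaper while three
    # pricier entrants (p4..p6) join the panel in the latest month.
    prev = {"p1": 0.25, "p2": 0.23, "p3": 0.26}
    latest = {"p1": 0.24, "p2": 0.23, "p3": 0.25,
              "p4": 0.60, "p5": 0.55, "p6": 0.50}
    print("all providers:", median_change_pct(prev, latest, fixed_set=False))
    print("fixed set:    ", median_change_pct(prev, latest, fixed_set=True))
```

In this toy input the all-provider median rises while the fixed-set median falls, which is the pattern the artifact describes.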

05

Inference Token Economics (Preliminary)

Alongside rental prices we track hosted-inference token pricing. Among the cheapest available models, the output price per million tokens fell from roughly $0.13 in 2024 to the $0.01-$0.03 range in 2025-2026. We label this preliminary: the sample mixes very different model tiers (small open models vs frontier-scale hosted models), so the level is noisy. The direction, a steep decline in the price of the cheapest intelligence, is robust; precise per-tier curves await the inference-provider API coverage described in Limitations.

06

From Prices to Equities: An Honest Null

A natural question is whether this panel predicts the AI-infrastructure equity complex (NVDA, AMD, AVGO, TSM, ASML, CoreWeave, and peers). We built cross-provider price and dispersion features and trained walk-forward models against forward equity returns. Under naive evaluation the backtest looked strong. Under rigorous evaluation it does not survive.

Random-shuffle test p-value: ~1.0

Against a label-shuffled null and a naive-momentum baseline, with purged k-fold cross-validation and embargo, the signal's apparent edge disappears. Verdict: not_ready.
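
A label-shuffle test of this kind can be sketched as follows. The scoring metric is an assumption: the sketch scores predictions by their Pearson correlation with forward returns, whereas the paper does not specify the statistic it shuffled. Function names and defaults are hypothetical.

```python
import random
import statistics


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant sequences."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5


def shuffle_p_value(predictions, forward_returns, n_shuffles=1000, seed=0):
    """Share of label shuffles whose score is at least the observed score."""
    rng = random.Random(seed)
    observed = pearson(predictions, forward_returns)
    shuffled = list(forward_returns)
    hits = 0
    for _ in range(n_shuffles):
        # Shuffling the labels breaks any real link to the predictions.
        rng.shuffle(shuffled)
        if pearson(predictions, shuffled) >= observed:
            hits += 1
    # Add-one correction so the estimate is never exactly zero.
    return (hits + 1) / (n_shuffles + 1)
```

A p-value near 1.0 under this construction means that almost every shuffled labeling scores at least as well as the real one, so the observed score carries no evidence of predictive content.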

We report this deliberately. The headline out-of-sample Sharpe from a naive split is, in our own assessment, almost certainly an artifact of fitting a sustained 2024-2026 AI uptrend rather than genuine predictive content. A tradeable verdict would require (1) twelve-plus months of forward-collected datacenter-SKU history, (2) at least one non-trending regime to test stability, and (3) a demand-side token-throughput signal. The descriptive value of the panel is real today; the alpha claim is not, and we will not pretend otherwise.
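
To show how a purged split with an embargo differs from a naive one, the following minimal sketch generates train and test indices for time-ordered samples. It assumes each label looks ahead a fixed number of steps; the function name, the parameters, and that fixed-horizon assumption are illustrative and not taken from the evaluation code behind these results.

```python
def purged_kfold(n_samples, n_splits=5, horizon=1, embargo=0):
    """Yield (train_idx, test_idx) pairs for time-ordered samples.

    horizon: steps each label looks ahead (the label at t uses data through
             t + horizon), so nearby samples share information.
    embargo: extra samples dropped from training right after each test block.
    """
    fold_size, remainder = divmod(n_samples, n_splits)
    start = 0
    for k in range(n_splits):
        stop = start + fold_size + (1 if k < remainder else 0)
        test_idx = list(range(start, stop))
        # Purge before the block: labels starting at or after start - horizon
        # reach into the test period.
        left_cut = start - horizon
        # Purge after the block: test labels extend to stop - 1 + horizon,
        # then the embargo removes a further buffer.
        right_cut = stop + horizon + embargo
        train_idx = [i for i in range(n_samples) if i < left_cut or i >= right_cut]
        yield train_idx, test_idx
        start = stop


if __name__ == "__main__":
    for train, test in purged_kfold(20, n_splits=4, horizon=2, embargo=1):
        print("test", test[0], "-", test[-1], "train size", len(train))
```

A naive split would keep the samples adjacent to each test block in training, letting overlapping label windows leak test-period returns into the fit; purging and the embargo remove exactly those samples.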

07

Limitations

Three limits bound these results. Datacenter-SKU history (H100/H200/B200) is broad in the cross-section but short in time, so the cost-curve analysis leans on consumer GPUs. The provider panel changes composition over time, which shifts medians unless read within a fixed set. And inference token pricing mixes model tiers. Each is a data-coverage gap, addressable with deeper collection, not a flaw in the method.

08

Conclusion

The GPU rental market is fragmented and inefficient, with order-of-magnitude price dispersion for identical accelerators, while the underlying cost of compute deflates steadily. As a descriptive resource the panel is already valuable to compute buyers and infrastructure analysts. As a trading signal it is not yet validated, and we hold it to that honest standard until the data is deep enough to prove otherwise.