May 20, 2026

AI Compute and Neocloud Providers 2026: Vendor Comparison

Ten AI compute providers compared on chip access, power footprint, contract terms, and procurement fit. CoreWeave, Crusoe, Lambda, Nebius, Together, Fluidstack, Voltage Park, RunPod, Vast.ai, plus hyperscaler baseline.

We The Flywheel Research & Analysis

Published May 20, 2026

Compute is where most of the money in an AI programme actually goes, but the procurement model is closer to building a power plant than to buying SaaS. Ten providers split across five lanes, none of them publishes comparable headline pricing, and the binding constraint (energy) has nothing to do with GPUs. This is the buyer-side guide: how the lanes split, which vendor wins which lane, and the questions to ask before signing a reserved-capacity contract.

Key takeaways

Five lanes: hyperscaler-managed (AWS, Azure, GCP) for procurement-friendly enterprise; tier-one neocloud (CoreWeave, Crusoe, Lambda, Nebius) for scale and price; full-stack compute-plus-inference (Together) for one-vendor pipelines; spot and marketplace (RunPod, Vast.ai) for research; specialty (Voltage Park, Fluidstack) for niche fit.
What changes in 2026: energy and power-purchase agreements are the real constraint. CoreWeave's IPO and multi-year Microsoft anchor contract proved the model; the next eighteen months are a land grab for B200 and GB200 supply.
What buyers underweight: networking topology and storage tier. InfiniBand fabric quality, NVLink domain size, and parallel-filesystem throughput often matter more than per-GPU hourly rate on training runs above 256 GPUs.
What buyers overweight: hourly rate. Real cost includes egress, storage, networking, idle premiums on reserved capacity, and procurement-cycle overhead. The sticker-price gap between hyperscaler and neocloud narrows by 30 to 50 percent once those are loaded in.

10 Providers in this comparison

5 Distinct lanes (hyperscaler, tier-one neocloud, full-stack, spot, specialty)

~$50B Estimated 2025 neocloud capex across the named operators (company disclosures and industry estimates)

2–3x Hyperscaler list price for H100 GPU-hour versus tier-one neocloud rate

The five lanes

The lane split is the first cut. Pick the lane and two-thirds of the comparison work disappears.

1. Hyperscaler-managed

AWS, Azure, and GCP. The procurement-friendly default if you already have a master agreement, a sovereign-region requirement, or a compliance perimeter that has to include compute. List price for current-generation GPUs runs 2 to 3x the tier-one neocloud rate. Enterprise discounts, Savings Plans, and integration savings close part of that gap. Pick this lane when the chip-level price-performance is not the only criterion and procurement reality dominates the decision.

2. Tier-one neocloud

CoreWeave, Crusoe, Lambda Labs, Nebius. Purpose-built GPU fleets, non-blocking InfiniBand fabric, parallel filesystems, and a commercial posture that puts reserved capacity ahead of on-demand. This is the category that emerged to serve the frontier labs and was validated when CoreWeave's IPO held up post-listing. Any training programme above a few hundred GPUs with a procurement team and a real reservation budget belongs here.

3. Full-stack compute plus inference

Together AI sits on its own. The unusual shape is compute plus training plus fine-tuning plus dedicated inference under one vendor, with the broadest OSS-model coverage in the category. Use it when the team wants one relationship across the lifecycle instead of three or four separate ones.

4. Spot and marketplace

RunPod and Vast.ai. Per-second billing, marketplace price discovery, no minimum commitments, and the cheapest hourly rate available. Fine for research, ablations, and bursty inference. A bad idea for any training run that has to finish on a deadline. Treat them as the bottom rung of a tiered strategy, not the layer your production work depends on.

5. Specialty

Fluidstack and Voltage Park. Smaller footprints, narrower fit, real differentiation in their lanes. Fluidstack for mid-scale European deployments where geography matters and the tier-ones are sized too large. Voltage Park for research and non-profit training capacity allocated through a grant model rather than a commercial cycle.

Energy, not silicon, is the constraint

The 2026 story is power, not chips. A modern training campus draws 50 to 200 MW at full utilisation; a single GB200 NVL72 rack pulls around 120kW. Vendors with multi-year power-purchase agreements, behind-the-meter generation, or pre-cleared utility interconnects have inventory they can actually sell. Vendors without those are stuck waiting on interconnect studies that run 18 to 36 months.

Which is why Crusoe scores so well on the energy axis (greenfield siting on stranded power, early-footprint flared-gas deployments) and why CoreWeave's pre-2024 PPA position translated directly into the anchor-tenant contracts that funded the IPO. Some of the late-2024 entrants will struggle to deliver on 2026 commitments for the same reason: chips can be on a truck inside a quarter, but a substation cannot.
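A rough way to sanity-check a capacity claim is to turn the megawatt figure into racks. The sketch below is illustrative only: it takes the 120kW-per-rack and 50 to 200 MW figures from the text, treats the campus number as total facility draw, and assumes a PUE of 1.3, which is a placeholder and not a figure from any provider.

```python
import math

RACK_KW = 120.0        # per-rack draw quoted in the text for a GB200 NVL72
GPUS_PER_RACK = 72     # an NVL72 rack holds 72 GPUs
ASSUMED_PUE = 1.3      # ASSUMPTION: facility overhead factor, not from the article


def racks_supported(campus_mw: float, pue: float = ASSUMED_PUE) -> int:
    """Whole NVL72 racks a campus can power after facility overhead."""
    it_power_kw = campus_mw * 1000.0 / pue   # power left for IT load
    return math.floor(it_power_kw / RACK_KW)


if __name__ == "__main__":
    for mw in (50, 200):
        racks = racks_supported(mw)
        print(f"{mw} MW -> {racks} racks, {racks * GPUS_PER_RACK} GPUs")
```

Under those assumptions a 50 MW campus powers roughly 320 NVL72 racks, or about 23,000 GPUs. A vendor quoting materially more GPUs than its contracted megawatts support is worth a follow-up question.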

Chip generations: what to ask for and when

Four generations matter in 2026: H100, H200, B200, and GB200. The procurement criteria differ across them.

H100 is commodity-priced and broadly available. Fine for most production training and inference where the model fits in 80GB of HBM and the throughput is enough. The hourly rate has fallen from $8 in early 2024 to under $2 across the tier-one neoclouds in early 2026, and is still falling.

H200 raises memory to 141GB of HBM3e (up from 80GB on the H100), which matters for larger models and for inference batch sizes that push memory. Available across the named providers; still rationed at the largest reserved commitments.
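Whether "the model fits" can be estimated before any sales call. This is a minimal sketch for inference sizing that counts weights only; it ignores KV cache, activations, and optimizer state, and the 80 percent usable-memory headroom is an assumption chosen for illustration. The HBM capacities are the ones quoted in the text.

```python
import math

HBM_GB = {"H100": 80, "H200": 141}   # per-GPU capacities quoted in the text


def weights_gb(params_billion: float, bytes_per_param: int = 2) -> float:
    """Weight footprint in GB (1 GB = 1e9 bytes); 2 bytes per param = 16-bit."""
    return params_billion * bytes_per_param


def min_gpus_for_weights(params_billion: float, chip: str,
                         bytes_per_param: int = 2,
                         headroom: float = 0.8) -> int:
    """Fewest GPUs whose usable HBM holds the weights.

    headroom is an ASSUMPTION: the fraction of HBM left for weights after
    reserving space for KV cache and runtime buffers.
    """
    usable_gb = HBM_GB[chip] * headroom
    return math.ceil(weights_gb(params_billion, bytes_per_param) / usable_gb)


if __name__ == "__main__":
    for chip in HBM_GB:
        print(chip, min_gpus_for_weights(70, chip))
```

For a hypothetical 70B-parameter model in 16-bit precision, the weights alone are 140 GB, so this estimate asks for three H100s but only two H200s. That is the kind of difference that justifies paying for the larger-memory part.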

B200 is the Blackwell-architecture successor with significantly higher training throughput and inference efficiency. Supply through 2026 is anchored to a handful of customers, with CoreWeave taking early allocations and the hyperscalers following on their own ramps (Azure ND GB200 v6, AWS P6, GCP A4).

GB200 NVL72 is the rack-scale product (72 Blackwell GPUs plus 36 Grace CPUs in a single NVLink domain) and is the system behind the largest 2026 training builds. Reserved capacity only, multi-quarter lead times, and a procurement profile that looks more like buying a data centre than like buying compute.

Scored comparison

The scoring rubric: lane positioning, strongest capability, chip availability (H100, H200, B200 / GB200, and AMD), network fabric, parallel filesystem options, multi-region footprint, pricing model, minimum engagement, and compliance footprint. Twelve axes across ten providers.

Feature	CoreWeave	Crusoe	Lambda Labs	Nebius	Together AI	Fluidstack	Voltage Park	RunPod	Vast.ai	Hyperscalers (AWS / Azure / GCP)
Lane and positioning
Primary lane	Tier-one neocloud; frontier-lab scale	Tier-one neocloud; energy-integrated	Tier-one neocloud; developer-first	Tier-one neocloud; geographic-diversified	Full-stack: compute plus training plus inference	Specialty; mid-scale neocloud	Specialty; mission-driven, grant-allocated training capacity	Spot and marketplace; serverless GPU	Pure marketplace; lowest-cost spot	Hyperscaler-managed; enterprise procurement
Strongest at	Frontier training runs; large reserved contracts	Greenfield power; flared-gas and stranded-energy siting	On-demand and small reserved; developer ergonomics	EU + US footprint; Nasdaq-listed governance	OSS-model fine-tuning; one-vendor pipelines	Mid-scale European deployments	Heavily discounted training for research and non-profit	Serverless inference; per-second billing	Cheapest H100 hour on the market; research workloads	Enterprise compliance, integrated storage, sovereign regions
Chip access and inventory
H100 availability (May 2026)	Abundant; reserved and on-demand	Abundant; reserved	Abundant; on-demand and reserved	Abundant; multi-region	Abundant; pooled across underlying vendors	Available; on-demand and reserved	Available; allocated by research grant model	Available; spot and on-demand	Cheapest market price; mixed inventory quality	Available; on-demand and reserved instances
H200 availability	Available; allocations to anchor tenants	Available; ramping	Limited; waitlist for non-reserved	Available; multi-region ramp	Limited; via underlying partners	Limited	Not the focus	Limited spot pools	Some marketplace listings	GA on Azure ND H200 v5; AWS P5e; GCP A3 Ultra
B200 / GB200 availability	Early allocations; anchor-tenant priority	Coming online late 2026	Reservations open; allocation deferred	Reservations open	Not yet	Not yet	Not yet	Not yet	Not yet	Azure ND GB200 v6; AWS P6; GCP A4 ramp through 2026
AMD MI300X / MI325X	Limited	Limited	Available	Limited	Via partner network	Limited	No	MI300X on RunPod	Sparse listings	Azure ND MI300X v5; GCP HPC pools
Network and storage
InfiniBand fabric	Full non-blocking 3.2Tbps NDR	Full non-blocking NDR	Non-blocking NDR on reserved clusters	Full non-blocking NDR	Inherits from underlying partners	Non-blocking NDR	Non-blocking NDR	Per-pod; not cross-pod	Ethernet only	EFA / IB / NVLink Switch by SKU
Parallel filesystem options	VAST Data, Lustre, WEKA	VAST Data, WEKA	Lustre, WEKA	Lustre, NVMe pools	Inherits from underlying partners	WEKA, NVMe	Lustre	S3-compatible only	Local NVMe + S3-compatible	FSx for Lustre / Azure Managed Lustre / GCS
Multi-region	US + EU primary, expanding	Multi-region US, EU coming	US primary	EU primary; US and Israel	US + EU via partners	US + EU + UK	US single-region	Multi-region	Global but uneven	Global region map; sovereign regions for EU and AU
Procurement and contracts
Pricing model	Sales-led; reserved-capacity-first	Sales-led	Public on-demand + sales-led reserved	Sales-led + listed on-demand	Public per-token and per-GPU-hour	Sales-led	Grant / non-profit allocation	Public per-second pricing	Public marketplace bidding	Public on-demand; sales-led for reserved / Savings Plans
Minimum engagement	Mid-market to frontier	Mid-market to enterprise	Self-serve to enterprise	Self-serve to enterprise	Self-serve to enterprise	Mid-market	Application-based	Self-serve, no minimum	Self-serve, no minimum	Self-serve to global enterprise
Compliance footprint	SOC 2, HIPAA, ISO 27001	SOC 2, ISO 27001	SOC 2	SOC 2, ISO 27001, EU GDPR posture	SOC 2, HIPAA	SOC 2	Limited	SOC 2	Marketplace; per-host varies	FedRAMP / IL5 / sovereign; full enterprise stack

The verdict by lane

Same data, organised by lane and recommendation. Most production AI programmes end up with two relationships: a tier-one neocloud for training reservations, and a hyperscaler for the parts of the pipeline that have to sit inside the enterprise compliance perimeter. A third relationship (marketplace for research) is common but optional.

Recommended for frontier training and large reserved contracts

CoreWeave. The default for any project at frontier-lab scale. Multi-year anchor-tenant contracts with Microsoft and OpenAI validated the model and the IPO funded the next round of build. Best-in-class InfiniBand fabric, the deepest H100 pool in the category, and early B200 allocations. Tax: reserved-capacity-first commercial posture; on-demand exists but is not where the value is.
Crusoe. The energy-arbitrage play. Siting strategy has favoured stranded power and behind-the-meter generation across the early footprint, and the AI-era builds layer in long-dated power contracts on top. The mix delivers lower marginal cost and stronger PUE than most peers. Strong on-prem-style cluster ergonomics. Tax: smaller footprint than CoreWeave; commercial team is leaner; reserved capacity is the route in.
Nebius. The geographic-diversified pick. Strong EU presence with US and Israel build-outs, Nasdaq-listed governance with an Amsterdam headquarters, and a multi-region story that matters for data-residency-bound workloads. Engineering culture inherited from Yandex's research compute. Tax: smaller US footprint than CoreWeave or Lambda; some procurement teams need extra cycles to underwrite the parent entity.

Recommended for mid-scale and developer-driven teams

Lambda Labs. The developer-first pick. Strong on-demand pricing, a clean dashboard, and the shortest path from sign-up to a running training job for teams on 8 to 256 GPUs. Public catalogue, transparent pricing, and the deepest community among the named providers. Tax: US-primary footprint; reserved capacity for the very largest runs tends to go to the larger tier-one neoclouds first.
Together AI. The full-stack pick. Compute plus model hosting plus fine-tuning plus dedicated inference endpoints from one vendor, with the strongest OSS-model coverage. Right when the training and serving pipelines should not be split across two vendors. Tax: the compute layer is pooled across underlying partners, which means networking topology varies; ask explicitly about the cluster you will get.
Fluidstack. The mid-scale European option. UK and EU footprint, mid-market commercial posture, and a track record of delivering 32 to 256 GPU clusters with non-blocking InfiniBand at reasonable lead times. Tax: smaller than the tier-one neoclouds; capacity at the very high end can be paced.

Recommended for spot, research, and bursty workloads

RunPod. The serverless option with the cleanest developer ergonomics in this lane. Per-second billing, AMD MI300X access alongside NVIDIA, fast cold-starts, and an inference path that scales to zero. Right for research, ablations, and bursty inference where reserved capacity would sit idle. Tax: not the right shape for week-long training runs that need a stable cluster.
Vast.ai. The cheapest H100 hour on the market, full stop. Pure marketplace model surfaces consumer GPUs alongside datacentre listings; the price discovery is genuine and the savings are real for the right workload. Tax: per-host variance is high, networking is Ethernet-only, and the marketplace shape means production-grade SLAs are not the product.
Voltage Park. The mission-driven research pick. Funded by Jed McCaleb's Navigation Fund and structured to allocate heavily-discounted training capacity through an application-and-grant process rather than a commercial sales cycle. Right for academic and high-impact research workloads. Tax: not a commercial procurement route; expect an application and review cycle rather than an SOW.

Recommended for enterprise procurement and compliance

Hyperscalers (AWS, Azure, GCP). The procurement-friendly default. Existing master agreements, sovereign-region coverage, FedRAMP / IL5 footprint, integrated storage and networking, and a single bill that covers compute alongside everything else. Right when the organisation cannot or will not stand up a second cloud-vendor relationship. Tax: list price for current-generation GPUs runs 2 to 3x the tier-one neocloud rate; some of that gap closes once enterprise discounts, Savings Plans, and integration savings are loaded in.

The six-step procurement playbook

The mechanics that separate working procurement from the deck-led version most teams settle for.

1. Specify the workload before the first sales call. Model class and size, training horizon, batch shape, parallelism strategy, expected utilisation. Without these, every vendor anchors the conversation to a generic cluster spec and the negotiation runs on the wrong dimensions.
2. Shortlist three providers per lane. Not five and not one. Three forces real differentiation; three preserves negotiating leverage on the production contract.
3. Benchmark on a representative job. Run a short, identical, paid benchmark across the shortlist. The metrics that matter are tokens-per-second per GPU, all-reduce bandwidth at the parallelism shape you actually use, and storage-side throughput from a realistic training-data layout. Sales decks are not predictive of production behaviour.
4. Validate the network topology. Ask for the InfiniBand topology diagram, the NVLink-domain layout, and a real-world all-reduce result at the cluster size you plan to use. Vendors who answer cleanly have mature infrastructure. Vendors who deflect often inherit topology from underlying partners and cannot tell you what you will actually get.
5. Negotiate around the right axes. Hourly rate is one. The others are storage, networking egress, idle premium on reserved capacity, ramp schedule, and the off-ramp clause for the back end of the contract. The off-ramp is the most-skipped axis and the one that hurts most when it is missing.
6. Build the tiered strategy explicitly. A tier-one neocloud reservation for training, a hyperscaler relationship for the compliance-bound parts of the pipeline, and a marketplace or spot account for research and ablations. Document which workload goes where and why, so the next planning cycle does not relitigate the same decisions.
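The benchmark step is easier to act on when every run is reduced to the same two numbers: throughput per GPU and cost per token. The sketch below shows one way to do that. The `BenchmarkRun` type, the provider names, and every figure in `runs` are hypothetical placeholders, not measurements from any vendor.

```python
from dataclasses import dataclass


@dataclass
class BenchmarkRun:
    provider: str            # hypothetical label
    gpus: int
    tokens_processed: float
    wall_seconds: float
    usd_per_gpu_hour: float

    @property
    def tokens_per_sec_per_gpu(self) -> float:
        return self.tokens_processed / self.wall_seconds / self.gpus

    @property
    def usd_per_million_tokens(self) -> float:
        gpu_hours = self.gpus * self.wall_seconds / 3600.0
        return gpu_hours * self.usd_per_gpu_hour / (self.tokens_processed / 1e6)


# Made-up numbers: same job, same cluster size, one hour each.
runs = [
    BenchmarkRun("provider_a", 64, 1.152e9, 3600, 2.00),
    BenchmarkRun("provider_b", 64, 0.9216e9, 3600, 1.80),
]

best = min(runs, key=lambda r: r.usd_per_million_tokens)

if __name__ == "__main__":
    for r in runs:
        print(f"{r.provider}: {r.tokens_per_sec_per_gpu:.0f} tok/s/GPU, "
              f"${r.usd_per_million_tokens:.3f} per 1M tokens")
    print("cheapest per token:", best.provider)
```

In this invented example the provider with the higher hourly rate is cheaper per token because its throughput is higher, which is why the benchmark comes before the rate negotiation.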

When to combine providers

Production AI programmes converge on multi-provider strategies. The combinations that work in practice:

CoreWeave or Nebius for training reservations + AWS or Azure for the enterprise envelope. The frontier-plus-enterprise pattern. Reserved neocloud capacity for the training run, hyperscaler footprint for storage, identity, and the compliance-bound surfaces.
Lambda or Crusoe for mid-scale training + RunPod for research and bursty inference. Mid-scale-plus-spot. Right for teams running 64 to 512 GPU training jobs without a frontier-scale reservation, with research and ablation work routed to per-second billing.
Together AI for end-to-end OSS-model work. The one-vendor pattern. Compute plus fine-tuning plus dedicated inference endpoints, with the strongest OSS-model coverage among the named providers. Right when splitting the pipeline across vendors costs more in integration than it saves on per-GPU-hour.
Voltage Park for grant-allocated research + a commercial provider for the production work. Research-plus-production. Common shape in academic labs and AI-for-science programmes where part of the work qualifies for grant-allocated capacity and part of it does not.

Frequently asked questions

What is a neocloud?

A purpose-built cloud provider optimised for AI workloads, primarily large-scale GPU training and high-throughput inference. The defining traits in 2026 are NVIDIA-aligned fleets at scale, full non-blocking InfiniBand fabric, parallel filesystems for training data, and multi-year reserved-capacity contracts with anchor tenants. CoreWeave, Crusoe, Lambda Labs, and Nebius are the canonical examples. The neocloud category emerged from the gap between hyperscaler list prices and the unit economics that frontier-lab training requires, and was validated by CoreWeave's 2025 IPO.

How do CoreWeave and Lambda Labs compare?

Both are tier-one neoclouds but they serve different jobs. CoreWeave is reserved-capacity-first, with multi-year anchor-tenant contracts (Microsoft, OpenAI) shaping the commercial posture and an InfiniBand fabric built for frontier training runs. Lambda is developer-first, with public catalogue pricing, clean on-demand ergonomics, and the deepest community of practitioners running 8 to 256 GPU jobs. If the project is frontier-scale with a procurement team and a reserved budget, CoreWeave. If the project is developer-led, mid-scale, and benefits from self-serve, Lambda.

Is a neocloud cheaper than a hyperscaler?

On per-GPU-hour list price for current-generation NVIDIA chips, hyperscaler list rates run roughly 2 to 3x the neocloud rate, which makes the neocloud 50 to 67 percent cheaper on paper. The real comparison loads in storage, networking egress, integration cost, procurement-cycle overhead, and the enterprise discounts that hyperscalers offer on large commitments. Net of all of those, the savings on a fully-loaded multi-year contract typically end up in the 30 to 60 percent range rather than at the list-price figure, but the gap is still material on any workload where compute dominates total cost.
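The loading can be written down as arithmetic. The sketch below is one simple model, not a standard formula: it adds per-GPU-hour extras (storage, egress, integration) to the negotiated rate and divides by utilisation, on the assumption that reserved capacity is billed whether or not it is used. Every number in the example is hypothetical.

```python
def loaded_usd_per_useful_gpu_hour(rate: float,
                                   extras_per_gpu_hour: float,
                                   utilisation: float) -> float:
    """Cost per GPU-hour of useful work.

    rate                 negotiated hourly rate after any discount
    extras_per_gpu_hour  storage, egress, integration, amortised per GPU-hour
    utilisation          fraction of paid hours doing useful work (0 < u <= 1)
    """
    if not 0 < utilisation <= 1:
        raise ValueError("utilisation must be in (0, 1]")
    return (rate + extras_per_gpu_hour) / utilisation


# Hypothetical inputs for illustration only.
neocloud = loaded_usd_per_useful_gpu_hour(2.00, 0.40, 0.80)
hyperscaler = loaded_usd_per_useful_gpu_hour(6.00 * (1 - 0.25), 0.30, 0.90)

list_saving = 1 - 2.00 / 6.00
loaded_saving = 1 - neocloud / hyperscaler

if __name__ == "__main__":
    print(f"list-price saving:   {list_saving:.0%}")
    print(f"fully-loaded saving: {loaded_saving:.0%}")
```

With these invented inputs a 67 percent list-price saving shrinks to about 44 percent once the discount, extras, and idle capacity are counted. The point is the shape of the calculation; substitute quoted figures from the actual term sheets.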

What is the difference between an H100, an H200, a B200, and a GB200?

H100 is NVIDIA's previous-generation training and inference GPU, shipping in volume since late 2022 and now effectively commodity-priced across the neocloud market. H200 is a memory-upgraded H100 with 141GB of HBM3e, shipping since mid-2024 and still rationed at the highest tiers. B200 is the Blackwell-architecture successor with significantly higher training throughput and inference efficiency; supply through 2026 is anchored to a handful of customers. GB200 NVL72 is the rack-scale Blackwell product (72 Blackwell GPUs plus 36 Grace CPUs in a single NVLink domain) and is the system behind the largest 2026 training builds. Procurement priority should match model size and training horizon: H100 for most production work, H200 for inference of larger models, B200 / GB200 reserved capacity for frontier training programmes that need it.

Why is energy the binding constraint?

The gating factor on 2026 capacity is not GPU supply, it is the megawatts to run them. A modern training campus draws 50 to 200 MW at full utilisation; a single GB200 NVL72 rack draws around 120kW. Power-purchase agreements, grid interconnects, and substation capacity are now multi-year projects that have to be in motion years before the chips arrive. Vendors who locked up power in 2023 and 2024 own 2026's capacity; vendors who tried to lock it up in late 2025 are queued behind utility interconnect studies. Energy strategy is the real moat in this market, which is why Crusoe (stranded gas, behind-the-meter renewables) and CoreWeave (long-dated PPAs) score so highly on it.

When should I use a marketplace provider like Vast.ai or RunPod?

For research, ablations, bursty inference, and any workload where the cost-per-hour matters more than the cluster-quality guarantees. Both deliver real savings on suitable workloads. Neither is the right shape for a week-long training run that has to finish on a deadline, because the underlying inventory shifts and the SLA posture does not match production training requirements. Use them as the bottom rung of a tiered compute strategy: marketplace for experimentation, tier-one neocloud reservations for training, hyperscaler for the parts of the pipeline that have to live inside the enterprise compliance perimeter.
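The tiered strategy amounts to a small routing rule, and writing it down makes the "which workload goes where" decision auditable. This is a deliberately simplified sketch of the rule described above; the function name and the three boolean inputs are assumptions made for illustration, and a real policy would carry more criteria.

```python
def route_workload(compliance_bound: bool,
                   is_training: bool,
                   deadline_bound: bool) -> str:
    """Map a workload to a rung of the tiered compute strategy."""
    if compliance_bound:
        # Must sit inside the enterprise compliance perimeter.
        return "hyperscaler"
    if is_training and deadline_bound:
        # Needs a stable cluster for the whole run.
        return "tier-one neocloud reservation"
    # Research, ablations, bursty inference.
    return "marketplace or spot"


if __name__ == "__main__":
    print(route_workload(False, True, True))
    print(route_workload(False, False, False))
    print(route_workload(True, True, True))
```

Checking compliance first reflects the ordering in the text: a workload that has to live inside the perimeter goes to the hyperscaler regardless of price.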

How do I evaluate networking topology before signing?

Ask three specific questions. First, what is the fabric (NVIDIA NDR InfiniBand, Spectrum-X Ethernet, AWS EFA, GCP Jupiter, Azure HPC Ethernet) and is it non-blocking inside a pod? Second, what is the NVLink-domain size (eight GPUs for standard HGX, seventy-two for NVL72) and how does cross-domain traffic route? Third, what is the storage-side throughput (parallel filesystem, NVMe pool, S3-compatible) and what is the peak read bandwidth from a 128-GPU job? Vendors who answer cleanly across all three are the ones with mature training infrastructure. Vendors who deflect on any of the three are usually inheriting topology from an underlying partner.
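When a vendor does hand over an all-reduce result, it helps to normalise it so that runs at different GPU counts are comparable. The sketch below uses the bus-bandwidth convention reported by the nccl-tests benchmarks, where all-reduce bus bandwidth is the algorithm bandwidth scaled by 2(n-1)/n. The function name and the example message size and timing are made up for illustration.

```python
def allreduce_bus_bandwidth_gbs(message_bytes: float,
                                seconds: float,
                                ranks: int) -> float:
    """Bus bandwidth in GB/s for an all-reduce of message_bytes over ranks GPUs.

    algbw = bytes / time; busbw = algbw * 2 * (n - 1) / n, so the result is
    comparable across different rank counts.
    """
    if ranks < 2:
        raise ValueError("all-reduce needs at least two ranks")
    if seconds <= 0:
        raise ValueError("seconds must be positive")
    algbw_gbs = message_bytes / seconds / 1e9
    return algbw_gbs * 2 * (ranks - 1) / ranks


if __name__ == "__main__":
    # Hypothetical result: 8 GB message, 0.4 s, 128 GPUs.
    print(f"{allreduce_bus_bandwidth_gbs(8e9, 0.4, 128):.2f} GB/s")
```

Comparing this figure against the per-GPU line rate of the advertised fabric gives a quick read on whether the cluster behaves as non-blocking at the size you plan to use.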

How long is the procurement cycle?

Self-serve on-demand is minutes for Lambda, RunPod, Vast.ai, and the hyperscaler short-form SKUs. Reserved-capacity contracts at tier-one neoclouds run 4 to 12 weeks of negotiation for a first contract, faster on renewals. Hyperscaler enterprise commitments run 6 to 16 weeks. Multi-year frontier contracts with custom power, custom networking, or custom data-centre build run 6 to 18 months. Procurement-cycle length is itself a procurement criterion; pick the lane whose cycle matches the project timeline.
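Matching cycle length to project timeline can be done with a lookup. The sketch below encodes the week ranges from the answer above and filters on the worst case. Converting 6 to 18 months into 26 to 78 weeks and planning against the upper bound are both assumptions of this example, and the route labels are shorthand.

```python
# (best-case weeks, worst-case weeks) per procurement route, from the text.
CYCLE_WEEKS = {
    "self-serve on-demand": (0, 0),                 # minutes
    "tier-one neocloud reserved": (4, 12),
    "hyperscaler enterprise commitment": (6, 16),
    "multi-year frontier contract": (26, 78),       # 6 to 18 months
}


def routes_that_fit(weeks_until_start: int) -> list[str]:
    """Routes whose worst-case cycle still closes before the start date."""
    return [name for name, (_, worst) in CYCLE_WEEKS.items()
            if worst <= weeks_until_start]


if __name__ == "__main__":
    print(routes_that_fit(12))
```

A project that has to start in twelve weeks can count on self-serve capacity or a first tier-one neocloud reservation; a hyperscaler enterprise commitment only fits if it closes faster than its worst case.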
