[HOPPER]

NVIDIA H200

Hopper, refreshed — 76% more memory than H100.

The NVIDIA H200 is the memory-upgraded refresh of the Hopper architecture, with 141 GB HBM3e and 4.8 TB/s bandwidth. It's the practical sweet spot for production LLM inference and 70B-class fine-tuning where context length and KV cache dominate.

VRAM
141 GB HBM3e
FP8 TFLOPS
1,979
Mem BW
4.8 TB/s
TDP
700 W

Best for

  • Long-context LLM inference
  • 70B–405B fine-tuning
  • RAG + agent serving at scale

Benchmarks

WORKLOADMETRICVALUE
Llama-3 70B inference (FP8)tokens/sec~2,400
Mixtral 8x22B inferencetokens/sec~1,900
FP8 dense matmulTFLOPS1,979

NVIDIA H200 availability from $32.60/hr

NVIDIA Hopper

8x NVIDIA H200

Memory1128GB HBM
RegionUS-East-1
Availability71%
ProviderAWS
$32.60/hr
Deploy →
Definition

What is Servers.Computer?

Servers.Computer is an AI compute routing and procurement layer that benchmarks, compares, and deploys GPU clusters (NVIDIA H100, H200, B200 and AMD MI300) across global cloud providers in real time.