[HOPPER]
NVIDIA H200
Hopper, refreshed — 76% more memory than H100.
The NVIDIA H200 is the memory-upgraded refresh of the Hopper architecture, with 141 GB of HBM3e and 4.8 TB/s of memory bandwidth. It is a practical fit for production LLM inference and 70B-class fine-tuning, where long contexts make KV-cache size and memory bandwidth the limiting factors.
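To make the memory claim concrete, here is a rough sizing sketch. The 80 GB H100 capacity and the Llama-3 70B shape (80 layers, 8 KV heads, head dimension 128) are assumptions taken from public specs and model configs, not from this page. The 16-bit cache and the choice to ignore weights and activations are simplifications, so treat the result as an order-of-magnitude estimate.

```python
# Rough KV-cache sizing sketch. Model-shape numbers are assumptions
# (public Llama-3 70B config), not figures from this page.

H200_GB = 141  # from the spec on this page
H100_GB = 80   # assumption: H100 SXM capacity


def kv_bytes_per_token(layers: int, kv_heads: int, head_dim: int,
                       bytes_per_elem: int) -> int:
    """Bytes of keys plus values stored per token across all layers."""
    return 2 * layers * kv_heads * head_dim * bytes_per_elem


# Relative memory increase over H100, in percent
extra_memory_pct = (H200_GB / H100_GB - 1) * 100

# Assumed shape: 80 layers, 8 KV heads (GQA), head_dim 128, 16-bit cache
per_token = kv_bytes_per_token(layers=80, kv_heads=8, head_dim=128,
                               bytes_per_elem=2)

# Tokens of KV cache that fit in the extra capacity (decimal GB),
# ignoring weights, activations, and allocator overhead
extra_tokens = (H200_GB - H100_GB) * 10**9 // per_token

print(f"{extra_memory_pct:.0f}% more memory than H100")
print(f"{per_token / 1024:.0f} KiB of KV cache per token")
print(f"~{extra_tokens:,} extra cached tokens per GPU")
```

Under these assumptions the extra capacity is worth on the order of a couple hundred thousand cached tokens per GPU, which is why the upgrade matters most for long-context and high-concurrency serving.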
| SPEC | VALUE |
|---|---|
| VRAM | 141 GB HBM3e |
| FP8 TFLOPS | 1,979 |
| Mem BW | 4.8 TB/s |
| TDP | 700 W |
Best for
- Long-context LLM inference
- 70B–405B fine-tuning
- RAG + agent serving at scale
Benchmarks
| WORKLOAD | METRIC | VALUE |
|---|---|---|
| Llama-3 70B inference (FP8) | tokens/sec | ~2,400 |
| Mixtral 8x22B inference | tokens/sec | ~1,900 |
| FP8 dense matmul | TFLOPS | 1,979 |
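One way to read the peak numbers together is a roofline-style estimate: dividing peak FP8 throughput by memory bandwidth gives the arithmetic intensity a kernel needs before it stops being bandwidth-bound. The sketch below uses the 1,979 TFLOPS and 4.8 TB/s figures from this page. The 70 GB weight footprint (70B parameters at one byte each in FP8) and the single-GPU, single-sequence framing are illustrative assumptions, not how the benchmark rows were measured.

```python
# Roofline-style sketch from the peak figures on this page.

PEAK_FP8_FLOPS = 1_979e12  # FP8 dense, FLOP/s
MEM_BW_BYTES = 4.8e12      # memory bandwidth, bytes/s

# Ridge point: FLOPs a kernel must perform per byte moved
# before compute, rather than bandwidth, becomes the limit
ridge_flop_per_byte = PEAK_FP8_FLOPS / MEM_BW_BYTES

# Single-sequence decode bound: each generated token streams
# every weight from memory once
WEIGHT_BYTES = 70e9  # assumption: 70B parameters x 1 byte (FP8)
max_decode_tok_s = MEM_BW_BYTES / WEIGHT_BYTES

print(f"ridge point: {ridge_flop_per_byte:.0f} FLOP/byte")
print(f"single-stream decode bound: {max_decode_tok_s:.1f} tokens/sec")
```

The single-stream bound is low because decode is bandwidth-bound at batch size 1. Batching amortizes the weight reads across many sequences, so aggregate throughput can be far higher than this per-sequence figure.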
NVIDIA H200 availability from $32.60/hr
NVIDIA Hopper
8x NVIDIA H200
- Memory: 1,128 GB HBM3e
- Region: US-East-1
- Availability: 71%
- Provider: AWS
$32.60/hr
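The listing's headline numbers follow from simple per-GPU arithmetic. A quick sketch, using only the figures on this page (8 GPUs, 141 GB each, $32.60/hr): it assumes the hourly price covers the whole 8-GPU node and that a month is 30 days of continuous use. The variable names are illustrative.

```python
# Per-GPU arithmetic for the 8x H200 listing.

GPUS = 8
GB_PER_GPU = 141
NODE_PRICE_PER_HR = 32.60  # assumption: price is for the full 8x node

total_memory_gb = GPUS * GB_PER_GPU            # aggregate HBM across the node
price_per_gpu_hr = NODE_PRICE_PER_HR / GPUS    # effective per-GPU hourly rate
monthly_cost = NODE_PRICE_PER_HR * 24 * 30     # 30 days, running continuously

print(f"total memory: {total_memory_gb:,} GB")
print(f"per-GPU rate: ${price_per_gpu_hr:.3f}/hr")
print(f"30-day cost:  ${monthly_cost:,.2f}")
```

Normalizing to a per-GPU hourly rate makes it easier to compare this listing against offers with a different GPU count.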
Definition
What is Servers.Computer?
Servers.Computer is an AI compute routing and procurement layer that benchmarks, compares, and deploys GPU clusters (NVIDIA H100, H200, B200 and AMD MI300) across global cloud providers in real time.