TCO Calculator Expert
Compare Cloud API, Cloud GPU Rental, and On-Premise deployments with real-time hardware validation and cost breakdown. Built for senior architects making production decisions.
How to use
Run the comparison in three passes
-
01
Lock the workload first.
Set queries/month + token mix on the Cloud API tab. The values auto-sync to Cloud GPU and On-Premise so every scenario uses the same demand curve.
-
02
Pick the model pair.
Choosing a provider automatically selects the equivalent open-source model and GPU requirements. Reverse sync buttons keep API ⇄ Cloud GPU ⇄ On-Prem aligned.
-
03
Validate TCO + VRAM.
Use the sticky bar for 3-year totals, then open the VRAM + power widgets to ensure the GPU count/quantization is production-safe before exporting the report.
Quick profiles
Jump-start with a persona checklist
LLM-based knowledge assistant, 500K queries/month.
- Stay on Claude 3.7 Sonnet ↔ Llama 3.3 70B FP8.
- Toggle Security filter on the architecture page for context.
- Compare Cloud GPU H100 commit vs dual H200 racks.
Streaming telemetry, 1.2M queries/month.
- Increase queries + concurrency, pick GPT-4o ↔ DeepSeek 67B.
- Use GPU auto-suggestion to size 4× H100 or MI300X.
- Record throughput widget values in the session notes.
80K queries/month, strict sovereignty.
- Select Air-Gapped scenario on the architecture page first.
- Switch API tab to Gemini 1.5 Pro ↔ Qwen 32B.
- Keep On-Prem GPU discount at 0 and bump opex multipliers.
Cloud API Configuration
📊 Workload
💰 Pricing
💵 Cost Summary
Cloud GPU Configuration
🤖 Model Selection
- Model weights: 70 GB
- KV cache: 18 GB
- Safety margin (20%): 18 GB
🖥️ Hardware
💾 Storage & Networking
📊 Workload
- Throughput: 480 tokens/sec
- Max queries/hour: 1,570
- GPU utilization: 87%
💵 Cost Summary
On-Premise Configuration
🤖 Model Selection
- Model weights: 70 GB
- KV cache: 18 GB
- Safety margin (20%): 18 GB
🖥️ Hardware Capex
⚡ Power & Cooling
Total TDP: 0W × PUE × hours
🔧 Operational Costs (Annual)
💵 Cost Summary
📈 3-Year TCO Comparison
| Metric | Cloud API | Cloud GPU | On-Premise |
|---|---|---|---|
| Monthly Cost | €0 | €0 | €0 |
| 3-Year TCO | €0 | €0 | €0 |
| Breakeven vs API | - | - | - |
| ROI over 3 years | - | - | - |