All Tools
VRAM Calculator
Calculate exact VRAM needed for any model, quantization, and context length.
GGUF Size Estimator
Coming soon. Estimate GGUF file size before downloading, by quant level.
Inference Time Estimator
Coming soon. Estimate response time from tokens/sec and output length.
Context Window Cost Calculator
Coming soon. Compare local vs API cost across context lengths and providers.
Chat Template Formatter
Coming soon. Wrap raw prompts in ChatML, Llama 3, Alpaca, or Qwen chat templates.
Token Counter
Coming soon. Count tokens across multiple tokenizers side by side.
Quant Format Picker
Coming soon. Get a quantization recommendation based on your VRAM and priorities.
System Prompt Token Budget
Coming soon. See how much of the context window your system prompt consumes.
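
The Inference Time Estimator entry above relates response time to tokens/sec and output length. A minimal sketch of that arithmetic might look like the following. The function name and the optional time_to_first_token parameter are assumptions made for illustration, not the tool's actual implementation.

```python
def estimate_response_seconds(
    output_tokens: int,
    tokens_per_sec: float,
    time_to_first_token: float = 0.0,
) -> float:
    """Rough response time: optional startup latency plus generation time.

    Hypothetical helper for illustration; the real tool may model more
    factors (prompt processing, batching, and so on).
    """
    if tokens_per_sec <= 0:
        raise ValueError("tokens_per_sec must be positive")
    if output_tokens < 0:
        raise ValueError("output_tokens must be non-negative")
    # Generation time is simply output length divided by throughput.
    return time_to_first_token + output_tokens / tokens_per_sec


if __name__ == "__main__":
    # 500 output tokens at 25 tokens/sec
    print(estimate_response_seconds(500, 25.0))
```

Under these assumptions, a 500-token reply at 25 tokens/sec takes about 20 seconds of generation time, before any prompt-processing delay is added.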