How does NumStack calculate AI token costs?

NumStack uses real published pricing from OpenAI, Anthropic, Google, and other AI providers. Enter your typical usage (input tokens, output tokens, requests per day) to see projected monthly costs across all models.
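
As a rough illustration of the projection described above, here is a minimal Python sketch. It assumes prices are quoted in dollars per million tokens and that a month has 30 days. The function names and the prices in `example_prices` are hypothetical placeholders, not NumStack's code or any provider's published rates.

```python
def monthly_cost(input_tokens, output_tokens, requests_per_day,
                 input_price_per_m, output_price_per_m, days_per_month=30):
    """Projected monthly cost in dollars for one model.

    Prices are dollars per one million tokens (an assumption of this sketch).
    """
    per_request = (input_tokens * input_price_per_m
                   + output_tokens * output_price_per_m) / 1_000_000
    return per_request * requests_per_day * days_per_month


# Placeholder (input, output) prices per million tokens; NOT real provider rates.
example_prices = {
    "small-model": (0.50, 2.00),
    "large-model": (3.00, 15.00),
}


def compare_models(prices, input_tokens, output_tokens, requests_per_day):
    """Return a {model name: projected monthly cost} mapping."""
    return {
        name: monthly_cost(input_tokens, output_tokens, requests_per_day,
                           input_price, output_price)
        for name, (input_price, output_price) in prices.items()
    }
```

Calling `compare_models(example_prices, 1000, 500, 1000)` would project both placeholder models for 1,000 input tokens, 500 output tokens, and 1,000 requests per day.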

Why do different AI models have different token prices?

Different models have different computational costs. Smaller models like GPT-4o mini or Haiku are cheaper but less capable. Larger models like GPT-4o or Claude Sonnet are more powerful but cost more per token.

How can I reduce my AI API costs?

Use smaller models for simple tasks, cache repeated prompts, trim output length (output tokens typically cost 3-5x more than input tokens), and use batch processing instead of real-time requests when latency allows.


Prompt Optimizer

Reduce token usage without losing quality. Paste your prompt, choose your model, and get 3 optimized variants instantly.

Why prompt optimization saves real money

Input tokens are often overlooked in AI cost calculations. Teams focus on output length but forget that long system prompts and verbose user instructions add up at scale. At 10,000 requests/day, even a 20% token reduction on a medium-sized prompt can save hundreds of dollars per month.
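
The arithmetic behind that estimate can be sketched as follows. The 2,000-token prompt and the $3.00 per million input tokens price are assumptions chosen for illustration; substitute your own prompt size and your model's actual rate.

```python
def monthly_savings(prompt_tokens, reduction, requests_per_day,
                    input_price_per_m, days_per_month=30):
    """Dollars saved per month by shrinking the prompt by `reduction` (0 to 1)."""
    tokens_saved = prompt_tokens * reduction * requests_per_day * days_per_month
    return tokens_saved / 1_000_000 * input_price_per_m


# Assumed inputs: 2,000-token prompt, 20% reduction, 10,000 requests/day,
# and a hypothetical price of $3.00 per million input tokens.
savings = monthly_savings(2000, 0.20, 10_000, 3.00)
print(f"Estimated savings: ${savings:,.2f} per month")
```

Under these assumptions the reduction removes 120 million input tokens per month, which comes to roughly $360.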

The Prompt Optimizer identifies the most common token-wasting patterns: filler phrases like "please can you", redundant qualifiers, duplicate sentences, and verbose vocabulary. It produces three variants so you can choose the right trade-off for your use case.

Three optimization strategies

🔵 Remove Redundancy: Strips filler openers, redundant qualifiers, and duplicate sentences. The safest option; it rarely changes meaning.
🟣 Restructure for Clarity: Reorders the prompt to lead with the core task and removes elaboration according to your quality setting. Better for long, multi-paragraph prompts.
🔵 Simplify Language: Replaces Latinate and multi-word phrases with direct alternatives, such as "utilize" → "use" and "in order to" → "to". Typically yields the highest reduction.
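
To make two of these strategies concrete, here is a minimal Python sketch of Remove Redundancy and Simplify Language. The pattern lists are small illustrative samples built from the examples on this page; the actual rules the Prompt Optimizer applies are not documented here, so treat every name below as hypothetical.

```python
import re

# Illustrative samples only; not the tool's real rule set.
FILLER_OPENERS = [r"please can you\s+", r"could you please\s+"]
SIMPLER_PHRASES = {"utilize": "use", "in order to": "to"}


def remove_redundancy(prompt):
    """Strip filler openers, then drop repeated sentences (first copy is kept)."""
    text = prompt
    for pattern in FILLER_OPENERS:
        text = re.sub(pattern, "", text, flags=re.IGNORECASE)
    seen = set()
    kept = []
    # Split after sentence-ending punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", text.strip()):
        key = sentence.lower()
        if key and key not in seen:
            seen.add(key)
            kept.append(sentence)
    return " ".join(kept)


def simplify_language(prompt):
    """Replace verbose words and phrases with shorter equivalents."""
    text = prompt
    for phrase, replacement in SIMPLER_PHRASES.items():
        text = re.sub(rf"\b{re.escape(phrase)}\b", replacement, text,
                      flags=re.IGNORECASE)
    return text
```

This sketch does not restore capitalization after a replacement at the start of a sentence, and it omits Restructure for Clarity, which depends on a quality setting not described in enough detail to sketch.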

How accurate is the token counter?

The live token counter uses an approximation of 4 characters per token, which matches GPT-family tokenization within ±5% for typical English prose. For exact counts, use the OpenAI Tokenizer playground. The approximation is accurate enough for cost estimates and optimization comparisons.
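
The 4-characters-per-token heuristic is simple enough to sketch in a few lines of Python. Rounding up is an assumption of this sketch; the page does not say how the live counter rounds, and the function names are hypothetical.

```python
import math


def estimate_tokens(text, chars_per_token=4):
    """Approximate token count using the characters-per-token heuristic."""
    return math.ceil(len(text) / chars_per_token)


def reduction_percent(original, optimized):
    """Estimated percentage of tokens saved by the optimized prompt."""
    before = estimate_tokens(original)
    if before == 0:
        return 0.0
    return 100 * (before - estimate_tokens(optimized)) / before
```

Because both prompts are measured with the same heuristic, the relative reduction is usually more trustworthy than either absolute count.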

🔌 Want to automate this? Use the NumStack API to optimize prompts programmatically: POST /api/calculators/prompt-optimizer
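
A minimal sketch of how such a request might be built in Python follows. Only the endpoint path comes from this page. The base URL, the JSON field names `prompt` and `model`, and the absence of authentication are all assumptions, so check the NumStack API documentation before relying on any of them.

```python
import json
import urllib.request

ENDPOINT_PATH = "/api/calculators/prompt-optimizer"  # path as given on this page


def build_optimize_request(base_url, prompt, model):
    """Build (but do not send) a POST request for the optimizer endpoint.

    The "prompt" and "model" field names are hypothetical; the real request
    schema is not documented on this page.
    """
    payload = json.dumps({"prompt": prompt, "model": model}).encode("utf-8")
    return urllib.request.Request(
        base_url.rstrip("/") + ENDPOINT_PATH,
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the request; the response format is likewise not described here.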
