Monthly LLM API cost estimate for the EGR / BOLD Rewards platform. Pricing as of May 2026 — sources:
Gemini ·
Claude ·
OpenAI.
Provider quick-set
Quick-set assigns a sensible default model for every scenario. You can still mix and match below.
Scale preset
Scale
%
Other
%
📄 OCR (svc-ocr) $0
💬 NLQ Analytics (admin chat) $0
🎓 AI Trainer $0
%
🎁 Content Recommendations $0
%
🔍 Embeddings (RAG) $0
M
Total monthly
$0/mo
Yearly
$0/yr
Per tenant
$0/mo
Per MAU
$0/mo
Provider head-to-head comparison
Same workload, same scale, every scenario routed to the cheapest reasonable model of each provider.
Breakdown by scenario and model
Scenario
Provider / Model
Requests / mo
Input tokens (M)
Output tokens (M)
Cost / mo ($)
% of total
Assumptions: Tier — Paid Standard (per-provider direct API; no Bedrock / Vertex partner uplift). Context caching at ~75% hit rate on system prompts & schemas — cache reads are ~10× cheaper than fresh input on all three providers. PDF / image input is counted as ~1300 tokens per page. Retry rate inflates all volumes proportionally. Prices reflect the public docs as of May 2026 — verify before quoting:
Gemini,
Claude,
OpenAI.
For embeddings, Anthropic doesn't ship a first-party model — the conventional pairing is Voyage AI (added in the dropdown).
Sensitivity: what drives cost the most
Driver
Current value
×2 → new total
Δ $ / mo
Δ %
If the given driver doubles (all else equal) — how much the monthly cost changes.