Major LLM API Pricing Comparison [July 2026]

We reference each provider's official pricing page directly and list only figures that could be re-verified from that page at the time of writing.
Models or figures we could not verify are omitted from the tables; please refer to the official pricing page instead.
All prices are in USD / 1M tokens. Use our Token Cost Calculator to estimate costs from your own token counts.

● Last updated: July 5, 2026 | Source retrieval: Anthropic 2026-07-03 JST / Google 2026-07-04 JST / OpenAI 2026-07-03 JST (screenshot fact) / DeepSeek 2026-07-05 JST | Anthropic · Google · OpenAI · DeepSeek

OpenAI

📋 Conditions: Standard processing rate for context length under 270K tokens

⚠️ The openai.com/developers pages could not be re-verified via web search due to bot protection. We instead reference a screenshot of the official page obtained by okamo (view screenshot).

OpenAI API pricing (USD / 1M tokens) — from developers.openai.com/api/docs/pricing
Model	Input	Cached Input	Output	Notes
GPT-5.5 New	$5.00	$0.50	$30.00	Flagship model. Standard rate for context length under 270K
GPT-5.4	$2.50	$0.25	$15.00	Balanced model. Standard rate for context length under 270K
GPT-5.4 mini	$0.75	$0.075	$4.50	Lightweight, low-cost tier. Standard rate for context length under 270K

For other models (GPT-5 nano, GPT-5.4 Pro, etc.), see the official pricing page.

Primary source: developers.openai.com/api/docs/pricing / Screenshot: captured 2026-07-03 (retrieved 2026-07-03 JST. Web-search re-verification was not possible due to bot protection; the screenshot-fact exception rule approved by okamo applies)

Anthropic (Claude)

📋 Conditions: Standard API (direct first-party API) / text input / standard rate without cache

Anthropic Claude API pricing (USD / 1M tokens) — from platform.claude.com/docs/en/about-claude/pricing
Model	Input (Base Input)	Cache Write 5m	Cache Hit	Output	Notes
Claude Fable 5 New	$10.00	$12.50	$1.00	$50.00	Released 2026-06-09. Uses a new tokenizer (see note below). Flat standard rate across the full 1M-token context (no long-context premium)
Claude Sonnet 5 New	$2.00 until 2026-08-31 $3.00 (from 2026-09-01)	$2.50 $3.75 (from 2026-09-01)	$0.20 $0.30 (from 2026-09-01)	$10.00 until 2026-08-31 $15.00 (from 2026-09-01)	Released 2026-06-30. Opus 4.7+ Opus models, Fable 5, Mythos 5, Mythos Preview and Sonnet 5 use a new tokenizer. The same text can yield roughly 1.0–1.35x the token count depending on content type
Claude Opus 4.8	$5.00	$6.25	$0.50	$25.00	Top-tier flagship model
Claude Sonnet 4.6	$3.00	$3.75	$0.30	$15.00	Balanced model. Flat rate across the full 1M context
Claude Haiku 4.5	$1.00	$1.25	$0.10	$5.00	For high-volume processing and classification tasks

For other models (Opus 4.7 and earlier, Haiku 3.5, Mythos 5, etc.), see the official pricing page.

Primary source: platform.claude.com/docs/en/about-claude/pricing (retrieved 2026-07-03 JST)
※ Fable 5 cache pricing matches Anthropic's published multipliers relative to base input (5m write 1.25x, cache hit 0.1x)

Official notes we could directly confirm

Batch API (Message Batches API) gives a 50% discount across all models
Cache write (5 min) = 1.25x standard input; cache write (1 hour) = 2x
Cache hit = 0.1x standard input (90% discount)
Fable 5, Mythos 5, Opus 4.8, Sonnet 5 and Sonnet 4.6 use a flat rate across the full 1M-token context (no long-context premium)
Opus 4.7+ Opus models, Fable 5, Mythos 5, Mythos Preview and Sonnet 5 use a new tokenizer. The same text can yield roughly 1.0–1.35x the token count depending on content type (per platform.claude.com/docs/en/about-claude/pricing and anthropic.com/news/claude-opus-4-7)
US-only data residency adds a 1.1x multiplier to standard pricing

Google (Gemini Developer API)

📋 Conditions: Gemini Developer API paid tier / Standard tier / text, image and video input / standard API

✅ Re-verified via web search against the current display on ai.google.dev/gemini-api/docs/pricing (Gemini 3.5 Flash GA, Gemini 3.1 Pro/Flash-Lite pricing). A screenshot obtained by okamo is also listed as a source (view screenshot).

Google Gemini API pricing (USD / 1M tokens) — from ai.google.dev/gemini-api/docs/pricing
Model	Input (text/image/video)	Input (prompt >200K)	Output	Output (prompt >200K)	Notes
Gemini 3.5 Flash New	$1.50	—	$9.00	—	GA since May 2026. Built for agentic/coding use cases. Context cache read $0.15
Gemini 3.1 Pro Preview Preview	$2.00	$4.00	$12.00	$18.00	Preview pricing, subject to change
Gemini 3.1 Flash-Lite	$0.25	—	$1.50	—	Audio input $0.50/1M. Cost-focused fast model
Gemini 2.5 Pro	$1.25	$2.50	$10.00	$15.00
Gemini 2.5 Flash	$0.30	—	$2.50	—	Audio input $1.00/1M
Gemini 2.5 Flash-Lite	$0.10	—	$0.40	—	Audio input $0.30/1M

For other models (Gemini 3 Flash, etc.), see the official pricing page.

Primary source: ai.google.dev/gemini-api/docs/pricing / Screenshot: captured 2026-07-03 (retrieved 2026-07-04 JST). Preview pricing, free-tier terms and per-modality rates may change — always check the official page before contracting.

Official notes we could directly confirm

Gemini 3.5 Flash reached GA in May 2026. Reported to outperform 3.1 Pro on some coding/agentic benchmarks
Gemini 3.1 Pro Preview is a preview model; pricing and specs may change
Batch API usage gives roughly a 50% discount (all models)
Context caching has separate read pricing per model (see official pricing page)
Audio input is priced higher than text/image/video input
On paid tiers, prompts are not used to improve Google's products (free tier prompts may be used)
Gemini 2.0 Flash-Lite was retired on June 1, 2026 (per official page)

DeepSeek

📋 Conditions: DeepSeek API standard pricing (current promotional pricing). Input rate differs between cache miss (first pass) and cache hit (reuse)

DeepSeek API pricing (USD / 1M tokens) — from api-docs.deepseek.com/quick_start/pricing
Model	Input (cache miss)	Input (cache hit)	Output	Notes
DeepSeek V4 Pro New	$0.435	$0.003625	$0.87	1M context, max output 384K. Current promotional pricing (standard pricing $1.74 / $0.0145 / $3.48)
DeepSeek V4 Flash	$0.14	$0.0028	$0.28	1M context, max output 384K. Lightweight, high-speed tier

For setup steps, real-world costs and a Claude Sonnet comparison in VS Code GitHub Copilot Chat, see our DeepSeek V4 Pro practical guide (Japanese).

Primary source: api-docs.deepseek.com/quick_start/pricing (re-verified via web search, retrieved 2026-07-05 JST)

🔍 Spotlight model

DeepSeek V4 Pro — Sonnet-tier benchmarks at a fraction of the price

Per the official pricing page (api-docs.deepseek.com/quick_start/pricing, retrieved 2026-07-05 JST), input is $0.435 / output is $0.87 per 1M tokens (current promotional pricing; standard pricing is $1.74 / $3.48). Our operator (okamoちゃんねる) has used it directly in VS Code GitHub Copilot Chat and reports that for text-centric coding tasks it feels close to Claude Sonnet in practice, at a much lower cost. It does not support Vision (image input), which is where Sonnet still has the edge. See our DeepSeek V4 Pro practical guide (Japanese) for setup steps and measured data, or use the Token Calculator to estimate cost differences vs. other models.

💡 Cost-Saving Basics

Three fundamental strategies to significantly reduce API costs with proper optimization

⚡ Batch API (asynchronous)

Up to 50% OFF

For tasks that don't need a real-time response (batch evaluation, data processing, summarization), the Batch API gives roughly 50% off across all models. Confirmed on official pages for both Anthropic and Google.

🗃️ Prompt caching

Up to 90% off cached input reads

Putting shared system prompts or documents into cache substantially discounts input cost on repeated calls. Anthropic offers up to 90% off (confirmed on official page). Especially effective for use cases with a long fixed prefix.

🔀 Model routing

Match model tier to task difficulty

Route simple classification/extraction tasks to lightweight models (Haiku 4.5, Flash-Lite, GPT-5.4 mini) and complex reasoning to flagship models to keep quality high at lower cost. Use the Token Calculator to estimate the cost difference.

⚠️ Disclaimer

Pricing on this page references each provider's official page directly and lists only figures we could re-verify (retrieved: Anthropic 2026-07-03 JST / Google 2026-07-04 JST / OpenAI 2026-07-03 JST via screenshot / DeepSeek 2026-07-05 JST).
Pricing may change without notice. Always check the official pricing page before contracting or production use.
Preview, limited-time pricing and beta features may change or be discontinued.
This site is for informational purposes only and does not endorse or represent any specific provider.