LLM Models

Compare pricing and specifications for large language models from all major providers.

All Models

OpenAI GPT-5.5

2026-04 • 512K context window

OpenAI's April 2026 flagship: GPT-5.5 ships materially better long-context reasoning than 5.4, a 512K default context window, doubled multimodal capability, and lower API pricing. Targets premium production workloads where 5.4 was leaving capability on the table.

Input: $4.0000 / 1K tokens

Output: $16.0000 / 1K tokens

View model details
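
The input and output rates on each card can be combined into a rough per-request estimate. The sketch below is illustrative only: `estimate_cost` and its `per_tokens` parameter are hypothetical names, not part of any provider's SDK. It takes the rates exactly as listed on this page (per 1K tokens); if a provider quotes per 1M tokens, pass `per_tokens=1_000_000` instead.

```python
from decimal import Decimal


def estimate_cost(input_tokens, output_tokens, input_rate, output_rate, per_tokens=1000):
    """Estimate the USD cost of one request from listed token rates.

    input_rate / output_rate are the listed prices per `per_tokens` tokens.
    Rates are passed as strings so Decimal keeps them exact.
    """
    unit = Decimal(per_tokens)
    cost_in = Decimal(input_tokens) / unit * Decimal(str(input_rate))
    cost_out = Decimal(output_tokens) / unit * Decimal(str(output_rate))
    return cost_in + cost_out


# Rates copied from the GPT-5.5 card above; the token counts are made up.
cost = estimate_cost(2000, 500, "4.0000", "16.0000")
print(f"Estimated cost: ${cost:.4f}")
```

Running the same function over two cards with identical token counts gives a like-for-like comparison, provided both rates use the same unit.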

Anthropic Claude Opus 4.7

2026-04 • 1M context window

Anthropic's April 2026 flagship. Drops the long-context premium SKU — 1M token context is the default tier. ~30% lower median latency than 4.6 on long-context requests, materially better long-context retrieval (96.4% at 1M vs 91% for 4.6), and adds a `thinking_budget_tokens` parameter for explicit cost control on extended reasoning. Released alongside the easing of the March 2026 peak-hour Pro/Max throttle.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

GPT-5.4

2026-03 • 256K context window

OpenAI's most advanced model with enhanced reasoning, longer context, and multimodal capabilities. Top of the line for complex tasks.

Input: $25.0000 / 1K tokens

Output: $100.0000 / 1K tokens

View model details

GPT-5.4 Mini

2026-03 • 128K context window

Affordable version of GPT-5.4 with strong performance for everyday tasks at a fraction of the cost.

Input: $3.0000 / 1K tokens

Output: $12.0000 / 1K tokens

View model details

Gemini 3.1 Pro

2026-03 • 2M context window

Google's latest Gemini 3.1 Pro with 2M context window, improved code understanding, and enhanced factual accuracy.

Input: $4.0000 / 1K tokens

Output: $16.0000 / 1K tokens

View model details

Gemini 3.1 Flash

2026-03 • 1M context window

Fast and affordable Gemini 3.1 Flash optimized for high-throughput applications at minimal cost.

Input: $0.3500 / 1K tokens

Output: $1.4000 / 1K tokens

View model details

xAI Grok 3.5 (256k)

2026-03 • 256K context window

xAI's latest flagship model with expanded context, improved reasoning, and deeper integration with real-time data from the X platform.

Input: $5.0000 / 1K tokens

Output: $25.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.6 (1M)

2026-02 • 1M context window

Anthropic's Claude Opus 4.6 with extended 1M token context window for processing entire codebases, books, and massive datasets in a single prompt.

Input: $18.0000 / 1K tokens

Output: $90.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.6 (200k)

2026-02 • 200K context window

Anthropic's latest flagship model with best-in-class reasoning, coding, and agentic capabilities. Supports extended thinking for complex multi-step problems.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4.6 (200k)

2026-02 • 200K context window

Anthropic's balanced mid-tier model offering strong intelligence, speed, and cost-effectiveness. Excellent for everyday coding and enterprise tasks.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

DeepSeek V4 (128k)

2026-02 • 128K context window

DeepSeek's fourth-generation open-weights model with state-of-the-art reasoning at remarkably low cost. Strong performance on coding and math benchmarks.

Input: $0.5000 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

OpenAI o3 Deep Research (200k)

2026-01 • 200K context window

An o3-family model optimized for deep research tasks. Autonomously browses the web, synthesizes information, and produces comprehensive research reports.

Input: $20.0000 / 1K tokens

Output: $80.0000 / 1K tokens

View model details

Mistral Large 3 (128k)

2026-01 • 128K context window

Mistral's latest flagship model with multilingual excellence, strong coding, and enterprise-grade function calling. Open-weight model.

Input: $2.0000 / 1K tokens

Output: $6.0000 / 1K tokens

View model details

Alibaba Qwen 3 (128k)

2026-01 • 128K context window

Alibaba's latest Qwen 3 flagship model with strong multilingual capabilities and improved reasoning. Competitive with frontier models at lower cost.

Input: $0.8000 / 1K tokens

Output: $3.2000 / 1K tokens

View model details

Midjourney v7

2026-01 • N/A context window

Professional AI image generation with photorealistic quality and artistic control. Subscription-based: Basic $10/mo, Standard $30/mo, Pro $60/mo.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-5.2 Pro (400k)

2025-12 • 400K context window

Higher-compute variant of GPT-5.2 for harder problems (Responses API only).

Input: $21.0000 / 1K tokens

Output: $168.0000 / 1K tokens

View model details

Google Gemini 3 Flash Preview (1M)

2025-12 • 1M context window

Gemini 3 Flash Preview on Vertex AI.

Input: $0.5000 / 1K tokens

Output: $3.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.5 (200k)

2025-11 • 200K context window

Claude 4.5 flagship Opus model for long-horizon coding and agentic workflows.

Input: $5.0000 / 1K tokens

Output: $25.0000 / 1K tokens

View model details

Google Gemini 3 Pro Preview (1M)

2025-11 • 1M context window

Gemini 3 Pro Preview on Vertex AI.

Input: $2.0000 / 1K tokens

Output: $12.0000 / 1K tokens

View model details

Anthropic Claude Haiku 4.5 (200k)

2025-10 • 200K context window

Anthropic's fastest 4.5-series model: cost-effective for high-volume workloads, with extended thinking and strong tool use.

Input: $1.0000 / 1K tokens

Output: $5.0000 / 1K tokens

View model details

Cognition SWE-1.5 (Windsurf in-house)

2025-10 • Unknown context window

Windsurf/Cognition in-house frontier model for agentic coding. Consumed via Windsurf prompt credits (not USD per-token).

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude Haiku 4.5 (200k)

2025-10 • 200K context window

Anthropic's fastest and most cost-effective 4.5-series model, optimized for high-volume workloads with strong tool use capabilities.

Input: $0.8000 / 1K tokens

Output: $4.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4.5 (200k)

2025-09-30 • 200K context window

Anthropic's latest and most advanced Sonnet model released September 30, 2025. Features dramatically improved coding capabilities, enhanced reasoning, and better long-context performance. Outperforms Claude 4 Sonnet on all major benchmarks while maintaining cost efficiency.

Input: $3.5000 / 1K tokens

Output: $17.5000 / 1K tokens

View model details

OpenAI GPT-5-Codex (400k)

2025-09 • 400K context window

A version of GPT-5 optimized for agentic coding in Codex. Default in Codex cloud & reviews; also usable via API key. Priced the same as GPT-5.

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4.5 (200k/1M beta)

2025-09 • 1M context window

Anthropic's most intelligent model for agents and coding, with extended thinking and state-of-the-art performance on SWE-bench Verified.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Google Gemini 3.0 Ultra (2M)

2025-09 • 2M context window

Google's next-generation flagship model with breakthrough multimodal capabilities and 2M token context window. Features advanced reasoning, native code execution, and real-time multimodal understanding across text, image, audio, and video.

Input: $0.2000 / 1K tokens

Output: $0.8000 / 1K tokens

View model details

Google Gemini 3.0 (2M)

2025-09 • 2M context window

Next-generation Gemini with advanced multimodal reasoning and long context. Rates pending official pricing page.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.1 (200k)

2025-08 • 200K context window

Opus 4.1 (200k): higher-cost flagship Opus tier. The newer Opus 4.5 provides a more accessible price point.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

OpenAI GPT-5 Pro (400k)

2025-08 • 400K context window

Version of GPT-5 that produces smarter and more precise responses. Responses API only; higher max output than standard GPT-5.

Input: $15.0000 / 1K tokens

Output: $120.0000 / 1K tokens

View model details

OpenAI GPT-5 (400k)

2025-08 • 400K context window

Previous flagship GPT model for coding, reasoning, and agentic tasks. OpenAI recommends GPT-5.1/5.2 for newest improvements.

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

OpenAI GPT-5 Mini (400k)

2025-08 • 400K context window

A faster, lower-cost GPT-5 for well-defined tasks. Text & vision with long context at a fraction of the price.

Input: $0.2500 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

OpenAI GPT-5 Low (1M)

2025-08 • 1M context window

Alias tier aligned with GPT-5 Mini pricing; use when your workflow targets the "low" cost tier.

Input: $0.2500 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

OpenAI GPT-5 Medium (1M)

2025-08 • 1M context window

Mid-tier GPT-5 variant balancing performance and cost for general-purpose workloads.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-5 High (1M)

2025-08 • 1M context window

High-tier GPT-5 with enhanced reasoning, reliability and coding for mission-critical workloads.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Moonshot Kimi K2 Base (128k)

2025-07 • 128K context window

A 1T parameter open-weight Mixture-of-Experts (MoE) model with 32B active parameters. This is the unaligned, pre-trained base model, suitable for further fine-tuning.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Moonshot Kimi K2 Instruct (128k)

2025-07 • 128K context window

The instruction-tuned version of Kimi K2, optimized for chat, agentic tasks, and tool use. Aligned with RLHF for helpful and safe responses.

Input: $0.1500 / 1K tokens

Output: $2.5000 / 1K tokens

View model details

Alibaba Qwen3 Coder Flash (1M)

2025-07 • 1M context window

A 30B parameter model from the Qwen3 series, excelling in coding and agentic tasks with a 1M token context length.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Zhipu GLM-4.5 (200k)

2025-07 • 200K context window

GLM-4.5 agentic foundation model. Official pricing is published in RMB on BigModel; USD prices vary by provider.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Alibaba Qwen3 Coder (1M)

2025-07 • 1M context window

Qwen3 coding-specialized model with long-context capabilities and strong tool-use.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude Opus 4 (200k)

2025-05 • 200K context window

Anthropic's most powerful model, excelling in coding, advanced reasoning, and AI agent workflows. Handles complex, long-running tasks.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4 (200k)

2025-05 • 200K context window

Anthropic's highly capable and versatile model, offering a strong balance of intelligence, speed, and cost-effectiveness for enterprise applications.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Google Gemini 2.5 Pro (1M)

2025-05 • 1M context window

Google's most advanced reasoning Gemini model, capable of solving complex problems. Supports text, code, image, audio, and video inputs. Features a 1M token context window (up to 2M in some versions).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Mistral Medium 3 (128k)

2025-05 • 128K context window

Mistral AI's frontier-class multimodal model balancing SOTA performance, lower cost, and simpler deployability for enterprise usage. Excels in coding and multimodal understanding.

Input: $0.4000 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

Mistral Devstral Small (128k)

2025-05 • 128K context window

A 24B open-source text model from Mistral AI that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Google Gemini 2.5 Flash (1M)

2025-05 • 1M context window

Google's best model for price and performance (as of May 2025), featuring hybrid reasoning capabilities. Supports text, code, image, audio, and video inputs. 1M token context window.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

OpenAI Codex Mini (200k)

2025-05 • 200K context window

OpenAI's fast coding model designed for the Codex coding agent. Optimized for rapid code generation, editing, and review within development workflows.

Input: $1.5000 / 1K tokens

Output: $6.0000 / 1K tokens

View model details

Pika 2

2025-05 • N/A context window

Fast and creative video generation with emphasis on artistic styles and special effects. Free tier available with paid options.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI o3 (200k)

2025-04 • 200K context window

OpenAI reasoning model for complex tasks (text+image input, text output). Succeeded by GPT-5.x for many agentic workloads.

Input: $2.0000 / 1K tokens

Output: $8.0000 / 1K tokens

View model details

OpenAI o4-mini (200k)

2025-04 • 200K context window

A faster, cost-efficient reasoning model, successor to o3-mini, released in April 2025. Offers strong performance on math, coding, and vision. Can process text and images, and features autonomous tool use.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

View model details

Alibaba Qwen3 235B MoE (128k)

2025-04 • 128K context window

Alibaba's flagship Qwen3 Mixture-of-Experts model with 235B total parameters (22B active). Features hybrid reasoning and supports 119 languages. (Note: Not publicly available at release).

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Alibaba Qwen3 30B MoE (128k)

2025-04 • 128K context window

Alibaba's Qwen3 Mixture-of-Experts model with 30B total parameters (3B active). Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Alibaba Qwen3 32B Dense (128k)

2025-04 • 128K context window

Alibaba's largest dense model in the Qwen3 family with 32B parameters. Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-4.1 (400k)

2025-04 • 400K context window

Smartest non-reasoning GPT model (text+image in, text out).

Input: $2.0000 / 1K tokens

Output: $8.0000 / 1K tokens

View model details

OpenAI GPT-4.1 mini (Unknown context)

2025-04 • Unknown context window

Smaller, faster GPT-4.1 tier with low cost.

Input: $0.4000 / 1K tokens

Output: $1.6000 / 1K tokens

View model details

OpenAI GPT-4.1 nano (Unknown context)

2025-04 • Unknown context window

Fastest, cheapest GPT-4.1 tier.

Input: $0.1000 / 1K tokens

Output: $0.4000 / 1K tokens

View model details

Google Gemini 2.5 Flash Preview (1M)

2025-04 • 1M context window

Preview of Google's Gemini 2.5 Flash with hybrid reasoning capabilities. Ultra-fast and cost-efficient for high-volume applications.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 4 Maverick (1M)

2025-04 • 1M context window

Meta's large Llama 4 Mixture-of-Experts model with 400B total parameters (17B active parameters, 128 experts). Natively multimodal with a 1M token context window. Open source.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Meta Llama 4 Scout (10M)

2025-04 • 10M context window

Meta's efficient Llama 4 model with an industry-leading 10M token context window. 109B total parameters (17B active parameters, 16 experts). Natively multimodal and open source.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Meta Llama 4 Maverick (1M)

2025-04 • 1M context window

Meta's Llama 4 Maverick with mixture-of-experts architecture, 1M context window, and strong multilingual support. Open-source model.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 4 Scout (10M)

2025-04 • 10M context window

Meta's Llama 4 Scout with an industry-leading 10M token context window and a 16-expert MoE architecture. Optimized for efficiency.

Input: $0.1000 / 1K tokens

Output: $0.3000 / 1K tokens

View model details

Mistral Small 3.1 (128k)

2025-03 • 128K context window

A new leader in the small models category by Mistral AI, with image understanding capabilities and an extended 128k context length. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-4o with Search (128k)

2025-03 • 128K context window

GPT-4o variant with built-in web search grounding. Provides up-to-date, cited answers by searching the web in real time.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Google Gemini 2.5 Pro Preview (1M)

2025-03 • 1M context window

Early preview of Google's Gemini 2.5 Pro thinking model. Excels at reasoning, coding, and multimodal tasks with a 1M token context window.

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Cohere Command A (256k)

2025-03 • 256K context window

Cohere's next-generation enterprise model with an expanded 256K context window. Optimized for agentic RAG, tool use, and structured outputs.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Google Gemma 3 27B (128k)

2025-03 • 128K context window

Google's open-source 27B parameter model from the Gemma 3 family. Natively multimodal with strong performance on text, image, and video tasks. Free to use under open license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude 3.7 Sonnet (Deprecated, 200k)

2025-02 • 200K context window

Hybrid reasoning Claude 3.x model (extended thinking). Deprecated; recommended replacement is Claude Sonnet 4.5.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

OpenAI o3-mini (200k)

2025-02 • 200K context window

A faster, more cost-effective version of o3, released in January 2025. Offers strong reasoning, coding, and vision capabilities. Optimized for math and coding tasks.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

View model details

Mistral Saba (32k)

2025-02 • 32K context window

A powerful and efficient model from Mistral AI for languages from the Middle East and South Asia.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

xAI Grok 3 (128k)

2025-02 • 128K context window

xAI's flagship large language model with strong reasoning, coding, and math capabilities. Trained on the Colossus supercluster.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

xAI Grok 3 Mini (128k)

2025-02 • 128K context window

xAI's lightweight reasoning model with think mode. Faster and more cost-efficient than Grok 3 while maintaining strong reasoning capabilities.

Input: $0.3000 / 1K tokens

Output: $0.5000 / 1K tokens

View model details

DeepSeek 3.1 (128k)

2025-01 • 128K context window

DeepSeek's enhanced model with improved reasoning capabilities, expanded context window, and even more competitive pricing.

Input: $0.1200 / 1K tokens

Output: $0.2400 / 1K tokens

View model details

Mistral Codestral 2 (256k)

2025-01 • 256K context window

Mistral AI's cutting-edge language model for coding (second version). Specializes in low-latency, high-frequency tasks like fill-in-the-middle (FIM), code correction, and test generation.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

DeepSeek R1 (128k)

2025-01 • 128K context window

DeepSeek's reasoning model trained with reinforcement learning. Excels at math, coding, and complex reasoning tasks with transparent chain-of-thought.

Input: $0.5500 / 1K tokens

Output: $2.1900 / 1K tokens

View model details

Google Gemini 2.0 Flash (1M)

2024-12 • 1M context window

Google's latest experimental model with breakthrough multimodal capabilities and enhanced reasoning at an extremely competitive price point.

Input: $0.0750 / 1K tokens

Output: $0.3000 / 1K tokens

View model details

Meta Llama 3.3 70B Instruct

2024-12 • 128K context window

Meta's latest 70B parameter model with improved performance and capabilities, offering state-of-the-art results for its size.

Input: $0.6000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

DeepSeek V3 (64k)

2024-12 • 64K context window

DeepSeek's latest model with strong performance across reasoning, coding, and general tasks at competitive pricing.

Input: $0.1400 / 1K tokens

Output: $0.2800 / 1K tokens

View model details

OpenAI o1 (200k)

2024-12 • 200K context window

Previous full o-series reasoning model (text+image in, text out).

Input: $15.0000 / 1K tokens

Output: $60.0000 / 1K tokens

View model details

OpenAI o1-pro (200k)

2024-12 • 200K context window

Higher-compute variant of o1 for better responses.

Input: $150.0000 / 1K tokens

Output: $600.0000 / 1K tokens

View model details

Microsoft Phi-4 (16k)

2024-12 • 16K context window

Microsoft's small language model with 14B parameters that punches well above its weight. Excels at STEM reasoning and coding despite its compact size. Open source under MIT license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Amazon Nova Pro (300k)

2024-12 • 300K context window

Amazon's highly capable multimodal model balancing accuracy, speed, and cost. Processes text, images, and video inputs for a wide range of enterprise tasks.

Input: $0.8000 / 1K tokens

Output: $3.2000 / 1K tokens

View model details

Amazon Nova Lite (300k)

2024-12 • 300K context window

Amazon's very low-cost multimodal model for high-volume tasks. Processes text, images, and video at extremely competitive pricing via Amazon Bedrock.

Input: $0.0600 / 1K tokens

Output: $0.2400 / 1K tokens

View model details

OpenAI Sora Turbo

2024-12 • N/A context window

Advanced text-to-video generation with realistic motion and scene understanding. Pricing per video generation varies by length and resolution.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude 3.5 Sonnet (200k)

2024-10 • 200K context window

Anthropic's most advanced model, significantly improved over Claude 3 Sonnet with enhanced reasoning, coding, and vision capabilities.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Anthropic Claude 3.5 Haiku (Deprecated, 200k)

2024-10 • 200K context window

Claude 3.5 Haiku snapshot. Deprecated; recommended replacement is Haiku 4.5.

Input: $0.8000 / 1K tokens

Output: $4.0000 / 1K tokens

View model details

OpenAI o1-preview (Deprecated, 128k)

2024-09 • 128K context window

Deprecated preview snapshot of OpenAI's first o-series reasoning model. Kept for backwards compatibility; prefer o1 for production.

Input: $15.0000 / 1K tokens

Output: $60.0000 / 1K tokens

View model details

OpenAI o1-mini (128k)

2024-09 • 128K context window

Smaller, faster o-series reasoning model. Deprecated in favor of newer reasoning models, but still supported in legacy workflows.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

View model details

Meta Llama 3.2 90B Vision Instruct

2024-09 • 128K context window

Meta's multimodal model combining text and vision capabilities with strong performance across various tasks.

Input: $1.2000 / 1K tokens

Output: $1.2000 / 1K tokens

View model details

Meta Llama 3.2 11B Vision Instruct

2024-09 • 128K context window

A smaller, efficient multimodal model from Meta with vision capabilities, suitable for edge deployment and cost-sensitive applications.

Input: $0.1800 / 1K tokens

Output: $0.1800 / 1K tokens

View model details

Mistral Small (32k)

2024-09 • 32K context window

Mistral AI's cost-effective model for straightforward tasks, offering good performance and efficiency.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Qwen 2.5 72B Instruct

2024-09 • 32K context window

Alibaba's large language model with strong performance in reasoning, coding, and multilingual tasks.

Input: $0.5600 / 1K tokens

Output: $0.5600 / 1K tokens

View model details

AI21 Jamba 1.5 Large (256k)

2024-08 • 256K context window

AI21's large hybrid SSM-Transformer model with a 256K context window. Uses a novel Jamba architecture combining Mamba SSM layers with Transformer attention for efficient long-context processing.

Input: $2.0000 / 1K tokens

Output: $8.0000 / 1K tokens

View model details

OpenAI GPT-4o mini (128k)

2024-07 • 128K context window

OpenAI's most affordable and fastest model in the GPT-4o family, designed for high-volume, low-latency tasks.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 3.1 405B Instruct

2024-07 • 128K context window

Meta's largest and most capable Llama 3.1 model, designed for complex reasoning, coding, and nuanced instruction following.

Input: $2.7000 / 1K tokens

Output: $2.7000 / 1K tokens

View model details

Meta Llama 3.1 70B Instruct

2024-07 • 128K context window

A large instruction-tuned model from Meta's Llama 3.1 series, offering a strong balance of performance and efficiency for a wide range of tasks.

Input: $0.6000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 3.1 8B Instruct

2024-07 • 128K context window

A highly efficient instruction-tuned model from Meta's Llama 3.1 series, suitable for fast, on-device, or edge applications.

Input: $0.0600 / 1K tokens

Output: $0.0600 / 1K tokens

View model details

Mistral Large 2 (128k)

2024-07 • 128K context window

Mistral AI's flagship model with enhanced reasoning, coding, and multilingual capabilities.

Input: $2.0000 / 1K tokens

Output: $6.0000 / 1K tokens

View model details

Runway Gen-3 Alpha

2024-06 • N/A context window

Professional video generation and editing AI with motion control and style consistency. Subscription-based pricing.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-4o (128k)

2024-05 • 128K context window

OpenAI's flagship multimodal model, natively processing text, audio, and images for faster, more capable interactions.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Google Gemini 1.5 Flash (1M)

2024-05 • 1M context window

Google's faster and lower-cost version of Gemini 1.5 Pro, optimized for high-volume, high-frequency tasks while retaining a large context window and multimodal capabilities.

Input: $0.0750 / 1K tokens

Output: $0.3000 / 1K tokens

View model details

Mistral Codestral (22B)

2024-05 • 32K context window

Mistral AI's open-weight generative model specialized for code generation, supporting 80+ languages.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

OpenAI GPT-4 Turbo (128k)

2024-04 • 128K context window

OpenAI's powerful model prior to GPT-4o, with a large context window and strong performance on complex tasks. Supports vision.

Input: $10.0000 / 1K tokens

Output: $30.0000 / 1K tokens

View model details

Cohere Command R+ (128k)

2024-04 • 128K context window

Cohere's most powerful model optimized for enterprise RAG and tool use. Excels at grounded generation with citations and multi-step tool workflows.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Anthropic Claude 3 Opus (200k)

2024-03 • 200K context window

Claude 3 Opus (deprecated): previous highest-intelligence Claude 3 model.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

Anthropic Claude 3 Sonnet (200k)

2024-03 • 200K context window

A balanced model from Anthropic, offering a blend of intelligence and speed, ideal for enterprise workloads and scaled AI deployments.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Anthropic Claude 3 Haiku (200k)

2024-03 • 200K context window

Anthropic's fastest and most compact model, designed for near-instant responsiveness and high throughput tasks.

Input: $0.2500 / 1K tokens

Output: $1.2500 / 1K tokens

View model details

Google Gemini 1.5 Pro (2M)

2024-02 • 2M context window

Google's highly capable multimodal model with a breakthrough long context window of up to 2 million tokens. Excels at complex reasoning, problem-solving, and understanding long-form content.

Input: $1.2500 / 1K tokens

Output: $5.0000 / 1K tokens

View model details

Leonardo AI

2023-11 • N/A context window

Game asset and creative image generation with consistent character and style creation. Credit-based pricing system.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI DALL-E 3

2023-10 • N/A context window

State-of-the-art text-to-image generation with improved prompt following and image quality. Pricing per image: HD 1024×1024 $0.040, Standard $0.020.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Stable Diffusion XL

2023-07 • N/A context window

Open-source image generation model with fine-tuning capabilities. API pricing varies by provider, self-hosting available.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-3.5 Turbo (16k)

2023-03 • 16K context window

OpenAI's fast, cost-effective model optimized for chat and simple tasks.

Input: $0.5000 / 1K tokens

Output: $1.5000 / 1K tokens

View model details

OpenAI GPT-5.2 (400k)

Unknown • 400K context window

OpenAI flagship model for coding and agentic tasks across industries.

Input: $1.7500 / 1K tokens

Output: $14.0000 / 1K tokens

View model details

OpenAI GPT-5.1 (400k)

Unknown • 400K context window

Flagship GPT model with configurable reasoning effort; predecessor to GPT-5.2.

Input: $1.7500 / 1K tokens

Output: $14.0000 / 1K tokens

View model details

OpenAI GPT-5 Nano (400k)

Unknown • 400K context window

Fastest, most cost-efficient GPT-5 variant.

Input: $0.0500 / 1K tokens

Output: $0.4000 / 1K tokens

View model details

OpenAI GPT-5.1-Codex (400k)

Unknown • 400K context window

GPT-5.1 variant optimized for agentic coding in Codex (Responses API only).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

OpenAI GPT-5.1-Codex-Max (400k)

Unknown • 400K context window

Most intelligent Codex model optimized for long-horizon agentic coding (Responses API only).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

OpenAI o3-pro (200k)

Unknown • 200K context window

Version of o3 with more compute for better responses (Responses API only).

Input: $20.0000 / 1K tokens

Output: $80.0000 / 1K tokens

View model details

OpenAI GPT-OSS 120B (Open-Weight)

Unknown • Unknown context window

Open-weight model entry as listed by OpenAI (see models page). Token costs depend on where you run it.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-OSS 20B (Open-Weight)

Unknown • Unknown context window

Open-weight model entry as listed by OpenAI (see models page). Token costs depend on where you run it.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details


Output: $90.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.6 (200k)

2026-02 • 200K context window

Anthropic's latest flagship model with best-in-class reasoning, coding, and agentic capabilities. Supports extended thinking for complex multi-step problems.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4.6 (200k)

2026-02 • 200K context window

Anthropic's balanced mid-tier model offering strong intelligence, speed, and cost-effectiveness. Excellent for everyday coding and enterprise tasks.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

DeepSeek V4 (128k)

2026-02 • 128K context window

DeepSeek's fourth-generation open-weights model with state-of-the-art reasoning at remarkably low cost. Strong performance on coding and math benchmarks.

Input: $0.5000 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

OpenAI o3 Deep Research (200k)

2026-01 • 200K context window

An o3-family model optimized for deep research tasks. Autonomously browses the web, synthesizes information, and produces comprehensive research reports.

Input: $20.0000 / 1K tokens

Output: $80.0000 / 1K tokens

View model details

Mistral Large 3 (128k)

2026-01 • 128K context window

Mistral's latest flagship model with multilingual excellence, strong coding, and enterprise-grade function calling. Open-weight model.

Input: $2.0000 / 1K tokens

Output: $6.0000 / 1K tokens

View model details

Alibaba Qwen 3 (128k)

2026-01 • 128K context window

Alibaba's latest Qwen 3 flagship model with strong multilingual capabilities and improved reasoning. Competitive with frontier models at lower cost.

Input: $0.8000 / 1K tokens

Output: $3.2000 / 1K tokens

View model details

Midjourney v7

2026-01 • N/A context window

Professional AI image generation with photorealistic quality and artistic control. Subscription-based: Basic $10/mo, Standard $30/mo, Pro $60/mo

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-5.2 Pro (400k)

2025-12 • 400K context window

Higher-compute variant of GPT-5.2 for harder problems (Responses API only).

Input: $21.0000 / 1K tokens

Output: $168.0000 / 1K tokens

View model details

Google Gemini 3 Flash Preview (1M)

2025-12 • 1M context window

Gemini 3 Flash Preview on Vertex AI.

Input: $0.5000 / 1K tokens

Output: $3.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.5 (200k)

2025-11 • 200K context window

Claude 4.5 flagship Opus model for long-horizon coding and agentic workflows.

Input: $5.0000 / 1K tokens

Output: $25.0000 / 1K tokens

View model details

Google Gemini 3 Pro Preview (1M)

2025-11 • 1M context window

Gemini 3 Pro Preview on Vertex AI.

Input: $2.0000 / 1K tokens

Output: $12.0000 / 1K tokens

View model details

Anthropic Claude Haiku 4.5 (200k)

2025-10 • 200K context window

Anthropic's fastest 4.5-series model�cost-effective for high-volume workloads with extended thinking and strong tool use.

Input: $1.0000 / 1K tokens

Output: $5.0000 / 1K tokens

View model details

Cognition SWE-1.5 (Windsurf in-house)

2025-10 • Unknown context window

Windsurf/Cognition in-house frontier model for agentic coding. Consumed via Windsurf prompt credits (not USD per-token).

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude Haiku 4.5 (200k)

2025-10 • 200K context window

Anthropic's fastest and most cost-effective 4.5-series model, optimized for high-volume workloads with strong tool use capabilities.

Input: $0.8000 / 1K tokens

Output: $4.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4.5 (200k)

2025-09-30 • 200K context window

Anthropic's latest and most advanced Sonnet model released September 30, 2025. Features dramatically improved coding capabilities, enhanced reasoning, and better long-context performance. Outperforms Claude 4 Sonnet on all major benchmarks while maintaining cost efficiency.

Input: $3.5000 / 1K tokens

Output: $17.5000 / 1K tokens

View model details

OpenAI GPT-5-Codex (400k)

2025-09 • 400K context window

A version of GPT-5 optimized for agentic coding in Codex. Default in Codex cloud & reviews; also usable via API key. Priced the same as GPT-5.

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4.5 (200k/1M beta)

2025-09 • 1M context window

Anthropic's most intelligent model for agents and coding, with extended thinking and state-of-the-art performance on SWE-bench Verified.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Google Gemini 3.0 Ultra (2M)

2025-09 • 2M context window

Google's next-generation flagship model with breakthrough multimodal capabilities and 2M token context window. Features advanced reasoning, native code execution, and real-time multimodal understanding across text, image, audio, and video.

Input: $0.2000 / 1K tokens

Output: $0.8000 / 1K tokens

View model details

Google Gemini 3.0 (2M)

2025-09 • 2M context window

Next-generation Gemini with advanced multimodal reasoning and long context. Rates pending official pricing page.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude Opus 4.1 (200k)

2025-08 • 200K context window

Opus 4.1 (200k) � higher-cost flagship Opus tier. Newer Opus 4.5 provides a more accessible price point.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

OpenAI GPT-5 Pro (400k)

2025-08 • 400K context window

Version of GPT-5 that produces smarter and more precise responses. Responses API only; higher max output than standard GPT-5.

Input: $15.0000 / 1K tokens

Output: $120.0000 / 1K tokens

View model details

OpenAI GPT-5 (400k)

2025-08 • 400K context window

Previous flagship GPT model for coding, reasoning, and agentic tasks. OpenAI recommends GPT-5.1/5.2 for newest improvements.

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

OpenAI GPT-5 Mini (400k)

2025-08 • 400K context window

A faster, lower-cost GPT-5 for well-defined tasks. Text & vision with long context at a fraction of the price.

Input: $0.2500 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

OpenAI GPT-5 Low (1M)

2025-08 • 1M context window

Alias tier aligned with GPT-5 Mini pricing; use when your workflow targets the "low" cost tier.

Input: $0.2500 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

OpenAI GPT-5 Medium (1M)

2025-08 • 1M context window

Mid-tier GPT-5 variant balancing performance and cost for general-purpose workloads.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-5 High (1M)

2025-08 • 1M context window

High-tier GPT-5 with enhanced reasoning, reliability and coding for mission-critical workloads.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Moonshot Kimi K2 Base (128k)

2025-07 • 128K context window

A 1T parameter open-weight Mixture-of-Experts (MoE) model with 32B active parameters. This is the unaligned, pre-trained base model, suitable for further fine-tuning.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Moonshot Kimi K2 Instruct (128k)

2025-07 • 128K context window

The instruction-tuned version of Kimi K2, optimized for chat, agentic tasks, and tool use. Aligned with RLHF for helpful and safe responses.

Input: $0.1500 / 1K tokens

Output: $2.5000 / 1K tokens

View model details

Alibaba Qwen3 Coder Flash (1M)

2025-07 • 1M context window

A 30B parameter model from the Qwen3 series, excelling in coding and agentic tasks with a 1M token context length.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Zhipu GLM-4.5 (200k)

2025-07 • 200K context window

GLM-4.5 agentic foundation model. Official pricing is published in RMB on BigModel; USD prices vary by provider.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Alibaba Qwen3 Coder (1M)

2025-07 • 1M context window

Qwen3 coding-specialized model with long-context capabilities and strong tool-use.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude Opus 4 (200k)

2025-05 • 200K context window

Anthropic's most powerful model, excelling in coding, advanced reasoning, and AI agent workflows. Handles complex, long-running tasks.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

Anthropic Claude Sonnet 4 (200k)

2025-05 • 200K context window

Anthropic's highly capable and versatile model, offering a strong balance of intelligence, speed, and cost-effectiveness for enterprise applications.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Google Gemini 2.5 Pro (1M)

2025-05 • 1M context window

Google's most advanced reasoning Gemini model, capable of solving complex problems. Supports text, code, image, audio, and video inputs. Features a 1M token context window (up to 2M in some versions).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Mistral Medium 3 (128k)

2025-05 • 128K context window

Mistral AI's frontier-class multimodal model balancing SOTA performance, lower cost, and simpler deployability for enterprise usage. Excels in coding and multimodal understanding.

Input: $0.4000 / 1K tokens

Output: $2.0000 / 1K tokens

View model details

Mistral Devstral Small (128k)

2025-05 • 128K context window

A 24B open-source text model from Mistral AI that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Google Gemini 2.5 Flash (1M)

2025-05 • 1M context window

Google's best model for price and performance (as of May 2025), featuring hybrid reasoning capabilities. Supports text, code, image, audio, and video inputs. 1M token context window.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

OpenAI Codex Mini (200k)

2025-05 • 200K context window

OpenAI's fast coding model designed for the Codex coding agent. Optimized for rapid code generation, editing, and review within development workflows.

Input: $1.5000 / 1K tokens

Output: $6.0000 / 1K tokens

View model details

Pika 2

2025-05 • N/A context window

Fast and creative video generation with emphasis on artistic styles and special effects. Free tier available with paid options.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI o3 (200k)

2025-04 • 200K context window

OpenAI reasoning model for complex tasks (text+image input, text output). Succeeded by GPT-5.x for many agentic workloads.

Input: $2.0000 / 1K tokens

Output: $8.0000 / 1K tokens

View model details

OpenAI o4-mini (200k)

2025-04 • 200K context window

A faster, cost-efficient reasoning model, successor to o3-mini, released in April 2025. Offers strong performance on math, coding, and vision. Can process text and images, and features autonomous tool use.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

View model details

Alibaba Qwen3 235B MoE (128k)

2025-04 • 128K context window

Alibaba's flagship Qwen3 Mixture-of-Experts model with 235B total parameters (22B active). Features hybrid reasoning and supports 119 languages. (Note: Not publicly available at release).

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Alibaba Qwen3 30B MoE (128k)

2025-04 • 128K context window

Alibaba's Qwen3 Mixture-of-Experts model with 30B total parameters (3B active). Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Alibaba Qwen3 32B Dense (128k)

2025-04 • 128K context window

Alibaba's largest dense model in the Qwen3 family with 32B parameters. Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-4.1 (400k)

2025-04 • 400K context window

Smartest non-reasoning GPT model (text+image in, text out).

Input: $2.0000 / 1K tokens

Output: $8.0000 / 1K tokens

View model details

OpenAI GPT-4.1 mini (Unknown context)

2025-04 • Unknown context window

Smaller, faster GPT-4.1 tier with low cost.

Input: $0.4000 / 1K tokens

Output: $1.6000 / 1K tokens

View model details

OpenAI GPT-4.1 nano (Unknown context)

2025-04 • Unknown context window

Fastest, cheapest GPT-4.1 tier.

Input: $0.1000 / 1K tokens

Output: $0.4000 / 1K tokens

View model details

Google Gemini 2.5 Flash Preview (1M)

2025-04 • 1M context window

Preview of Google's Gemini 2.5 Flash with hybrid reasoning capabilities. Ultra-fast and cost-efficient for high-volume applications.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 4 Maverick (1M)

2025-04 • 1M context window

Meta's large Llama 4 Mixture-of-Experts model with 400B total parameters (17B active per expert, 128 experts). Natively multimodal with a 1M token context window. Open source.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Meta Llama 4 Scout (10M)

2025-04 • 10M context window

Meta's efficient Llama 4 model with an industry-leading 10M token context window. 109B total parameters (17B active per expert, 16 experts). Natively multimodal and open source.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Meta Llama 4 Maverick (1M)

2025-04 • 1M context window

Meta's Llama 4 Maverick with mixture-of-experts architecture, 1M context window, and strong multilingual support. Open-source model.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 4 Scout (10M)

2025-04 • 10M context window

Meta's Llama 4 Scout with an industry-leading 10M token context window and 16 experts MoE architecture. Optimized for efficiency.

Input: $0.1000 / 1K tokens

Output: $0.3000 / 1K tokens

View model details

Mistral Small 3.1 (128k)

2025-03 • 128K context window

A new leader in the small models category by Mistral AI, with image understanding capabilities and an extended 128k context length. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-4o with Search (128k)

2025-03 • 128K context window

GPT-4o variant with built-in web search grounding. Provides up-to-date, cited answers by searching the web in real time.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Google Gemini 2.5 Pro Preview (1M)

2025-03 • 1M context window

Early preview of Google's Gemini 2.5 Pro thinking model. Excels at reasoning, coding, and multimodal tasks with a 1M token context window.

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Cohere Command A (256k)

2025-03 • 256K context window

Cohere's next-generation enterprise model with an expanded 256K context window. Optimized for agentic RAG, tool use, and structured outputs.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Google Gemma 3 27B (128k)

2025-03 • 128K context window

Google's open-source 27B parameter model from the Gemma 3 family. Natively multimodal with strong performance on text, image, and video tasks. Free to use under open license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude 3.7 Sonnet (Deprecated, 200k)

2025-02 • 200K context window

Hybrid reasoning Claude 3.x model (extended thinking). Deprecated; recommended replacement is Claude Sonnet 4.5.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

OpenAI o3-mini (200k)

2025-02 • 200K context window

A faster, more cost-effective version of o3, released in January 2025. Offers strong reasoning, coding, and vision capabilities. Optimized for math and coding tasks.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

View model details

Mistral Saba (32k)

2025-02 • 32K context window

A powerful and efficient model from Mistral AI for languages from the Middle East and South Asia.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

xAI Grok 3 (128k)

2025-02 • 128K context window

xAI's flagship large language model with strong reasoning, coding, and math capabilities. Trained on the Colossus supercluster.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

xAI Grok 3 Mini (128k)

2025-02 • 128K context window

xAI's lightweight reasoning model with think mode. Faster and more cost-efficient than Grok 3 while maintaining strong reasoning capabilities.

Input: $0.3000 / 1K tokens

Output: $0.5000 / 1K tokens

View model details

DeepSeek 3.1 (128k)

2025-01 • 128K context window

DeepSeek's enhanced model with improved reasoning capabilities, expanded context window, and even more competitive pricing.

Input: $0.1200 / 1K tokens

Output: $0.2400 / 1K tokens

View model details

Mistral Codestral 2 (256k)

2025-01 • 256K context window

Mistral AI's cutting-edge language model for coding (second version). Specializes in low-latency, high-frequency tasks like fill-in-the-middle (FIM), code correction, and test generation.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

DeepSeek R1 (128k)

2025-01 • 128K context window

DeepSeek's reasoning model trained with reinforcement learning. Excels at math, coding, and complex reasoning tasks with transparent chain-of-thought.

Input: $0.5500 / 1K tokens

Output: $2.1900 / 1K tokens

View model details

Google Gemini 2.0 Flash (1M)

2024-12 • 1M context window

Google's latest experimental model with breakthrough multimodal capabilities and enhanced reasoning at an extremely competitive price point.

Input: $0.0750 / 1K tokens

Output: $0.3000 / 1K tokens

View model details

Meta Llama 3.3 70B Instruct

2024-12 • 128K context window

Meta's latest 70B parameter model with improved performance and capabilities, offering state-of-the-art results for its size.

Input: $0.6000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

DeepSeek V3 (64k)

2024-12 • 64K context window

DeepSeek's latest model with strong performance across reasoning, coding, and general tasks at competitive pricing.

Input: $0.1400 / 1K tokens

Output: $0.2800 / 1K tokens

View model details

OpenAI o1 (200k)

2024-12 • 200K context window

Previous full o-series reasoning model (text+image in, text out).

Input: $15.0000 / 1K tokens

Output: $60.0000 / 1K tokens

View model details

OpenAI o1-pro (200k)

2024-12 • 200K context window

Higher-compute variant of o1 for better responses.

Input: $150.0000 / 1K tokens

Output: $600.0000 / 1K tokens

View model details

Microsoft Phi-4 (16k)

2024-12 • 16K context window

Microsoft's small language model with 14B parameters that punches well above its weight. Excels at STEM reasoning and coding despite its compact size. Open source under MIT license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Amazon Nova Pro (300k)

2024-12 • 300K context window

Amazon's highly capable multimodal model balancing accuracy, speed, and cost. Processes text, images, and video inputs for a wide range of enterprise tasks.

Input: $0.8000 / 1K tokens

Output: $3.2000 / 1K tokens

View model details

Amazon Nova Lite (300k)

2024-12 • 300K context window

Amazon's very low-cost multimodal model for high-volume tasks. Processes text, images, and video at extremely competitive pricing via Amazon Bedrock.

Input: $0.0600 / 1K tokens

Output: $0.2400 / 1K tokens

View model details

OpenAI Sora Turbo

2024-12 • N/A context window

Advanced text-to-video generation with realistic motion and scene understanding. Pricing per video generation varies by length and resolution.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Anthropic Claude 3.5 Sonnet (200k)

2024-10 • 200K context window

Anthropic's most advanced model, significantly improved over Claude 3 Sonnet with enhanced reasoning, coding, and vision capabilities.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Anthropic Claude 3.5 Haiku (Deprecated, 200k)

2024-10 • 200K context window

Claude 3.5 Haiku snapshot. Deprecated; recommended replacement is Haiku 4.5.

Input: $0.8000 / 1K tokens

Output: $4.0000 / 1K tokens

View model details

OpenAI o1-preview (Deprecated, 128k)

2024-09 • 128K context window

Deprecated preview snapshot of OpenAI's first o-series reasoning model. Kept for backwards compatibility; prefer o1 for production.

Input: $15.0000 / 1K tokens

Output: $60.0000 / 1K tokens

View model details

OpenAI o1-mini (128k)

2024-09 • 128K context window

Smaller, faster o-series reasoning model. Deprecated in favor of newer reasoning models, but still supported in legacy workflows.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

View model details

Meta Llama 3.2 90B Vision Instruct

2024-09 • 128K context window

Meta's multimodal model combining text and vision capabilities with strong performance across various tasks.

Input: $1.2000 / 1K tokens

Output: $1.2000 / 1K tokens

View model details

Meta Llama 3.2 11B Vision Instruct

2024-09 • 128K context window

A smaller, efficient multimodal model from Meta with vision capabilities, suitable for edge deployment and cost-sensitive applications.

Input: $0.1800 / 1K tokens

Output: $0.1800 / 1K tokens

View model details

Mistral Small (32k)

2024-09 • 32K context window

Mistral AI's cost-effective model for straightforward tasks, offering good performance and efficiency.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Qwen 2.5 72B Instruct

2024-09 • 32K context window

Alibaba's large language model with strong performance in reasoning, coding, and multilingual tasks.

Input: $0.5600 / 1K tokens

Output: $0.5600 / 1K tokens

View model details

AI21 Jamba 1.5 Large (256k)

2024-08 • 256K context window

AI21's large hybrid SSM-Transformer model with a 256K context window. Uses a novel Jamba architecture combining Mamba SSM layers with Transformer attention for efficient long-context processing.

Input: $2.0000 / 1K tokens

Output: $8.0000 / 1K tokens

View model details

OpenAI GPT-4o mini (128k)

2024-07 • 128K context window

OpenAI's most affordable and fastest model in the GPT-4o family, designed for high-volume, low-latency tasks.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 3.1 405B Instruct

2024-07 • 128K context window

Meta's largest and most capable Llama 3.1 model, designed for complex reasoning, coding, and nuanced instruction following.

Input: $2.7000 / 1K tokens

Output: $2.7000 / 1K tokens

View model details

Meta Llama 3.1 70B Instruct

2024-07 • 128K context window

A large instruction-tuned model from Meta's Llama 3.1 series, offering a strong balance of performance and efficiency for a wide range of tasks.

Input: $0.6000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

Meta Llama 3.1 8B Instruct

2024-07 • 128K context window

A highly efficient instruction-tuned model from Meta's Llama 3.1 series, suitable for fast, on-device, or edge applications.

Input: $0.0600 / 1K tokens

Output: $0.0600 / 1K tokens

View model details

Mistral Large 2 (128k)

2024-07 • 128K context window

Mistral AI's flagship model with enhanced reasoning, coding, and multilingual capabilities.

Input: $2.0000 / 1K tokens

Output: $6.0000 / 1K tokens

View model details

Runway Gen-3 Alpha

2024-06 • N/A context window

Professional video generation and editing AI with motion control and style consistency. Subscription-based pricing.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-4o (128k)

2024-05 • 128K context window

OpenAI's flagship multimodal model, natively processing text, audio, and images for faster, more capable interactions.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Google Gemini 1.5 Flash (1M)

2024-05 • 1M context window

Google's faster and lower-cost version of Gemini 1.5 Pro, optimized for high-volume, high-frequency tasks while retaining a large context window and multimodal capabilities.

Input: $0.0750 / 1K tokens

Output: $0.3000 / 1K tokens

View model details

Mistral Codestral (22B)

2024-05 • 32K context window

Mistral AI's open-weight generative model specialized for code generation, supporting 80+ languages.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

View model details

OpenAI GPT-4 Turbo (128k)

2024-04 • 128K context window

OpenAI's powerful model prior to GPT-4o, with a large context window and strong performance on complex tasks. Supports vision.

Input: $10.0000 / 1K tokens

Output: $30.0000 / 1K tokens

View model details

Cohere Command R+ (128k)

2024-04 • 128K context window

Cohere's most powerful model optimized for enterprise RAG and tool use. Excels at grounded generation with citations and multi-step tool workflows.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

Anthropic Claude 3 Opus (200k)

2024-03 • 200K context window

Claude 3 Opus (deprecated) � previous highest-intelligence Claude 3 model.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

View model details

Anthropic Claude 3 Sonnet (200k)

2024-03 • 200K context window

A balanced model from Anthropic, offering a blend of intelligence and speed, ideal for enterprise workloads and scaled AI deployments.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

View model details

Anthropic Claude 3 Haiku (200k)

2024-03 • 200K context window

Anthropic's fastest and most compact model, designed for near-instant responsiveness and high throughput tasks.

Input: $0.2500 / 1K tokens

Output: $1.2500 / 1K tokens

View model details

Google Gemini 1.5 Pro (2M)

2024-02 • 2M context window

Google's highly capable multimodal model with a breakthrough long context window of up to 2 million tokens. Excels at complex reasoning, problem-solving, and understanding long-form content.

Input: $1.2500 / 1K tokens

Output: $5.0000 / 1K tokens

View model details

Leonardo AI

2023-11 • N/A context window

Game asset and creative image generation with consistent character and style creation. Credit-based pricing system.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI DALL-E 3

2023-10 • N/A context window

State-of-the-art text-to-image generation with improved prompt following and image quality. Pricing per image: HD 1024×1024 $0.040, Standard $0.020

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

Stable Diffusion XL

2023-07 • N/A context window

Open-source image generation model with fine-tuning capabilities. API pricing varies by provider, self-hosting available.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-3.5 Turbo (16k)

2023-03 • 16K context window

OpenAI's fast, cost-effective model optimized for chat and simple tasks.

Input: $0.5000 / 1K tokens

Output: $1.5000 / 1K tokens

View model details

OpenAI GPT-5.2 (400k)

Unknown • 400K context window

OpenAI flagship model for coding and agentic tasks across industries.

Input: $1.7500 / 1K tokens

Output: $14.0000 / 1K tokens

View model details

OpenAI GPT-5.1 (400k)

Unknown • 400K context window

Flagship GPT model with configurable reasoning effort; predecessor to GPT-5.2.

Input: $1.7500 / 1K tokens

Output: $14.0000 / 1K tokens

View model details

OpenAI GPT-5 Nano (400k)

Unknown • 400K context window

Fastest, most cost-efficient GPT-5 variant.

Input: $0.0500 / 1K tokens

Output: $0.4000 / 1K tokens

View model details

OpenAI GPT-5.1-Codex (400k)

Unknown • 400K context window

GPT-5.1 variant optimized for agentic coding in Codex (Responses API only).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

OpenAI GPT-5.1-Codex-Max (400k)

Unknown • 400K context window

Most intelligent Codex model optimized for long-horizon agentic coding (Responses API only).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

View model details

OpenAI o3-pro (200k)

Unknown • 200K context window

Version of o3 with more compute for better responses (Responses API only).

Input: $20.0000 / 1K tokens

Output: $80.0000 / 1K tokens

View model details

OpenAI GPT-OSS 120B (Open-Weight)

Unknown • Unknown context window

Open-weight model entry as listed by OpenAI (see models page). Token costs depend on where you run it.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details

OpenAI GPT-OSS 20B (Open-Weight)

Unknown • Unknown context window

Open-weight model entry as listed by OpenAI (see models page). Token costs depend on where you run it.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

View model details