LLM Models

Compare pricing and specifications for large language models from all major providers.

All Models

OpenAI GPT-5.5

2026-04 • 512K context window

OpenAI's April 2026 flagship: GPT-5.5 ships materially better long-context reasoning than 5.4, a 512K default context window, doubled multimodal capability, and lower API pricing. Targets premium production workloads where 5.4 was leaving capability on the table.

Input: $4.0000 / 1M tokens
Output: $16.0000 / 1M tokens
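
To see what these two rates mean for a single call, the arithmetic can be sketched as below. This is a minimal illustration, not provider code: `Rate`, `request_cost`, and the token counts are hypothetical, and the sketch assumes rates are quoted per one million tokens, which is how OpenAI publishes API prices. Change `per_tokens` if a rate is quoted for a different block size.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rate:
    """Listed USD prices for a fixed block of tokens (hypothetical helper)."""
    input_usd: float
    output_usd: float
    per_tokens: int = 1_000_000  # assumption: rates are quoted per 1M tokens


def request_cost(rate: Rate, input_tokens: int, output_tokens: int) -> float:
    """Return the USD cost of one request at the listed rates."""
    input_cost = input_tokens / rate.per_tokens * rate.input_usd
    output_cost = output_tokens / rate.per_tokens * rate.output_usd
    return input_cost + output_cost


# Rates copied from the GPT-5.5 entry above.
gpt_5_5 = Rate(input_usd=4.0, output_usd=16.0)

# Hypothetical request: 200,000 prompt tokens and 50,000 completion tokens.
cost = request_cost(gpt_5_5, input_tokens=200_000, output_tokens=50_000)
print(f"${cost:.2f}")  # prints $1.60
```

Input and output contribute equally here ($0.80 each) even though the prompt is four times longer, because the output rate is four times the input rate.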

Anthropic Claude Opus 4.7

2026-04 • 1M context window

Anthropic's April 2026 flagship. Drops the long-context premium SKU: 1M-token context is now the default tier. Delivers ~30% lower median latency than 4.6 on long-context requests and materially better long-context retrieval (96.4% at 1M vs 91% for 4.6), and adds a `thinking_budget_tokens` parameter for explicit cost control on extended reasoning. Released alongside the easing of the March 2026 peak-hour Pro/Max throttle.

Input: $15.0000 / 1M tokens
Output: $75.0000 / 1M tokens
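
The `thinking_budget_tokens` parameter described above caps reasoning tokens, so it also caps reasoning spend. The sketch below computes that ceiling. It assumes, without confirmation from this listing, that reasoning tokens are billed at the output rate and that rates are quoted per 1M tokens; `max_thinking_cost` and the 32,000-token budget are hypothetical.

```python
def max_thinking_cost(budget_tokens: int, output_usd: float,
                      per_tokens: int = 1_000_000) -> float:
    """Upper bound on reasoning spend for one request.

    Assumption: reasoning tokens are billed at the output rate.
    """
    if budget_tokens < 0:
        raise ValueError("budget_tokens must be non-negative")
    return budget_tokens / per_tokens * output_usd


# Output rate copied from the Claude Opus 4.7 entry above.
cap = max_thinking_cost(budget_tokens=32_000, output_usd=75.0)
print(f"${cap:.2f}")  # prints $2.40
```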

GPT-5.4

2026-03 • 256K context window

OpenAI's March 2026 flagship, since succeeded by GPT-5.5, with enhanced reasoning, longer context, and multimodal capabilities. Built for complex tasks.

Input: $25.0000 / 1M tokens
Output: $100.0000 / 1M tokens

GPT-5.4 Mini

2026-03 • 128K context window

Affordable version of GPT-5.4 with strong performance for everyday tasks at a fraction of the cost.

Input: $3.0000 / 1M tokens
Output: $12.0000 / 1M tokens

Gemini 3.1 Pro

2026-03 • 2M context window

Google's latest Gemini 3.1 Pro with 2M context window, improved code understanding, and enhanced factual accuracy.

Input: $4.0000 / 1M tokens
Output: $16.0000 / 1M tokens

Gemini 3.1 Flash

2026-03 • 1M context window

Fast and affordable Gemini 3.1 Flash optimized for high-throughput applications at minimal cost.

Input: $0.3500 / 1M tokens
Output: $1.4000 / 1M tokens

xAI Grok 3.5 (256k)

2026-03 • 256K context window

xAI's latest flagship model with expanded context, improved reasoning, and deeper integration with real-time data from X platform.

Input: $5.0000 / 1M tokens
Output: $25.0000 / 1M tokens

Anthropic Claude Opus 4.6 (1M)

2026-02 • 1M context window

Anthropic's Claude Opus 4.6 with extended 1M token context window for processing entire codebases, books, and massive datasets in a single prompt.

Input: $18.0000 / 1M tokens
Output: $90.0000 / 1M tokens

Anthropic Claude Opus 4.6 (200k)

2026-02 • 200K context window

Anthropic's latest flagship model with best-in-class reasoning, coding, and agentic capabilities. Supports extended thinking for complex multi-step problems.

Input: $15.0000 / 1M tokens
Output: $75.0000 / 1M tokens

Anthropic Claude Sonnet 4.6 (200k)

2026-02 • 200K context window

Anthropic's balanced mid-tier model offering strong intelligence, speed, and cost-effectiveness. Excellent for everyday coding and enterprise tasks.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

DeepSeek V4 (128k)

2026-02 • 128K context window

DeepSeek's fourth-generation open-weights model with state-of-the-art reasoning at remarkably low cost. Strong performance on coding and math benchmarks.

Input: $0.5000 / 1M tokens
Output: $2.0000 / 1M tokens
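
Headline rates are easier to compare once they are blended for a specific workload. The sketch below ranks a few of the entries above by blended rate for a hypothetical mix of three input tokens per output token. The `blended` helper and the 3:1 ratio are assumptions chosen for illustration, not part of any provider's pricing.

```python
# (input, output) rates copied from entries above.
RATES = {
    "Claude Opus 4.6 (200k)": (15.0, 75.0),
    "Claude Sonnet 4.6 (200k)": (3.0, 15.0),
    "Gemini 3.1 Flash": (0.35, 1.40),
    "DeepSeek V4 (128k)": (0.50, 2.00),
}


def blended(input_usd: float, output_usd: float, input_share: float = 0.75) -> float:
    """Weighted average rate for a workload with the given input share."""
    if not 0.0 <= input_share <= 1.0:
        raise ValueError("input_share must be between 0 and 1")
    return input_share * input_usd + (1.0 - input_share) * output_usd


# Cheapest blended rate first.
ranking = sorted(RATES, key=lambda name: blended(*RATES[name]))
for name in ranking:
    print(f"{name}: {blended(*RATES[name]):.4f}")
```

Changing `input_share` can reorder the list, since models differ in how steeply they price output relative to input.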

OpenAI o3 Deep Research (200k)

2026-01 • 200K context window

An o3-family model optimized for deep research tasks. Autonomously browses the web, synthesizes information, and produces comprehensive research reports.

Input: $20.0000 / 1M tokens
Output: $80.0000 / 1M tokens

Mistral Large 3 (128k)

2026-01 • 128K context window

Mistral's latest flagship model with multilingual excellence, strong coding, and enterprise-grade function calling. Open-weight model.

Input: $2.0000 / 1M tokens
Output: $6.0000 / 1M tokens

Alibaba Qwen 3 (128k)

2026-01 • 128K context window

Alibaba's latest Qwen 3 flagship model with strong multilingual capabilities and improved reasoning. Competitive with frontier models at lower cost.

Input: $0.8000 / 1M tokens
Output: $3.2000 / 1M tokens

Midjourney v7

2026-01 • N/A context window

Professional AI image generation with photorealistic quality and artistic control. Subscription-based: Basic $10/mo, Standard $30/mo, Pro $60/mo.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens
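
A listed rate of $0.0000 here does not mean the service is free: the entry is billed by subscription, so the per-token fields are placeholders. One way to model that, sketched with hypothetical names, is to convert zero rates to `None` so that later arithmetic cannot mistake them for real prices.

```python
from typing import Optional


def per_token_rate(listed_usd: float) -> Optional[float]:
    """Return the rate, or None when the listing uses 0 as a placeholder.

    Assumption: no entry is genuinely billed at exactly $0 per token.
    """
    return listed_usd if listed_usd > 0 else None


midjourney_input = per_token_rate(0.0)  # from the Midjourney v7 entry above
qwen_input = per_token_rate(0.8)        # from the Alibaba Qwen 3 entry above
print(midjourney_input, qwen_input)  # prints None 0.8
```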

OpenAI GPT-5.2 Pro (400k)

2025-12 • 400K context window

Higher-compute variant of GPT-5.2 for harder problems (Responses API only).

Input: $21.0000 / 1M tokens
Output: $168.0000 / 1M tokens

Google Gemini 3 Flash Preview (1M)

2025-12 • 1M context window

Gemini 3 Flash Preview on Vertex AI.

Input: $0.5000 / 1M tokens
Output: $3.0000 / 1M tokens

Anthropic Claude Opus 4.5 (200k)

2025-11 • 200K context window

Claude 4.5 flagship Opus model for long-horizon coding and agentic workflows.

Input: $5.0000 / 1M tokens
Output: $25.0000 / 1M tokens

Google Gemini 3 Pro Preview (1M)

2025-11 • 1M context window

Gemini 3 Pro Preview on Vertex AI.

Input: $2.0000 / 1M tokens
Output: $12.0000 / 1M tokens

Anthropic Claude Haiku 4.5 (200k)

2025-10 • 200K context window

Anthropic's fastest 4.5-series model: cost-effective for high-volume workloads, with extended thinking and strong tool use.

Input: $1.0000 / 1M tokens
Output: $5.0000 / 1M tokens

Cognition SWE-1.5 (Windsurf in-house)

2025-10 • Unknown context window

Windsurf/Cognition in-house frontier model for agentic coding. Consumed via Windsurf prompt credits (not USD per-token).

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Anthropic Claude Haiku 4.5 (200k)

2025-10 • 200K context window

Anthropic's fastest and most cost-effective 4.5-series model, optimized for high-volume workloads with strong tool use capabilities.

Input: $0.8000 / 1M tokens
Output: $4.0000 / 1M tokens
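
Claude Haiku 4.5 is listed twice above with different rates, so anything that consumes this listing needs to notice such conflicts rather than silently keep one row. A minimal sketch of that check follows; `find_conflicts` and the tuple layout are hypothetical.

```python
from collections import defaultdict

# (name, input rate, output rate) rows copied from the entries above.
ROWS = [
    ("Anthropic Claude Haiku 4.5 (200k)", 1.00, 5.00),
    ("Cognition SWE-1.5 (Windsurf in-house)", 0.00, 0.00),
    ("Anthropic Claude Haiku 4.5 (200k)", 0.80, 4.00),
]


def find_conflicts(rows):
    """Map each name listed with more than one distinct rate pair to its pairs."""
    seen = defaultdict(set)
    for name, input_usd, output_usd in rows:
        seen[name].add((input_usd, output_usd))
    return {name: sorted(pairs) for name, pairs in seen.items() if len(pairs) > 1}


conflicts = find_conflicts(ROWS)
print(conflicts)
```

The check only flags the disagreement. Deciding which row is correct still requires the provider's own pricing page.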

Anthropic Claude Sonnet 4.5 (200k)

2025-09-30 • 200K context window

Anthropic's latest and most advanced Sonnet model released September 30, 2025. Features dramatically improved coding capabilities, enhanced reasoning, and better long-context performance. Outperforms Claude 4 Sonnet on all major benchmarks while maintaining cost efficiency.

Input: $3.5000 / 1M tokens
Output: $17.5000 / 1M tokens

OpenAI GPT-5-Codex (400k)

2025-09 • 400K context window

A version of GPT-5 optimized for agentic coding in Codex. Default in Codex cloud & reviews; also usable via API key. Priced the same as GPT-5.

Input: $1.2500 / 1M tokens
Output: $10.0000 / 1M tokens

Anthropic Claude Sonnet 4.5 (200k/1M beta)

2025-09 • 1M context window

Anthropic's most intelligent model for agents and coding, with extended thinking and state-of-the-art performance on SWE-bench Verified.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

Google Gemini 3.0 Ultra (2M)

2025-09 • 2M context window

Google's next-generation flagship model with breakthrough multimodal capabilities and 2M token context window. Features advanced reasoning, native code execution, and real-time multimodal understanding across text, image, audio, and video.

Input: $0.2000 / 1M tokens
Output: $0.8000 / 1M tokens

Google Gemini 3.0 (2M)

2025-09 • 2M context window

Next-generation Gemini with advanced multimodal reasoning and long context. Rates pending official pricing page.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Anthropic Claude Opus 4.1 (200k)

2025-08 • 200K context window

Opus 4.1 (200k): higher-cost flagship Opus tier. The newer Opus 4.5 provides a more accessible price point.

Input: $15.0000 / 1M tokens
Output: $75.0000 / 1M tokens

OpenAI GPT-5 Pro (400k)

2025-08 • 400K context window

Version of GPT-5 that produces smarter and more precise responses. Responses API only; higher max output than standard GPT-5.

Input: $15.0000 / 1M tokens
Output: $120.0000 / 1M tokens

OpenAI GPT-5 (400k)

2025-08 • 400K context window

Previous flagship GPT model for coding, reasoning, and agentic tasks. OpenAI recommends GPT-5.1/5.2 for newest improvements.

Input: $1.2500 / 1M tokens
Output: $10.0000 / 1M tokens

OpenAI GPT-5 Mini (400k)

2025-08 • 400K context window

A faster, lower-cost GPT-5 for well-defined tasks. Text & vision with long context at a fraction of the price.

Input: $0.2500 / 1M tokens
Output: $2.0000 / 1M tokens

OpenAI GPT-5 Low (1M)

2025-08 • 1M context window

Alias tier aligned with GPT-5 Mini pricing; use when your workflow targets the "low" cost tier.

Input: $0.2500 / 1M tokens
Output: $2.0000 / 1M tokens

OpenAI GPT-5 Medium (1M)

2025-08 • 1M context window

Mid-tier GPT-5 variant balancing performance and cost for general-purpose workloads.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI GPT-5 High (1M)

2025-08 • 1M context window

High-tier GPT-5 with enhanced reasoning, reliability and coding for mission-critical workloads.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Moonshot Kimi K2 Base (128k)

2025-07 • 128K context window

A 1T parameter open-weight Mixture-of-Experts (MoE) model with 32B active parameters. This is the unaligned, pre-trained base model, suitable for further fine-tuning.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Moonshot Kimi K2 Instruct (128k)

2025-07 • 128K context window

The instruction-tuned version of Kimi K2, optimized for chat, agentic tasks, and tool use. Aligned with RLHF for helpful and safe responses.

Input: $0.1500 / 1M tokens
Output: $2.5000 / 1M tokens

Alibaba Qwen3 Coder Flash (1M)

2025-07 • 1M context window

A 30B parameter model from the Qwen3 series, excelling in coding and agentic tasks with a 1M token context length.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Zhipu GLM-4.5 (200k)

2025-07 • 200K context window

GLM-4.5 agentic foundation model. Official pricing is published in RMB on BigModel; USD prices vary by provider.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Alibaba Qwen3 Coder (1M)

2025-07 • 1M context window

Qwen3 coding-specialized model with long-context capabilities and strong tool-use.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Anthropic Claude Opus 4 (200k)

2025-05 • 200K context window

Anthropic's most powerful model, excelling in coding, advanced reasoning, and AI agent workflows. Handles complex, long-running tasks.

Input: $15.0000 / 1M tokens
Output: $75.0000 / 1M tokens

Anthropic Claude Sonnet 4 (200k)

2025-05 • 200K context window

Anthropic's highly capable and versatile model, offering a strong balance of intelligence, speed, and cost-effectiveness for enterprise applications.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

Google Gemini 2.5 Pro (1M)

2025-05 • 1M context window

Google's most advanced reasoning Gemini model, capable of solving complex problems. Supports text, code, image, audio, and video inputs. Features a 1M token context window (up to 2M in some versions).

Input: $1.2500 / 1M tokens
Output: $10.0000 / 1M tokens

Mistral Medium 3 (128k)

2025-05 • 128K context window

Mistral AI's frontier-class multimodal model balancing SOTA performance, lower cost, and simpler deployability for enterprise usage. Excels in coding and multimodal understanding.

Input: $0.4000 / 1M tokens
Output: $2.0000 / 1M tokens

Mistral Devstral Small (128k)

2025-05 • 128K context window

A 24B open-source text model from Mistral AI that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. Apache 2.0 license.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Google Gemini 2.5 Flash (1M)

2025-05 • 1M context window

Google's best model for price and performance (as of May 2025), featuring hybrid reasoning capabilities. Supports text, code, image, audio, and video inputs. 1M token context window.

Input: $0.1500 / 1M tokens
Output: $0.6000 / 1M tokens

OpenAI Codex Mini (200k)

2025-05 • 200K context window

OpenAI's fast coding model designed for the Codex coding agent. Optimized for rapid code generation, editing, and review within development workflows.

Input: $1.5000 / 1M tokens
Output: $6.0000 / 1M tokens

Pika 2

2025-05 • N/A context window

Fast and creative video generation with emphasis on artistic styles and special effects. Free tier available with paid options.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI o3 (200k)

2025-04 • 200K context window

OpenAI reasoning model for complex tasks (text+image input, text output). Succeeded by GPT-5.x for many agentic workloads.

Input: $2.0000 / 1M tokens
Output: $8.0000 / 1M tokens

OpenAI o4-mini (200k)

2025-04 • 200K context window

A faster, cost-efficient reasoning model, successor to o3-mini, released in April 2025. Offers strong performance on math, coding, and vision. Can process text and images, and features autonomous tool use.

Input: $1.1000 / 1M tokens
Output: $4.4000 / 1M tokens

Alibaba Qwen3 235B MoE (128k)

2025-04 • 128K context window

Alibaba's flagship Qwen3 Mixture-of-Experts model with 235B total parameters (22B active). Features hybrid reasoning and supports 119 languages. (Note: Not publicly available at release).

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Alibaba Qwen3 30B MoE (128k)

2025-04 • 128K context window

Alibaba's Qwen3 Mixture-of-Experts model with 30B total parameters (3B active). Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Alibaba Qwen3 32B Dense (128k)

2025-04 • 128K context window

Alibaba's largest dense model in the Qwen3 family with 32B parameters. Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI GPT-4.1 (400k)

2025-04 • 400K context window

Smartest non-reasoning GPT model (text+image in, text out).

Input: $2.0000 / 1M tokens
Output: $8.0000 / 1M tokens

OpenAI GPT-4.1 mini (Unknown context)

2025-04 • Unknown context window

Smaller, faster GPT-4.1 tier with low cost.

Input: $0.4000 / 1M tokens
Output: $1.6000 / 1M tokens

OpenAI GPT-4.1 nano (Unknown context)

2025-04 • Unknown context window

Fastest, cheapest GPT-4.1 tier.

Input: $0.1000 / 1M tokens
Output: $0.4000 / 1M tokens

Google Gemini 2.5 Flash Preview (1M)

2025-04 • 1M context window

Preview of Google's Gemini 2.5 Flash with hybrid reasoning capabilities. Ultra-fast and cost-efficient for high-volume applications.

Input: $0.1500 / 1M tokens
Output: $0.6000 / 1M tokens

Meta Llama 4 Maverick (1M)

2025-04 • 1M context window

Meta's large Llama 4 Mixture-of-Experts model with 400B total parameters (17B active, 128 experts). Natively multimodal with a 1M token context window. Open source.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Meta Llama 4 Scout (10M)

2025-04 • 10M context window

Meta's efficient Llama 4 model with an industry-leading 10M token context window. 109B total parameters (17B active, 16 experts). Natively multimodal and open source.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Meta Llama 4 Maverick (1M)

2025-04 • 1M context window

Meta's Llama 4 Maverick with mixture-of-experts architecture, 1M context window, and strong multilingual support. Open-source model.

Input: $0.2000 / 1M tokens
Output: $0.6000 / 1M tokens

Meta Llama 4 Scout (10M)

2025-04 • 10M context window

Meta's Llama 4 Scout with an industry-leading 10M token context window and 16 experts MoE architecture. Optimized for efficiency.

Input: $0.1000 / 1M tokens
Output: $0.3000 / 1M tokens

Mistral Small 3.1 (128k)

2025-03 • 128K context window

A new leader in the small models category by Mistral AI, with image understanding capabilities and an extended 128k context length. Apache 2.0 license.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI GPT-4o with Search (128k)

2025-03 • 128K context window

GPT-4o variant with built-in web search grounding. Provides up-to-date, cited answers by searching the web in real time.

Input: $2.5000 / 1M tokens
Output: $10.0000 / 1M tokens

Google Gemini 2.5 Pro Preview (1M)

2025-03 • 1M context window

Early preview of Google's Gemini 2.5 Pro thinking model. Excels at reasoning, coding, and multimodal tasks with a 1M token context window.

Input: $1.2500 / 1M tokens
Output: $10.0000 / 1M tokens

Cohere Command A (256k)

2025-03 • 256K context window

Cohere's next-generation enterprise model with an expanded 256K context window. Optimized for agentic RAG, tool use, and structured outputs.

Input: $2.5000 / 1M tokens
Output: $10.0000 / 1M tokens

Google Gemma 3 27B (128k)

2025-03 • 128K context window

Google's open-source 27B parameter model from the Gemma 3 family. Natively multimodal with strong performance on text, image, and video tasks. Free to use under open license.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Anthropic Claude 3.7 Sonnet (Deprecated, 200k)

2025-02 • 200K context window

Hybrid reasoning Claude 3.x model (extended thinking). Deprecated; recommended replacement is Claude Sonnet 4.5.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

OpenAI o3-mini (200k)

2025-02 • 200K context window

A faster, more cost-effective reasoning model in the o3 family, released in January 2025. Offers strong reasoning and coding capabilities and is optimized for math and coding tasks. Text-only: it does not accept image input.

Input: $1.1000 / 1M tokens
Output: $4.4000 / 1M tokens

Mistral Saba (32k)

2025-02 • 32K context window

A powerful and efficient model from Mistral AI for languages from the Middle East and South Asia.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

xAI Grok 3 (128k)

2025-02 • 128K context window

xAI's flagship large language model with strong reasoning, coding, and math capabilities. Trained on the Colossus supercluster.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

xAI Grok 3 Mini (128k)

2025-02 • 128K context window

xAI's lightweight reasoning model with think mode. Faster and more cost-efficient than Grok 3 while maintaining strong reasoning capabilities.

Input: $0.3000 / 1M tokens
Output: $0.5000 / 1M tokens

DeepSeek 3.1 (128k)

2025-01 • 128K context window

DeepSeek's enhanced model with improved reasoning capabilities, expanded context window, and even more competitive pricing.

Input: $0.1200 / 1M tokens
Output: $0.2400 / 1M tokens

Mistral Codestral 2 (256k)

2025-01 • 256K context window

Mistral AI's cutting-edge language model for coding (second version). Specializes in low-latency, high-frequency tasks like fill-in-the-middle (FIM), code correction, and test generation.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

DeepSeek R1 (128k)

2025-01 • 128K context window

DeepSeek's reasoning model trained with reinforcement learning. Excels at math, coding, and complex reasoning tasks with transparent chain-of-thought.

Input: $0.5500 / 1M tokens
Output: $2.1900 / 1M tokens

Google Gemini 2.0 Flash (1M)

2024-12 • 1M context window

Google's latest experimental model with breakthrough multimodal capabilities and enhanced reasoning at an extremely competitive price point.

Input: $0.0750 / 1M tokens
Output: $0.3000 / 1M tokens

Meta Llama 3.3 70B Instruct

2024-12 • 128K context window

Meta's latest 70B parameter model with improved performance and capabilities, offering state-of-the-art results for its size.

Input: $0.6000 / 1M tokens
Output: $0.6000 / 1M tokens

DeepSeek V3 (64k)

2024-12 • 64K context window

DeepSeek's latest model with strong performance across reasoning, coding, and general tasks at competitive pricing.

Input: $0.1400 / 1M tokens
Output: $0.2800 / 1M tokens

OpenAI o1 (200k)

2024-12 • 200K context window

Previous full o-series reasoning model (text+image in, text out).

Input: $15.0000 / 1M tokens
Output: $60.0000 / 1M tokens

OpenAI o1-pro (200k)

2024-12 • 200K context window

Higher-compute variant of o1 for better responses.

Input: $150.0000 / 1M tokens
Output: $600.0000 / 1M tokens

Microsoft Phi-4 (16k)

2024-12 • 16K context window

Microsoft's small language model with 14B parameters that punches well above its weight. Excels at STEM reasoning and coding despite its compact size. Open source under MIT license.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Amazon Nova Pro (300k)

2024-12 • 300K context window

Amazon's highly capable multimodal model balancing accuracy, speed, and cost. Processes text, images, and video inputs for a wide range of enterprise tasks.

Input: $0.8000 / 1M tokens
Output: $3.2000 / 1M tokens

Amazon Nova Lite (300k)

2024-12 • 300K context window

Amazon's very low-cost multimodal model for high-volume tasks. Processes text, images, and video at extremely competitive pricing via Amazon Bedrock.

Input: $0.0600 / 1M tokens
Output: $0.2400 / 1M tokens

OpenAI Sora Turbo

2024-12 • N/A context window

Advanced text-to-video generation with realistic motion and scene understanding. Pricing per video generation varies by length and resolution.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Anthropic Claude 3.5 Sonnet (200k)

2024-10 • 200K context window

Anthropic's most advanced model, significantly improved over Claude 3 Sonnet with enhanced reasoning, coding, and vision capabilities.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

Anthropic Claude 3.5 Haiku (Deprecated, 200k)

2024-10 • 200K context window

Claude 3.5 Haiku snapshot. Deprecated; recommended replacement is Haiku 4.5.

Input: $0.8000 / 1M tokens
Output: $4.0000 / 1M tokens

OpenAI o1-preview (Deprecated, 128k)

2024-09 • 128K context window

Deprecated preview snapshot of OpenAI's first o-series reasoning model. Kept for backwards compatibility; prefer o1 for production.

Input: $15.0000 / 1M tokens
Output: $60.0000 / 1M tokens

OpenAI o1-mini (128k)

2024-09 • 128K context window

Smaller, faster o-series reasoning model. Deprecated in favor of newer reasoning models, but still supported in legacy workflows.

Input: $1.1000 / 1M tokens
Output: $4.4000 / 1M tokens

Meta Llama 3.2 90B Vision Instruct

2024-09 • 128K context window

Meta's multimodal model combining text and vision capabilities with strong performance across various tasks.

Input: $1.2000 / 1M tokens
Output: $1.2000 / 1M tokens

Meta Llama 3.2 11B Vision Instruct

2024-09 • 128K context window

A smaller, efficient multimodal model from Meta with vision capabilities, suitable for edge deployment and cost-sensitive applications.

Input: $0.1800 / 1M tokens
Output: $0.1800 / 1M tokens

Mistral Small (32k)

2024-09 • 32K context window

Mistral AI's cost-effective model for straightforward tasks, offering good performance and efficiency.

Input: $0.2000 / 1M tokens
Output: $0.6000 / 1M tokens

Qwen 2.5 72B Instruct

2024-09 • 32K context window

Alibaba's large language model with strong performance in reasoning, coding, and multilingual tasks.

Input: $0.5600 / 1M tokens
Output: $0.5600 / 1M tokens

AI21 Jamba 1.5 Large (256k)

2024-08 • 256K context window

AI21's large hybrid SSM-Transformer model with a 256K context window. Uses a novel Jamba architecture combining Mamba SSM layers with Transformer attention for efficient long-context processing.

Input: $2.0000 / 1M tokens
Output: $8.0000 / 1M tokens

OpenAI GPT-4o mini (128k)

2024-07 • 128K context window

OpenAI's most affordable and fastest model in the GPT-4o family, designed for high-volume, low-latency tasks.

Input: $0.1500 / 1M tokens
Output: $0.6000 / 1M tokens

Meta Llama 3.1 405B Instruct

2024-07 • 128K context window

Meta's largest and most capable Llama 3.1 model, designed for complex reasoning, coding, and nuanced instruction following.

Input: $2.7000 / 1M tokens
Output: $2.7000 / 1M tokens

Meta Llama 3.1 70B Instruct

2024-07 • 128K context window

A large instruction-tuned model from Meta's Llama 3.1 series, offering a strong balance of performance and efficiency for a wide range of tasks.

Input: $0.6000 / 1M tokens
Output: $0.6000 / 1M tokens

Meta Llama 3.1 8B Instruct

2024-07 • 128K context window

A highly efficient instruction-tuned model from Meta's Llama 3.1 series, suitable for fast, on-device, or edge applications.

Input: $0.0600 / 1M tokens
Output: $0.0600 / 1M tokens

Mistral Large 2 (128k)

2024-07 • 128K context window

Mistral AI's flagship model with enhanced reasoning, coding, and multilingual capabilities.

Input: $2.0000 / 1M tokens
Output: $6.0000 / 1M tokens

Runway Gen-3 Alpha

2024-06 • N/A context window

Professional video generation and editing AI with motion control and style consistency. Subscription-based pricing.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI GPT-4o (128k)

2024-05 • 128K context window

OpenAI's flagship multimodal model, natively processing text, audio, and images for faster, more capable interactions.

Input: $2.5000 / 1M tokens
Output: $10.0000 / 1M tokens

Google Gemini 1.5 Flash (1M)

2024-05 • 1M context window

Google's faster and lower-cost version of Gemini 1.5 Pro, optimized for high-volume, high-frequency tasks while retaining a large context window and multimodal capabilities.

Input: $0.0750 / 1M tokens
Output: $0.3000 / 1M tokens

Mistral Codestral (22B)

2024-05 • 32K context window

Mistral AI's open-weight generative model specialized for code generation, supporting 80+ languages.

Input: $0.2000 / 1M tokens
Output: $0.6000 / 1M tokens

OpenAI GPT-4 Turbo (128k)

2024-04 • 128K context window

OpenAI's powerful model prior to GPT-4o, with a large context window and strong performance on complex tasks. Supports vision.

Input: $10.0000 / 1M tokens
Output: $30.0000 / 1M tokens

Cohere Command R+ (128k)

2024-04 • 128K context window

Cohere's most powerful model optimized for enterprise RAG and tool use. Excels at grounded generation with citations and multi-step tool workflows.

Input: $2.5000 / 1M tokens
Output: $10.0000 / 1M tokens

Anthropic Claude 3 Opus (200k)

2024-03 • 200K context window

Claude 3 Opus (deprecated): previous highest-intelligence Claude 3 model.

Input: $15.0000 / 1M tokens
Output: $75.0000 / 1M tokens

Anthropic Claude 3 Sonnet (200k)

2024-03 • 200K context window

A balanced model from Anthropic, offering a blend of intelligence and speed, ideal for enterprise workloads and scaled AI deployments.

Input: $3.0000 / 1M tokens
Output: $15.0000 / 1M tokens

Anthropic Claude 3 Haiku (200k)

2024-03 • 200K context window

Anthropic's fastest and most compact model, designed for near-instant responsiveness and high throughput tasks.

Input: $0.2500 / 1M tokens
Output: $1.2500 / 1M tokens

Google Gemini 1.5 Pro (2M)

2024-02 • 2M context window

Google's highly capable multimodal model with a breakthrough long context window of up to 2 million tokens. Excels at complex reasoning, problem-solving, and understanding long-form content.

Input: $1.2500 / 1M tokens
Output: $5.0000 / 1M tokens

Leonardo AI

2023-11 • N/A context window

Game asset and creative image generation with consistent character and style creation. Credit-based pricing system.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI DALL-E 3

2023-10 • N/A context window

State-of-the-art text-to-image generation with improved prompt following and image quality. Pricing per 1024×1024 image: Standard $0.040, HD $0.080.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

Stable Diffusion XL

2023-07 • N/A context window

Open-source image generation model with fine-tuning capabilities. API pricing varies by provider, self-hosting available.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI GPT-3.5 Turbo (16k)

2023-03 • 16K context window

OpenAI's fast, cost-effective model optimized for chat and simple tasks.

Input: $0.5000 / 1M tokens
Output: $1.5000 / 1M tokens

OpenAI GPT-5.2 (400k)

Unknown • 400K context window

OpenAI flagship model for coding and agentic tasks across industries.

Input: $1.7500 / 1M tokens
Output: $14.0000 / 1M tokens

OpenAI GPT-5.1 (400k)

Unknown • 400K context window

Flagship GPT model with configurable reasoning effort; predecessor to GPT-5.2.

Input: $1.7500 / 1M tokens
Output: $14.0000 / 1M tokens

OpenAI GPT-5 Nano (400k)

Unknown • 400K context window

Fastest, most cost-efficient GPT-5 variant.

Input: $0.0500 / 1M tokens
Output: $0.4000 / 1M tokens

OpenAI GPT-5.1-Codex (400k)

Unknown • 400K context window

GPT-5.1 variant optimized for agentic coding in Codex (Responses API only).

Input: $1.2500 / 1M tokens
Output: $10.0000 / 1M tokens

OpenAI GPT-5.1-Codex-Max (400k)

Unknown • 400K context window

Most intelligent Codex model optimized for long-horizon agentic coding (Responses API only).

Input: $1.2500 / 1M tokens
Output: $10.0000 / 1M tokens

OpenAI o3-pro (200k)

Unknown • 200K context window

Version of o3 with more compute for better responses (Responses API only).

Input: $20.0000 / 1M tokens
Output: $80.0000 / 1M tokens

OpenAI GPT-OSS 120B (Open-Weight)

Unknown • Unknown context window

Open-weight model entry as listed by OpenAI (see models page). Token costs depend on where you run it.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens

OpenAI GPT-OSS 20B (Open-Weight)

Unknown • Unknown context window

Open-weight model entry as listed by OpenAI (see models page). Token costs depend on where you run it.

Input: $0.0000 / 1M tokens
Output: $0.0000 / 1M tokens
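
Every entry in this listing follows the same layout (name, release date and context window, description, two rate lines), so it can be loaded into records mechanically. The parser below is a sketch under that assumption. `Entry` and `parse_entries` are hypothetical names, the regular expressions accept either a 1K or a 1M unit label, and the code needs Python 3.9 or later.

```python
import re
from dataclasses import dataclass

INPUT_RE = re.compile(r"^Input: \$(\d+(?:\.\d+)?) / 1[KM] tokens$")
OUTPUT_RE = re.compile(r"^Output: \$(\d+(?:\.\d+)?) / 1[KM] tokens$")
SEPARATOR = " \u2022 "  # the bullet between release date and context window


@dataclass
class Entry:
    name: str
    released: str
    context: str
    description: str
    input_usd: float
    output_usd: float


def parse_entries(text: str) -> list[Entry]:
    """Parse blocks of: name, 'date <bullet> context', description, Input, Output."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    entries = []
    i = 0
    while i + 4 < len(lines):
        name, meta, description, in_line, out_line = lines[i:i + 5]
        m_in, m_out = INPUT_RE.match(in_line), OUTPUT_RE.match(out_line)
        if SEPARATOR in meta and m_in and m_out:
            released, context = meta.split(SEPARATOR, 1)
            entries.append(Entry(name, released,
                                 context.removesuffix(" context window"),
                                 description,
                                 float(m_in.group(1)), float(m_out.group(1))))
            i += 5
        else:
            i += 1  # skip page headings or malformed blocks
    return entries


sample = """
LLM Models

OpenAI GPT-3.5 Turbo (16k)

2023-03 \u2022 16K context window

OpenAI's fast, cost-effective model optimized for chat and simple tasks.

Input: $0.5000 / 1M tokens
Output: $1.5000 / 1M tokens
"""
entries = parse_entries(sample)
print(entries[0].name, entries[0].context, entries[0].output_usd)
```

The sliding five-line window skips the page title and intro text without special-casing them, at the cost of assuming each description occupies exactly one line.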