LLM Models

Compare pricing and specifications for large language models from all major providers.
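
As a rough illustration of how the listed rates can be compared, the sketch below estimates the cost of a single request and ranks a few models by it. It takes the per-1K-token rates at face value, exactly as printed in the listing below. ModelPrice, request_cost, and rank_by_cost are hypothetical names introduced for this example, and the workload of 2,000 input tokens and 500 output tokens is an arbitrary assumption.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelPrice:
    """Hypothetical container for one card's pricing lines."""
    name: str
    input_per_1k: float   # USD per 1K input tokens, as printed in the listing
    output_per_1k: float  # USD per 1K output tokens, as printed in the listing


def request_cost(price: ModelPrice, input_tokens: int, output_tokens: int) -> float:
    """Estimated USD cost of one request at the listed rates."""
    return (input_tokens / 1000) * price.input_per_1k \
        + (output_tokens / 1000) * price.output_per_1k


# Rates copied from three cards in the listing.
PRICES = [
    ModelPrice("Anthropic Claude 3 Haiku (200k)", 0.25, 1.25),
    ModelPrice("DeepSeek V3 (64k)", 0.14, 0.28),
    ModelPrice("OpenAI GPT-4o mini (128k)", 0.15, 0.60),
]


def rank_by_cost(prices, input_tokens, output_tokens):
    """Return (name, cost) pairs sorted from cheapest to most expensive."""
    costs = [(p.name, request_cost(p, input_tokens, output_tokens)) for p in prices]
    return sorted(costs, key=lambda pair: pair[1])


if __name__ == "__main__":
    # Assumed workload: 2,000 input tokens and 500 output tokens per request.
    for name, cost in rank_by_cost(PRICES, input_tokens=2000, output_tokens=500):
        print(f"{name}: ${cost:.4f}")
```

Input and output tokens are priced separately, so the ranking can shift with the ratio of output to input tokens in a given workload.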

Alibaba Models

Qwen 2.5 72B Instruct

2024-09 • 32K context window

Alibaba's large language model with strong performance in reasoning, coding, and multilingual tasks.

Input: $0.5600 / 1K tokens

Output: $0.5600 / 1K tokens

Alibaba Qwen3 235B MoE (128k)

2025-04 • 128K context window

Alibaba's flagship Qwen3 Mixture-of-Experts model with 235B total parameters (22B active). Features hybrid reasoning and supports 119 languages. (Note: Not publicly available at release).

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Alibaba Qwen3 30B MoE (128k)

2025-04 • 128K context window

Alibaba's Qwen3 Mixture-of-Experts model with 30B total parameters (3B active). Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Alibaba Qwen3 32B Dense (128k)

2025-04 • 128K context window

Alibaba's largest dense model in the Qwen3 family with 32B parameters. Features hybrid reasoning and supports 119 languages. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Alibaba Qwen3 Coder Flash (1M)

2025-07 • 1M context window

A 30B-parameter model from the Qwen3 series that excels at coding and agentic tasks, with a 1M-token context length.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Anthropic Models

Anthropic Claude 3.5 Sonnet (200k)

2024-10 • 200K context window

Anthropic's most advanced model as of October 2024, significantly improved over Claude 3 Sonnet with enhanced reasoning, coding, and vision capabilities.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

Anthropic Claude 3.7 Sonnet (200k)

2025-02 • 200K context window

Anthropic's enhanced Claude model with improved reasoning capabilities and better performance on complex tasks, positioned between Claude 3.5 Sonnet and Claude 4.

Input: $4.0000 / 1K tokens

Output: $20.0000 / 1K tokens

Anthropic Claude 3 Opus (200k)

2024-03 • 200K context window

Anthropic's most powerful model prior to 3.5 Sonnet, excelling at highly complex tasks and demonstrating near-human levels of comprehension and fluency.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

Anthropic Claude 3 Sonnet (200k)

2024-03 • 200K context window

A balanced model from Anthropic, offering a blend of intelligence and speed, ideal for enterprise workloads and scaled AI deployments.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

Anthropic Claude 3 Haiku (200k)

2024-03 • 200K context window

Anthropic's fastest and most compact model, designed for near-instant responsiveness and high-throughput tasks.

Input: $0.2500 / 1K tokens

Output: $1.2500 / 1K tokens

Anthropic Claude Opus 4 (200k)

2025-05 • 200K context window

Anthropic's most powerful model as of May 2025, excelling in coding, advanced reasoning, and AI agent workflows. Handles complex, long-running tasks.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

Anthropic Claude Sonnet 4 (200k)

2025-05 • 200K context window

Anthropic's highly capable and versatile model, offering a strong balance of intelligence, speed, and cost-effectiveness for enterprise applications.

Input: $3.0000 / 1K tokens

Output: $15.0000 / 1K tokens

Anthropic Claude Opus 4.1 (200k)

2025-08 • 200K context window

An upgraded version of Claude Opus 4, with focused improvements on reliability, autonomy, and contextual reasoning for real-world coding and agentic tasks.

Input: $15.0000 / 1K tokens

Output: $75.0000 / 1K tokens

Anthropic Claude Opus 4 (200k)

2025-05 • 200K context window

Claude Opus 4, with stronger reasoning and coding and improved latency over the prior Opus line.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

DeepSeek Models

DeepSeek V3 (64k)

2024-12 • 64K context window

DeepSeek's latest model as of December 2024, with strong performance across reasoning, coding, and general tasks at competitive pricing.

Input: $0.1400 / 1K tokens

Output: $0.2800 / 1K tokens

DeepSeek 3.1 (128k)

2025-01 • 128K context window

DeepSeek's enhanced model with improved reasoning capabilities, expanded context window, and even more competitive pricing.

Input: $0.1200 / 1K tokens

Output: $0.2400 / 1K tokens

Google Models

Google Gemini 2.0 Flash (1M)

2024-12 • 1M context window

Google's latest experimental model as of December 2024, with breakthrough multimodal capabilities and enhanced reasoning at an extremely competitive price point.

Input: $0.0750 / 1K tokens

Output: $0.3000 / 1K tokens

Google Gemini 1.5 Pro (2M)

2024-02 • 2M context window

Google's highly capable multimodal model with a breakthrough long context window of up to 2 million tokens. Excels at complex reasoning, problem-solving, and understanding long-form content.

Input: $1.2500 / 1K tokens

Output: $5.0000 / 1K tokens

Google Gemini 1.5 Flash (1M)

2024-05 • 1M context window

Google's faster and lower-cost version of Gemini 1.5 Pro, optimized for high-volume, high-frequency tasks while retaining a large context window and multimodal capabilities.

Input: $0.0750 / 1K tokens

Output: $0.3000 / 1K tokens

Google Gemini 2.5 Pro (1M)

Google's most advanced reasoning Gemini model, capable of solving complex problems. Supports text, code, image, audio, and video inputs. Features a 1M token context window (up to 2M in some versions).

Input: $1.2500 / 1K tokens

Output: $10.0000 / 1K tokens

Google Gemini 2.5 Flash (1M)

2025-05 • 1M context window

Google's best model for price and performance (as of May 2025), featuring hybrid reasoning capabilities. Supports text, code, image, audio, and video inputs. 1M token context window.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

Meta Models

Meta Llama 3.3 70B Instruct

2024-12 • 128K context window

Meta's latest 70B parameter model with improved performance and capabilities, offering state-of-the-art results for its size.

Input: $0.6000 / 1K tokens

Output: $0.6000 / 1K tokens

Meta Llama 3.2 90B Vision Instruct

2024-09 • 128K context window

Meta's multimodal model combining text and vision capabilities with strong performance across various tasks.

Input: $1.2000 / 1K tokens

Output: $1.2000 / 1K tokens

Meta Llama 3.2 11B Vision Instruct

2024-09 • 128K context window

A smaller, efficient multimodal model from Meta with vision capabilities, suitable for edge deployment and cost-sensitive applications.

Input: $0.1800 / 1K tokens

Output: $0.1800 / 1K tokens

Meta Llama 3.1 405B Instruct

2024-07 • 128K context window

Meta's largest and most capable Llama 3.1 model, designed for complex reasoning, coding, and nuanced instruction following.

Input: $2.7000 / 1K tokens

Output: $2.7000 / 1K tokens

Meta Llama 3.1 70B Instruct

2024-07 • 128K context window

A large instruction-tuned model from Meta's Llama 3.1 series, offering a strong balance of performance and efficiency for a wide range of tasks.

Input: $0.6000 / 1K tokens

Output: $0.6000 / 1K tokens

Meta Llama 3.1 8B Instruct

2024-07 • 128K context window

A highly efficient instruction-tuned model from Meta's Llama 3.1 series, suitable for fast, on-device, or edge applications.

Input: $0.0600 / 1K tokens

Output: $0.0600 / 1K tokens

Mistral AI Models

Mistral Large 2 (128k)

2024-07 • 128K context window

Mistral AI's flagship model with enhanced reasoning, coding, and multilingual capabilities.

Input: $2.0000 / 1K tokens

Output: $6.0000 / 1K tokens

Mistral Small (32k)

2024-09 • 32K context window

Mistral AI's cost-effective model for straightforward tasks, offering good performance and efficiency.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

Mistral Codestral (22B)

2024-05 • 32K context window

Mistral AI's open-weight generative model specialized for code generation, supporting 80+ languages.

Input: $0.2000 / 1K tokens

Output: $0.6000 / 1K tokens

Mistral Medium 3 (128k)

2025-05 • 128K context window

Mistral AI's frontier-class multimodal model balancing SOTA performance, lower cost, and simpler deployability for enterprise usage. Excels in coding and multimodal understanding.

Input: $0.4000 / 1K tokens

Output: $2.0000 / 1K tokens

Mistral Codestral 2 (256k)

2025-01 • 256K context window

Mistral AI's cutting-edge language model for coding (second version). Specializes in low-latency, high-frequency tasks like fill-in-the-middle (FIM), code correction, and test generation.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Mistral Small 3.1 (128k)

2025-03 • 128K context window

A new leader in the small models category by Mistral AI, with image understanding capabilities and an extended 128k context length. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Mistral Devstral Small (128k)

2025-05 • 128K context window

A 24B open-source text model from Mistral AI that excels at using tools to explore codebases, editing multiple files, and powering software engineering agents. Apache 2.0 license.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Mistral Saba (32k)

2025-02 • 32K context window

A powerful and efficient model from Mistral AI for languages from the Middle East and South Asia.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Moonshot AI Models

Moonshot Kimi K2 Base (128k)

2025-07 • 128K context window

A 1T parameter open-weight Mixture-of-Experts (MoE) model with 32B active parameters. This is the unaligned, pre-trained base model, suitable for further fine-tuning.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

Moonshot Kimi K2 Instruct (128k)

2025-07 • 128K context window

The instruction-tuned version of Kimi K2, optimized for chat, agentic tasks, and tool use. Aligned with RLHF for helpful and safe responses.

Input: $0.1500 / 1K tokens

Output: $2.5000 / 1K tokens

OpenAI Models

OpenAI o1-Preview (128k)

2024-09 • 128K context window

OpenAI's most advanced reasoning model as of September 2024, designed for complex problem-solving with enhanced chain-of-thought capabilities.

Input: $15.0000 / 1K tokens

Output: $60.0000 / 1K tokens

OpenAI o1-Mini (128k)

2024-09 • 128K context window

A faster, more cost-effective version of o1 with strong reasoning capabilities, optimized for coding and STEM tasks.

Input: $3.0000 / 1K tokens

Output: $12.0000 / 1K tokens

OpenAI GPT-4o (128k)

2024-05 • 128K context window

OpenAI's flagship multimodal model, natively processing text, audio, and images for faster, more capable interactions.

Input: $2.5000 / 1K tokens

Output: $10.0000 / 1K tokens

OpenAI GPT-4o mini (128k)

2024-07 • 128K context window

OpenAI's most affordable and fastest model in the GPT-4o family, designed for high-volume, low-latency tasks.

Input: $0.1500 / 1K tokens

Output: $0.6000 / 1K tokens

OpenAI GPT-4 Turbo (128k)

2024-04 • 128K context window

OpenAI's powerful model prior to GPT-4o, with a large context window and strong performance on complex tasks. Supports vision.

Input: $10.0000 / 1K tokens

Output: $30.0000 / 1K tokens

OpenAI GPT-3.5 Turbo (16k)

2023-03 • 16K context window

OpenAI's fast, cost-effective model optimized for chat and simple tasks.

Input: $0.5000 / 1K tokens

Output: $1.5000 / 1K tokens

OpenAI o3 (200k)

2025-04 • 200K context window

OpenAI's advanced reasoning model released in April 2025, successor to o1-preview. Features strong performance in complex problem-solving, coding, and handles both text and image inputs. Includes autonomous tool use.

Input: $1.6500 / 1K tokens

Output: $6.6000 / 1K tokens

OpenAI o3-mini (200k)

2025-01 • 200K context window

A faster, more cost-effective version of o3, released in January 2025. Offers strong reasoning capabilities and is optimized for math and coding tasks.

Input: $0.2750 / 1K tokens

Output: $1.1000 / 1K tokens

OpenAI o4-mini (200k)

2025-04 • 200K context window

A faster, cost-efficient reasoning model, successor to o3-mini, released in April 2025. Offers strong performance on math, coding, and vision. Can process text and images, and features autonomous tool use.

Input: $1.1000 / 1K tokens

Output: $4.4000 / 1K tokens

OpenAI GPT-5 (1M)

2025-08 • 1M context window

OpenAI's unified flagship model integrating advanced reasoning and multimodal understanding across text, code, image, audio, and video. Features a unified architecture that intelligently routes requests for optimal performance.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens

OpenAI GPT-5 'Lobster'

2025-08 • 1M context window

A specialized, programming-focused variant of GPT-5, designed for advanced debugging, code generation, and application development.

Input: $0.0000 / 1K tokens

Output: $0.0000 / 1K tokens
