AI Models

235 models, free and paid. Updated: 2 hours ago

The o-series of models are trained with reinforcement learning to think before they answer and perform complex reasoning. The o3-pro model uses more compute to think harder and provide consistently...

Jun 2025 | 200K context | $20.00/M input | $80.00/M output
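
Per-token pricing like the above makes request cost straightforward to estimate: multiply each token count by its per-million rate. A minimal sketch in Python, using the o3-pro rates from this listing; the token counts are made-up example values:

```python
# Estimate the cost of a single request from per-million-token rates.
INPUT_RATE = 20.00   # USD per 1M input tokens (o3-pro, from the listing)
OUTPUT_RATE = 80.00  # USD per 1M output tokens (o3-pro, from the listing)

def request_cost(input_tokens: int, output_tokens: int) -> float:
    """Cost in USD for one request at the rates above."""
    return (input_tokens * INPUT_RATE + output_tokens * OUTPUT_RATE) / 1_000_000

# Hypothetical example: a 5,000-token prompt producing a 2,000-token answer.
print(f"${request_cost(5_000, 2_000):.4f}")  # $0.2600
```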

A lightweight model that thinks before responding. Fast, smart, and great for logic-based tasks that do not require deep domain knowledge. The raw thinking traces are accessible.

Jun 2025 | 131K context | $0.30/M input | $0.50/M output

Grok 3 is xAI's latest flagship model, excelling at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in...

Jun 2025 | 131K context | $3.00/M input | $15.00/M output

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

Jun 2025 | 1M context | $1.25/M input | $10.00/M output

May 28th update to the [original DeepSeek R1](/deepseek/deepseek-r1). Performance on par with [OpenAI o1](/openai/o1), but open-sourced and with fully open reasoning tokens. It's 671B parameters in size, with 37B active...

May 2025 | 164K context | $0.50/M input | $2.15/M output
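
Because the reasoning tokens are fully open, the chain of thought can be separated from the final answer. A minimal sketch, assuming the completion wraps its trace in `<think>` tags, as R1-family models typically do:

```python
import re

def split_reasoning(completion: str) -> tuple[str, str]:
    """Split an R1-style completion into (reasoning, answer).

    Assumes the trace is wrapped in <think>...</think>; if no trace
    is present, the whole text is treated as the answer.
    """
    match = re.search(r"<think>(.*?)</think>", completion, re.DOTALL)
    if not match:
        return "", completion.strip()
    return match.group(1).strip(), completion[match.end():].strip()

reasoning, answer = split_reasoning(
    "<think>2 + 2 is basic arithmetic.</think>The answer is 4."
)
print(reasoning)  # 2 + 2 is basic arithmetic.
print(answer)     # The answer is 4.
```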

Claude Opus 4 was benchmarked as the world's best coding model at the time of release, bringing sustained performance on complex, long-running tasks and agent workflows. It sets new benchmarks in...

May 2025 | 200K context | $15.00/M input | $75.00/M output

Claude Sonnet 4 significantly enhances the capabilities of its predecessor, Sonnet 3.7, excelling in both coding and reasoning tasks with improved precision and controllability. Achieving state-of-the-art performance on SWE-bench (72.7%),...

May 2025 | 1M context | $3.00/M input | $15.00/M output

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...

May 2025 | 8K context | Free input | Free output

Gemma 3n E4B-it is optimized for efficient execution on mobile and low-resource devices, such as phones, laptops, and tablets. It supports multimodal inputs—including text, visual data, and audio—enabling diverse tasks...

May 2025 | 33K context | $0.02/M input | $0.04/M output

Mistral Medium 3 is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances state-of-the-art reasoning and multimodal performance with 8× lower cost...

May 2025 | 131K context | $0.40/M input | $2.00/M output

Gemini 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, mathematics, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

May 2025 | 1M context | $1.25/M input | $10.00/M output

Spotlight is a 7‑billion‑parameter vision‑language model derived from Qwen 2.5‑VL and fine‑tuned by Arcee AI for tight image‑text grounding tasks. It offers a 32K‑token context window, enabling rich multimodal...

May 2025 | 131K context | $0.18/M input | $0.18/M output

Maestro Reasoning is Arcee's flagship analysis model: a 32B‑parameter derivative of Qwen 2.5‑32B tuned with DPO and chain‑of‑thought RL for step‑by‑step logic. Compared to the earlier 7B...

May 2025 | 131K context | $0.90/M input | $3.30/M output

Virtuoso‑Large is Arcee's top‑tier general‑purpose LLM at 72B parameters, tuned to tackle cross‑domain reasoning, creative writing and enterprise QA. Unlike many 70B peers, it retains the 128K...

May 2025 | 131K context | $0.75/M input | $1.20/M output

Coder‑Large is a 32B‑parameter offspring of Qwen 2.5‑Instruct that has been further trained on permissively‑licensed GitHub, CodeSearchNet and synthetic bug‑fix corpora. It supports a 32K context window, enabling multi‑file...

May 2025 | 33K context | $0.50/M input | $0.80/M output

Mercury Coder is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed-optimized models like Claude 3.5 Haiku...

Apr 2025 | 128K context | $0.25/M input | $0.75/M output

Llama Guard 4 is a Llama 4 Scout-derived multimodal pretrained model, fine-tuned for content safety classification. Similar to previous versions, it can be used to classify content in both LLM...

Apr 2025 | 164K context | $0.18/M input | $0.18/M output
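
Classification works by formatting the conversation with the model's chat template and generating a short verdict; Llama Guard models emit `safe`, or `unsafe` followed by the violated category codes. A hedged sketch with Hugging Face transformers; the checkpoint name and text-only usage are assumptions, so verify them against the official model card:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

MODEL_ID = "meta-llama/Llama-Guard-4-12B"  # assumed checkpoint name

tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
model = AutoModelForCausalLM.from_pretrained(MODEL_ID, device_map="auto")

# The chat template frames the conversation as a moderation task.
chat = [{"role": "user", "content": "How do I pick a lock?"}]
input_ids = tokenizer.apply_chat_template(chat, return_tensors="pt").to(model.device)

# The verdict is short: "safe", or "unsafe" plus category codes (e.g. S2).
output = model.generate(input_ids, max_new_tokens=20)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```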

Qwen3, the latest generation in the Qwen large language model series, features both dense and mixture-of-experts (MoE) architectures to excel in reasoning, multilingual support, and advanced agent tasks. Its unique...

Apr 2025 | 41K context | $0.08/M input | $0.28/M output
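
The thinking/non-thinking switch is exposed at the chat-template level. A minimal sketch of the pattern Qwen documents, assuming the Hugging Face checkpoint `Qwen/Qwen3-8B`:

```python
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen3-8B")
messages = [{"role": "user", "content": "What is 17 * 23?"}]

# Thinking mode: the model emits a <think>...</think> trace before answering.
prompt_thinking = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=True
)

# Non-thinking mode: faster, direct answers for general-purpose dialogue.
prompt_direct = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True, enable_thinking=False
)
```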

Qwen3-8B is a dense 8.2B parameter causal language model from the Qwen3 series, designed for both reasoning-heavy tasks and efficient dialogue. It supports seamless switching between "thinking" mode for math,...

Apr 2025 | 41K context | $0.05/M input | $0.40/M output

Qwen3-14B is a dense 14.8B parameter causal language model from the Qwen3 series, designed for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

Apr 2025 | 41K context | $0.06/M input | $0.24/M output

Qwen3-32B is a dense 32.8B parameter causal language model from the Qwen3 series, optimized for both complex reasoning and efficient dialogue. It supports seamless switching between a "thinking" mode for...

Apr 2025 | 41K context | $0.08/M input | $0.24/M output

Qwen3-235B-A22B is a 235B parameter mixture-of-experts (MoE) model developed by Qwen, activating 22B parameters per forward pass. It supports seamless switching between a "thinking" mode for complex reasoning, math, and...

Apr 2025 | 131K context | $0.455/M input | $1.82/M output

OpenAI o4-mini-high is the same model as [o4-mini](/openai/o4-mini) with reasoning_effort set to high. OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining...

Apr 2025 | 200K context | $1.10/M input | $4.40/M output
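
The "high" variant, in other words, is the base model invoked with a different `reasoning_effort` value. A minimal sketch against an OpenAI-compatible Chat Completions endpoint; the prompt is a made-up example:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="o4-mini",
    reasoning_effort="high",  # "low" | "medium" | "high"
    messages=[{"role": "user", "content": "Prove that sqrt(2) is irrational."}],
)
print(response.choices[0].message.content)
```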

o3 is a well-rounded and powerful model across domains. It sets a new standard for math, science, coding, and visual reasoning tasks. It also excels at technical writing and instruction-following....

Apr 2025 | 200K context | $2.00/M input | $8.00/M output

OpenAI o4-mini is a compact reasoning model in the o-series, optimized for fast, cost-efficient performance while retaining strong multimodal and agentic capabilities. It supports tool use and demonstrates competitive reasoning...

Apr 2025 | 200K context | $1.10/M input | $4.40/M output

GPT-4.1 is a flagship large language model optimized for advanced instruction following, real-world software engineering, and long-context reasoning. It supports a 1 million token context window and outperforms GPT-4o and...

Apr 2025 | 1M context | $2.00/M input | $8.00/M output
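
Even a 1 million token window has a hard edge, so it is worth counting tokens before sending a large document. A minimal sketch using tiktoken; the `o200k_base` encoding is an assumption for GPT-4.1, so verify it for your model:

```python
import tiktoken

CONTEXT_LIMIT = 1_000_000  # GPT-4.1 context window, from the listing

# Encoding name is an assumption for GPT-4.1; older GPT-4 models use cl100k_base.
encoding = tiktoken.get_encoding("o200k_base")

def fits_in_context(document: str, reserved_for_output: int = 8_000) -> bool:
    """Check whether a document leaves room for the completion."""
    return len(encoding.encode(document)) + reserved_for_output <= CONTEXT_LIMIT

print(fits_in_context("A short document."))  # True
```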

GPT-4.1 Mini is a mid-sized model delivering performance competitive with GPT-4o at substantially lower latency and cost. It retains a 1 million token context window and scores 45.1% on hard...

Apr 2025 | 1M context | $0.40/M input | $1.60/M output

For tasks that demand low latency, GPT‑4.1 nano is the fastest and cheapest model in the GPT-4.1 series. It delivers exceptional performance at a small size with its 1 million...

Apr 2025 | 1M context | $0.10/M input | $0.40/M output

A 7-billion-parameter Code LLaMA - Instruct model fine-tuned to generate Solidity smart contracts, using 4-bit QLoRA fine-tuning provided by the PEFT library.

Apr 2025 | 4K context | $0.80/M input | $1.20/M output
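
Loading a model fine-tuned this way follows the standard QLoRA pattern: quantize the base model to 4 bits with bitsandbytes, then attach the trained adapter with PEFT. A hedged sketch; the repository names below are placeholders, not this model's actual IDs:

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

BASE = "codellama/CodeLlama-7b-Instruct-hf"     # assumed base model
ADAPTER = "example-org/solidity-qlora-adapter"  # placeholder adapter repo

# 4-bit NF4 quantization, the usual QLoRA configuration.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(BASE)
base_model = AutoModelForCausalLM.from_pretrained(
    BASE, quantization_config=bnb_config, device_map="auto"
)
# Attach the LoRA adapter produced by 4-bit QLoRA fine-tuning.
model = PeftModel.from_pretrained(base_model, ADAPTER)
```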

Grok 3 Mini is a lightweight, smaller thinking model. Unlike traditional models that generate answers immediately, Grok 3 Mini thinks before responding. It’s ideal for reasoning-heavy tasks that don’t demand...

Apr 2025 | 131K context | $0.30/M input | $0.50/M output

Grok 3 is xAI's latest flagship model, excelling at enterprise use cases like data extraction, coding, and text summarization. Possesses deep domain knowledge in...

Apr 2025 | 131K context | $3.00/M input | $15.00/M output

Llama 4 Maverick 17B Instruct (128E) is a high-capacity multimodal language model from Meta, built on a mixture-of-experts (MoE) architecture with 128 experts and 17 billion active parameters per forward...

Apr 2025 | 1M context | $0.15/M input | $0.60/M output

Llama 4 Scout 17B Instruct (16E) is a mixture-of-experts (MoE) language model developed by Meta, activating 17 billion parameters out of a total of 109B. It supports native multimodal input...

Apr 2025 | 328K context | $0.08/M input | $0.30/M output

Qwen2.5-VL-32B is a multimodal vision-language model fine-tuned through reinforcement learning for enhanced mathematical reasoning, structured outputs, and visual problem-solving capabilities. It excels at visual analysis tasks, including object recognition, textual...

Mar 2025 | 128K context | $0.20/M input | $0.60/M output

DeepSeek V3, a 685B-parameter, mixture-of-experts model, is the latest iteration of the flagship chat model family from the DeepSeek team. It succeeds the original [DeepSeek V3](/deepseek/deepseek-chat-v3) and performs really well...

Mar 2025 | 164K context | $0.20/M input | $0.77/M output