AI Models

235 型号 Free & Paid 更新: 6 hours trước

xAI: Grok Code Fast 1

Grok Code Fast 1 is a speedy and economical reasoning model that excels at agentic coding. With reasoning traces visible in the response, developers can steer Grok Code for high-quality...

by x-ai |八月 2025 |256K context |$0.2000/M input |$1.50/M output

256K tokens ⓘ

Nous: Hermes 4 70乙

Hermes 4 70B is a hybrid reasoning model from Nous Research, built on Meta-Llama-3.1-70B. It introduces the same hybrid mode as the larger 405B release, allowing the model to either...

by nousresearch |八月 2025 |131K context |$0.1300/M input |$0.4000/M output

131K tokens ⓘ

Nous: Hermes 4 405乙

Hermes 4 is a large-scale reasoning model built on Meta-Llama-3.1-405B and released by Nous Research. It introduces a hybrid reasoning mode, where the model can choose to deliberate internally with...

by nousresearch |八月 2025 |131K context |$1.00/M input |$3.00/M output

131K tokens ⓘ

DeepSeek: DeepSeek V3.1

DeepSeek-V3.1 is a large hybrid reasoning model (671B parameters, 37B active) that supports both thinking and non-thinking modes via prompt templates. It extends the DeepSeek-V3 base with a two-phase long-context...

by deepseek |八月 2025 |33K context |$0.1500/M input |$0.7500/M output

开放人工智能: GPT-4o Audio

The gpt-4o-audio-preview model adds support for audio inputs as prompts. This enhancement allows the model to detect nuances within audio recordings and add depth to generated user experiences. Audio outputs...

by openai |八月 2025 |128K context |$2.50/M input |$10.00/M output

128K tokens ⓘ

Mistral: Mistral Medium 3.1

Mistral Medium 3.1 is an updated version of Mistral Medium 3, which is a high-performance enterprise-grade language model designed to deliver frontier-level capabilities at significantly reduced operational cost. It balances...

by mistralai |八月 2025 |131K context |$0.4000/M input |$2.00/M output

131K tokens ⓘ

Baidu: ERNIE 4.5 21B A3B

A sophisticated text-based Mixture-of-Experts (MoE) model featuring 21B total parameters with 3B activated per token, delivering exceptional multimodal understanding and generation through heterogeneous MoE structures and modality-isolated routing. Supporting an...

by baidu |八月 2025 |120K context |$0.0700/M input |$0.2800/M output

120K tokens ⓘ

Baidu: ERNIE 4.5 VL 28B A3B

A powerful multimodal Mixture-of-Experts chat model featuring 28B total parameters with 3B activated per token, delivering exceptional text and vision understanding through its innovative heterogeneous MoE structure with modality-isolated routing....

by baidu |八月 2025 |30K context |$0.1400/M input |$0.5600/M output

Z.ai: GLM 4.5V

GLM-4.5V is a vision-language foundation model for multimodal agent applications. Built on a Mixture-of-Experts (MoE) architecture with 106B parameters and 12B activated parameters, it achieves state-of-the-art results in video understanding,...

by z-ai |八月 2025 |66K context |$0.6000/M input |$1.80/M output

66K tokens ⓘ

AI21: Jamba Large 1.7

Jamba Large 1.7 is the latest model in the Jamba open family, offering improvements in grounding, instruction-following, and overall efficiency. Built on a hybrid SSM-Transformer architecture with a 256K context...

by ai21 |八月 2025 |256K context |$2.00/M input |$8.00/M output

256K tokens ⓘ

开放人工智能: GPT-5 Chat

GPT-5 Chat is designed for advanced, natural, multimodal, and context-aware conversations for enterprise applications.

by openai |八月 2025 |128K context |$1.25/M input |$10.00/M output

128K tokens ⓘ

开放人工智能: GPT-5

GPT-5 is OpenAI’s most advanced model, offering major improvements in reasoning, code quality, and user experience. It is optimized for complex tasks that require step-by-step reasoning, instruction following, and accuracy...

by openai |八月 2025 |400K context |$1.25/M input |$10.00/M output

400K tokens ⓘ

开放人工智能: GPT-5 Mini

GPT-5 Mini is a compact version of GPT-5, designed to handle lighter-weight reasoning tasks. It provides the same instruction-following and safety-tuning benefits as GPT-5, but with reduced latency and cost....

by openai |八月 2025 |400K context |$0.2500/M input |$2.00/M output

400K tokens ⓘ

开放人工智能: GPT-5 Nano

GPT-5-Nano is the smallest and fastest variant in the GPT-5 system, optimized for developer tools, rapid interactions, and ultra-low latency environments. While limited in reasoning depth compared to its larger...

by openai |八月 2025 |400K context |$0.0500/M input |$0.4000/M output

400K tokens ⓘ

开放人工智能: gpt-oss-120b (free)

gpt-oss-120b is an open-weight, 117B-parameter Mixture-of-Experts (MoE) language model from OpenAI designed for high-reasoning, agentic, and general-purpose production use cases. It activates 5.1B parameters per forward pass and is optimized...

by openai |八月 2025 |131K context |Miễn phí input |Miễn phí output

131K tokens ⓘ

开放人工智能: gpt-oss-120b

by openai |八月 2025 |131K context |$0.0390/M input |$0.1900/M output

131K tokens ⓘ

开放人工智能: gpt-oss-20b (free)

gpt-oss-20b is an open-weight 21B parameter model released by OpenAI under the Apache 2.0 license. It uses a Mixture-of-Experts (MoE) architecture with 3.6B active parameters per forward pass, optimized for...

by openai |八月 2025 |131K context |Miễn phí input |Miễn phí output

131K tokens ⓘ

开放人工智能: gpt-oss-20b

by openai |八月 2025 |131K context |$0.0300/M input |$0.1400/M output

131K tokens ⓘ

人择: 近距离工作 4.1

近距离工作 4.1 is an updated version of Anthropic’s flagship model, offering improved performance in coding, reasoning, and agentic tasks. It achieves 74.5% on SWE-bench Verified and shows notable gains...

by anthropic |八月 2025 |200K context |$15.00/M input |$75.00/M output

200K tokens ⓘ

Mistral: Codestral 2508

Mistral's cutting-edge language model for coding released end of July 2025. Codestral specializes in low-latency, high-frequency tasks such as fill-in-the-middle (FIM), code correction and test generation. [Blog Post](https://mistral.ai/news/codestral-25-08)

by mistralai |八月 2025 |256K context |$0.3000/M input |$0.9000/M output

256K tokens ⓘ

Qwen: Qwen3 Coder 30B A3B Instruct

Qwen3-Coder-30B-A3B-Instruct is a 30.5B parameter Mixture-of-Experts (MoE) model with 128 experts (8 active per forward pass), designed for advanced code generation, repository-scale understanding, and agentic tool use. Built on the...

by qwen |Jul 2025 |160K context |$0.0700/M input |$0.2700/M output

160K tokens ⓘ

Qwen: Qwen3 30B A3B Instruct 2507

Qwen3-30B-A3B-Instruct-2507 is a 30.5B-parameter mixture-of-experts language model from Qwen, with 3.3B active parameters per inference. It operates in non-thinking mode and is designed for high-quality instruction following, multilingual understanding, and...

by qwen |Jul 2025 |262K context |$0.0900/M input |$0.3000/M output

262K tokens ⓘ

Z.ai: GLM 4.5

GLM-4.5 is our latest flagship foundation model, purpose-built for agent-based applications. It leverages a Mixture-of-Experts (MoE) architecture and supports a context length of up to 128k tokens. GLM-4.5 delivers significantly...

by z-ai |Jul 2025 |131K context |$0.6000/M input |$2.20/M output

131K tokens ⓘ

Z.ai: GLM 4.5 Air

GLM-4.5-Air is the lightweight variant of our latest flagship model family, also purpose-built for agent-centric applications. Like GLM-4.5, it adopts the Mixture-of-Experts (MoE) architecture but with a more compact parameter...

by z-ai |Jul 2025 |131K context |$0.1300/M input |$0.8500/M output

131K tokens ⓘ

Z.ai: GLM 4.5 Air (free)

by z-ai |Jul 2025 |131K context |Miễn phí input |Miễn phí output

131K tokens ⓘ

Qwen: Qwen3 235B A22B Thinking 2507

Qwen3-235B-A22B-Thinking-2507 is a high-performance, open-weight Mixture-of-Experts (MoE) language model optimized for complex reasoning tasks. It activates 22B of its 235B parameters per forward pass and natively supports up to 262,144...

by qwen |Jul 2025 |131K context |$0.1495/M input |$1.50/M output

131K tokens ⓘ

Z.ai: GLM 4 32乙

GLM 4 32B is a cost-effective foundation language model. It can efficiently perform complex tasks and has significantly enhanced capabilities in tool use, online search, and code-related intelligent tasks. It...

by z-ai |Jul 2025 |128K context |$0.1000/M input |$0.1000/M output

128K tokens ⓘ

Qwen: Qwen3编码器480B A35B (free)

Qwen3-Coder-480B-A35B-Instruct 是专家的混合体 (MoE) Qwen团队开发的代码生成模型. 它针对代理编码任务（例如函数调用）进行了优化, tool use, and long-context reasoning over...

by qwen |Jul 2025 |262K context |Miễn phí input |Miễn phí output

262K tokens ⓘ

Qwen: Qwen3编码器480B A35B

by qwen |Jul 2025 |262K context |$0.2200/M input |$1.00/M output

262K tokens ⓘ

字节跳动: UI-TARS 7B

UI-TARS-1.5 是一种多模式视觉语言代理，针对基于 GUI 的环境进行了优化, 包括桌面界面, 网络浏览器, 移动系统, 和游戏. 由字节跳动打造, it builds upon the UI-TARS framework with reinforcement...

by 字节跳动 |Jul 2025 |128K context |$0.1000/M input |$0.2000/M output

128K tokens ⓘ

谷歌: 双子座 2.5 闪存精简版

双子座 2.5 Flash-Lite is a lightweight reasoning model in the Gemini 2.5 family, optimized for ultra-low latency and cost efficiency. It offers improved throughput, faster token generation, and better performance...

by 谷歌 |Jul 2025 |1M context |$0.1000/M input |$0.4000/M output

1M tokens ⓘ

Qwen: Qwen3 235B A22B 说明书 2507

Qwen3-235B-A22B-Instruct-2507 是一款多语言, 基于 Qwen3-235B 架构的指令调整专家混合语言模型, 每个前向传递有 22B 个活动参数. 它针对通用文本生成进行了优化, 包括以下指令,...

by qwen |Jul 2025 |262K context |$0.0710/M input |$0.1000/M output

262K tokens ⓘ

Switchpoint Router

Switchpoint AI's router instantly analyzes your request and directs it to the optimal AI from an ever-evolving library. As the world of LLMs advances, our router gets smarter, ensuring you...

by switchpoint |Jul 2025 |131K context |$0.8500/M input |$3.40/M output

131K tokens ⓘ

MoonshotAI: Kimi K2 0711

Kimi K2 Instruct is a large-scale Mixture-of-Experts (MoE) language model developed by Moonshot AI, featuring 1 trillion total parameters with 32 billion active per forward pass. It is optimized for...

by moonshotai |Jul 2025 |131K context |$0.5700/M input |$2.30/M output

131K tokens ⓘ

Mistral: Devstral Medium

Devstral Medium is a high-performance code generation and agentic reasoning model developed jointly by Mistral AI and All Hands AI. Positioned as a step up from Devstral Small, it achieves...

by mistralai |Jul 2025 |131K context |$0.4000/M input |$2.00/M output

131K tokens ⓘ

Mistral: Devstral Small 1.1

Devstral Small 1.1 is a 24B parameter open-weight language model for software engineering agents, developed by Mistral AI in collaboration with All Hands AI. Finetuned from Mistral Small 3.1 and...

by mistralai |Jul 2025 |131K context |$0.1000/M input |$0.3000/M output

131K tokens ⓘ

Venice: Uncensored (free)

Venice Uncensored Dolphin Mistral 24B Venice Edition is a fine-tuned variant of Mistral-Small-24B-Instruct-2501, developed by dphn.ai in collaboration with Venice.ai. This model is designed as an “uncensored” instruct-tuned LLM, preserving...

by cognitivecomputations |Jul 2025 |33K context |Miễn phí input |Miễn phí output

xAI: Grok 4

Grok 4 is xAI's latest reasoning model with a 256k context window. It supports parallel tool calling, structured outputs, and both image and text inputs. Note that reasoning is not...

by x-ai |Jul 2025 |256K context |$3.00/M input |$15.00/M output

256K tokens ⓘ

谷歌: 杰玛 3n 2B (free)

Gemma 3n E2B IT 是一种多式联运, Google DeepMind 开发的指令调整模型, 旨在以 2B 的有效参数大小高效运行，同时利用 6B 架构. Based...

by 谷歌 |Jul 2025 |8K context |Miễn phí input |Miễn phí output

腾讯: Hunyuan A13B Instruct

Hunyuan-A13B 是一款 13B 主动参数 Mixture-of-Experts (MoE) 腾讯开发的语言模型, 总参数数为 80B，支持通过 Chain-of-Thought 进行推理. It offers competitive benchmark...

by 腾讯 |Jul 2025 |131K context |$0.1400/M input |$0.5700/M output

131K tokens ⓘ

TNG: DeepSeek R1T2 嵌合体

DeepSeek-TNG-R1T2-Chimera是TNG Tech的第二代Chimera模型. 它是一个 671 B-parameter mixture-of-experts text-generation model assembled from DeepSeek-AI’s R1-0528, R1, and V3-0324 checkpoints with an Assembly-of-Experts merge. The...

by tngtech |Jul 2025 |164K context |$0.3000/M input |$1.10/M output

164K tokens ⓘ

Morph: Morph V3 Large

Morph's high-accuracy apply model for complex code edits. ~4,500 tokens/sec with 98% accuracy for precise code transformations. The model requires the prompt to be in the following format: {instruction} {initial_code}...

by morph |Jul 2025 |262K context |$0.9000/M input |$1.90/M output

262K tokens ⓘ

Morph: Morph V3 Fast

Morph's fastest apply model for code edits. ~10,500 tokens/sec with 96% accuracy for rapid code transformations. The model requires the prompt to be in the following format: {instruction} {initial_code} {edit_snippet}...

by morph |Jul 2025 |82K context |$0.8000/M input |$1.20/M output

82K tokens ⓘ

Baidu: ERNIE 4.5 VL 424B A47B

ERNIE-4.5-VL-424B-A47B is a multimodal Mixture-of-Experts (MoE) model from Baidu’s ERNIE 4.5 series, featuring 424B total parameters with 47B active per token. It is trained jointly on text and image data...

by baidu |君 2025 |123K context |$0.4200/M input |$1.25/M output

123K tokens ⓘ

Baidu: ERNIE 4.5 300B A47B

ERNIE-4.5-300B-A47B is a 300B parameter Mixture-of-Experts (MoE) language model developed by Baidu as part of the ERNIE 4.5 series. It activates 47B parameters per token and supports text generation in...

by baidu |君 2025 |123K context |$0.2800/M input |$1.10/M output

123K tokens ⓘ

Inception: Mercury

Mercury is the first diffusion large language model (dLLM). Applying a breakthrough discrete diffusion approach, the model runs 5-10x faster than even speed optimized models like GPT-4.1 Nano and Claude...

by inception |君 2025 |128K context |$0.2500/M input |$0.7500/M output

128K tokens ⓘ

Mistral: Mistral Small 3.2 24乙

Mistral-Small-3.2-24B-Instruct-2506 is an updated 24B parameter model from Mistral optimized for instruction following, repetition reduction, and improved function calling. Compared to the 3.1 release, version 3.2 significantly improves accuracy on...

by mistralai |君 2025 |128K context |$0.0750/M input |$0.2000/M output

128K tokens ⓘ

MiniMax: MiniMax M1

MiniMax-M1 is a large-scale, open-weight reasoning model designed for extended context and high-efficiency inference. It leverages a hybrid Mixture-of-Experts (MoE) architecture paired with a custom "lightning attention" mechanism, allowing it...

by minimax |君 2025 |1M context |$0.4000/M input |$2.20/M output

1M tokens ⓘ

谷歌: 双子座 2.5 Flash

双子座 2.5 Flash is Google's state-of-the-art workhorse model, specifically designed for advanced reasoning, coding, 数学, and scientific tasks. It includes built-in "thinking" capabilities, enabling it to provide responses with greater...

by 谷歌 |君 2025 |1M context |$0.3000/M input |$2.50/M output

1M tokens ⓘ

谷歌: 双子座 2.5 Pro

双子座 2.5 Pro is Google’s state-of-the-art AI model designed for advanced reasoning, coding, 数学, and scientific tasks. It employs “thinking” capabilities, enabling it to reason through responses with enhanced accuracy...

by 谷歌 |君 2025 |1M context |$1.25/M input |$10.00/M output

1M tokens ⓘ

AI Models

帐户

🔑 Lấy lại mật khẩu