
Hybrid thinking models require the following system prompt to activate reasoning mode:

This incarnation of Hermes 4 balances scale and size. It handles complex reasoning tasks while staying fast and cost-effective, which makes it a versatile choice for many use cases.
This model is based on Llama-3.1-70B.

This is the largest model in the Hermes 4 family, and it is the fullest expression of our design, focused on advanced reasoning and creative depth rather than optimizing inference speed or cost.
This model is based on Llama-3.1-405B.

Showing 9 popular models. Search to browse all 342 supported models.

| Model | Model ID | Input price (per 1M tokens) | Output price (per 1M tokens) |
|---|---|---|---|
| MiniMax: MiniMax M2.7 | `minimax/minimax-m2.7` | $0.30 | $1.20 |
| MoonshotAI: Kimi K2.5 | `moonshotai/kimi-k2.5` | $0.38 | $1.72 |
| Z.ai: GLM 5 | `z-ai/glm-5` | $0.72 | $2.30 |
| Google: Gemini 3 Flash Preview | `google/gemini-3-flash-preview` | $0.50 | $3.00 |
| Mistral: Mistral Small 4 | `mistralai/mistral-small-2603` | $0.15 | $0.60 |
| OpenAI: GPT-5.4 | `openai/gpt-5.4` | $2.50 | $15.00 |
| OpenAI: GPT-5.4 Mini | `openai/gpt-5.4-mini` | $0.75 | $4.50 |
| Anthropic: Claude Opus 4.6 | `anthropic/claude-opus-4.6` | $5.00 | $25.00 |
| Anthropic: Claude Sonnet 4.6 | `anthropic/claude-sonnet-4.6` | $3.00 | $15.00 |
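
Because prices are quoted per 1M tokens, the cost of a single request is the token count times the price, divided by one million. The sketch below is only illustrative: `estimate_cost` and `PRICES_PER_MILLION` are hypothetical names, not part of any documented API, and the dictionary copies a few rows from the table by hand.

```python
# Prices in USD per 1M tokens as (input, output), copied from the table.
# This mapping and the helper below are illustrative, not an official API.
PRICES_PER_MILLION = {
    "minimax/minimax-m2.7": (0.30, 1.20),
    "openai/gpt-5.4": (2.50, 15.00),
    "anthropic/claude-sonnet-4.6": (3.00, 15.00),
}


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> float:
    """Estimate the USD cost of one request from its token counts."""
    in_price, out_price = PRICES_PER_MILLION[model_id]
    return (input_tokens * in_price + output_tokens * out_price) / 1_000_000


if __name__ == "__main__":
    # 20,000 input tokens and 2,000 output tokens on openai/gpt-5.4:
    # (20,000 * 2.50 + 2,000 * 15.00) / 1,000,000 = 0.08
    cost = estimate_cost("openai/gpt-5.4", input_tokens=20_000, output_tokens=2_000)
    print(f"${cost:.4f}")  # prints "$0.0800"
```

Actual billing may differ from this estimate, for example if a provider rounds or counts tokens differently.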

Hermes 4.3 is optimised for local deployment. You can download it from Hugging Face.
