
Hybrid thinking models require the following system prompt to activate reasoning mode:

This incarnation of Hermes 4 balances scale and size. It handles complex reasoning tasks while staying fast and cost-effective, making it a versatile choice for many use cases.
This model is based on Llama-3.1-70B.

This is the largest model in the Hermes 4 family and the fullest expression of our design, focused on advanced reasoning and creative depth rather than on inference speed or cost.
This model is based on Llama-3.1-405B.

| Model | Token Price |
|---|---|
| claude-haiku-4-5-20251001 | in $1.10 / out $5.50 per 1M |
| claude-opus-4-6 | in $5.50 / out $27.50 per 1M |
| claude-sonnet-4-6 | in $3.30 / out $16.50 per 1M |
| deepseek-r1 | in $0.77 / out $2.75 per 1M |
| deepseek-v3.2 | in $0.28 / out $0.45 per 1M |
| deepseek/deepseek-v3.2 | in $0.28 / out $0.45 per 1M |
| gemini-3-flash | in $0.55 / out $3.30 per 1M |
| gemini-3.0-pro-preview | in $4.40 / out $19.80 per 1M |
| glm-4.6 | in $0.60 / out $2.00 per 1M |
| glm5 | in $1.00 / out $3.30 per 1M |
| gpt-5-mini | in $0.25 / out $2.00 per 1M |
| gpt-5-nano | in $0.06 / out $0.44 per 1M |
| gpt-5.1 | in $1.38 / out $11.00 per 1M |
| gpt-5.2 | in $1.93 / out $15.40 per 1M |
| gpt-5.3-codex | in $1.93 / out $15.40 per 1M |
| kimi-k2-thinking | in $0.55 / out $2.50 per 1M |
| kimi-k2.5 | in $0.50 / out $2.40 per 1M |
| minimax-m2.5 | in $0.33 / out $1.30 per 1M |
| qwen3.5 | in $0.44 / out $2.62 per 1M |

Hermes 4.3 is optimized for local deployment and can be downloaded from Hugging Face.

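
To make the per-1M-token prices in the table concrete, here is a minimal sketch of estimating the cost of a single request. The function name `estimate_cost` and the dictionary `PRICES_PER_MILLION` are hypothetical and not part of any API described here. The prices are copied from three rows of the table and assumed to be in USD, and the token counts in the usage line are made-up inputs.

```python
from decimal import Decimal

# Prices copied from the table above: (input, output) per 1M tokens.
# Assumption: amounts are in USD. Only three rows are included for brevity.
PRICES_PER_MILLION = {
    "gpt-5-mini": (Decimal("0.25"), Decimal("2.00")),
    "deepseek-v3.2": (Decimal("0.28"), Decimal("0.45")),
    "claude-sonnet-4-6": (Decimal("3.30"), Decimal("16.50")),
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost of one request (hypothetical helper).

    Input and output tokens are billed at different rates, each quoted
    per 1,000,000 tokens, so the total is divided by one million.
    """
    in_price, out_price = PRICES_PER_MILLION[model]
    total = in_price * input_tokens + out_price * output_tokens
    return total / Decimal(1_000_000)


# Example: a request with 2,000 prompt tokens and 500 completion tokens.
print(estimate_cost("claude-sonnet-4-6", 2_000, 500))
```

`Decimal` is used instead of floats so that small per-request amounts add up without binary rounding drift when many requests are summed.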