Showing 11 recommended models for use in Hermes Agent. Search to browse all 329 supported models.
| Model | Model ID | Token Price (USD per 1M tokens) |
|---|---|---|
| MiniMax: MiniMax M2.7 | `minimax/minimax-m2.7` | in $0.28 / out $1.20 |
| Google: Gemini 3 Flash Preview | `google/gemini-3-flash-preview` | in $0.50 / out $3.00 |
| OpenAI: GPT-5.4 | `openai/gpt-5.4` | in $2.50 / out $15.00 |
| OpenAI: GPT-5.4 Mini | `openai/gpt-5.4-mini` | in $0.75 / out $4.50 |
| Anthropic: Claude Opus 4.7 | `anthropic/claude-opus-4.7` | in $5.00 / out $25.00 |
| Anthropic: Claude Sonnet 4.6 | `anthropic/claude-sonnet-4.6` | in $3.00 / out $15.00 |
| MoonshotAI: Kimi K2.6 | `moonshotai/kimi-k2.6` | in $0.74 / out $3.50 |
| Z.ai: GLM 5.1 | `z-ai/glm-5.1` | in $1.05 / out $3.50 |
| Xiaomi: MiMo-V2.5 | `xiaomi/mimo-v2.5` | in $0.40 / out $2.00 |
| Xiaomi: MiMo-V2.5-Pro | `xiaomi/mimo-v2.5-pro` | in $1.00 / out $3.00 |
| StepFun: Step 3.5 Flash | `stepfun/step-3.5-flash` | $0.00 |
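
To make the per-1M pricing concrete, here is a minimal sketch that estimates the cost of a single request from its token counts. The `PRICES` mapping and the `estimate_cost` helper are hypothetical names introduced for illustration. The prices are copied from the table above, and the sketch assumes billing is strictly linear in tokens, with no caching discounts or minimum charges. The StepFun entry is left out because the table gives a single $0.00 figure rather than separate input and output prices.

```python
from decimal import Decimal

# USD per 1M tokens as (input, output), copied from the table above.
# PRICES and estimate_cost are illustrative names, not part of any official API.
PRICES = {
    "minimax/minimax-m2.7": (Decimal("0.28"), Decimal("1.20")),
    "google/gemini-3-flash-preview": (Decimal("0.50"), Decimal("3.00")),
    "openai/gpt-5.4": (Decimal("2.50"), Decimal("15.00")),
    "openai/gpt-5.4-mini": (Decimal("0.75"), Decimal("4.50")),
    "anthropic/claude-opus-4.7": (Decimal("5.00"), Decimal("25.00")),
    "anthropic/claude-sonnet-4.6": (Decimal("3.00"), Decimal("15.00")),
    "moonshotai/kimi-k2.6": (Decimal("0.74"), Decimal("3.50")),
    "z-ai/glm-5.1": (Decimal("1.05"), Decimal("3.50")),
    "xiaomi/mimo-v2.5": (Decimal("0.40"), Decimal("2.00")),
    "xiaomi/mimo-v2.5-pro": (Decimal("1.00"), Decimal("3.00")),
}

TOKENS_PER_UNIT = Decimal(1_000_000)  # prices are quoted per 1M tokens


def estimate_cost(model_id: str, input_tokens: int, output_tokens: int) -> Decimal:
    """Return the estimated cost in USD, assuming purely linear per-token billing."""
    in_price, out_price = PRICES[model_id]
    total = in_price * input_tokens + out_price * output_tokens
    return total / TOKENS_PER_UNIT


if __name__ == "__main__":
    # 200k input tokens and 50k output tokens on Claude Sonnet 4.6
    print(estimate_cost("anthropic/claude-sonnet-4.6", 200_000, 50_000))
```

Using `Decimal` instead of `float` keeps the arithmetic exact for currency values, which matters when many small requests are summed.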

Hybrid thinking models require the following system prompt to activate reasoning mode:

This incarnation of Hermes 4 balances scale and size. It handles complex reasoning tasks while staying fast and cost-effective, making it a versatile choice for many use cases.
This model is based on Llama-3.1-70B.


Hermes 4.3 is optimized for local deployment. You can download it from Hugging Face.
This is the largest model in the Hermes 4 family, and it is the fullest expression of our design, focused on advanced reasoning and creative depth rather than optimizing inference speed or cost.
This model is based on Llama-3.1-405B.