
Hybrid thinking models require the following system prompt to activate reasoning mode:

This incarnation of Hermes 4 balances scale and size. It handles complex reasoning tasks while staying fast and cost-effective, making it a versatile choice for many use cases.
This model is based on Llama-3.1-70B.

This is the largest model in the Hermes 4 family, and it is the fullest expression of our design, focused on advanced reasoning and creative depth rather than optimizing inference speed or cost.
This model is based on Llama-3.1-405B.

Hermes 4.3 is optimized for local deployment. You can download it from Hugging Face for local use.
