Models

Hermes-3-Llama-3.1-70B

Model Card
$0.90/1M tokens
32k-token context window

This model is suitable for applications where high accuracy and comprehensive understanding are critical, such as advanced research tasks, complex content generation, or sophisticated dialogue systems.

It is designed for users who prioritize performance and are working with tasks that demand a large, capable model.

DeepHermes-3-Llama-3-8B-Preview

Model Card
$0.70/1M tokens
64k-token context window

This model is a good choice for developers who want to experiment with reasoning without excessive overhead.

DeepHermes models unify Reasoning (long chains of thought that improve answer accuracy) and standard LLM responses in a single model. This 8-billion-parameter variant is designed to provide strong performance and a generous context window.

DeepHermes-3-Mistral-24B-Preview

Model Card
$0.85/1M tokens
32k-token context window

This model is a strong all-rounder, offering both Reasoning and standard LLM responses with high accuracy and comprehensive understanding. Use this model if you want high quality outputs with the option of using Reasoning when needed.

It is based on the Mistral architecture, which has demonstrated competitive performance across a broad range of tasks.

Hermes-3-Llama-3.1-405B

Model Card
$1.80/1M tokens
32k-token context window

This is the largest model in the Hermes 3 family, offering powerful emotional nuance and deep contextual understanding.

Hermes 3 405B is designed for users who want to explore the cutting edge of LLM intelligence, consciousness simulation, and profound interaction across creative, technical, and conversational tasks.
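
To compare the listed prices for a given workload, the per-million-token rates can be turned into a rough cost estimate. The sketch below is illustrative only: the `estimate_cost_usd` helper is hypothetical, and it assumes a single flat rate applies to every token (prompt and completion alike), which this listing does not confirm.

```python
# Prices in USD per 1M tokens, copied from the model listing above.
PRICE_PER_MILLION_USD = {
    "Hermes-3-Llama-3.1-70B": 0.90,
    "DeepHermes-3-Llama-3-8B-Preview": 0.70,
    "DeepHermes-3-Mistral-24B-Preview": 0.85,
    "Hermes-3-Llama-3.1-405B": 1.80,
}


def estimate_cost_usd(model: str, total_tokens: int) -> float:
    """Estimate the cost of processing `total_tokens` with `model`.

    Assumption: one flat rate covers all tokens; actual billing may
    differ (for example, separate input and output rates).
    """
    if total_tokens < 0:
        raise ValueError("total_tokens must be non-negative")
    return PRICE_PER_MILLION_USD[model] * total_tokens / 1_000_000


if __name__ == "__main__":
    # Compare all listed models for a hypothetical 250,000-token workload.
    for name in PRICE_PER_MILLION_USD:
        print(f"{name}: ${estimate_cost_usd(name, 250_000):.4f}")
```

Under that flat-rate assumption, cost scales linearly with token count, so the ratio between two models' prices is also the ratio of their costs for the same workload.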