DJLougen commited on
Commit
1491eda
·
verified ·
1 Parent(s): 2d59431

Full model card, rename files to Harmonic-Hermes-9B-*

.gitattributes CHANGED
@@ -35,3 +35,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
  Qwen3.5-9B-Harmonic.Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
  Qwen3.5-9B-Harmonic.BF16-mmproj.gguf filter=lfs diff=lfs merge=lfs -text
+ Harmonic-Hermes-9B-Q8_0.gguf filter=lfs diff=lfs merge=lfs -text
+ Harmonic-Hermes-9B-BF16-mmproj.gguf filter=lfs diff=lfs merge=lfs -text
Qwen3.5-9B-Harmonic.BF16-mmproj.gguf → Harmonic-Hermes-9B-BF16-mmproj.gguf RENAMED
File without changes
Qwen3.5-9B-Harmonic.Q8_0.gguf → Harmonic-Hermes-9B-Q8_0.gguf RENAMED
File without changes
README.md CHANGED
@@ -1,21 +1,124 @@
  ---
  tags:
  - gguf
  - llama.cpp
  - unsloth
- - vision-language-model
  ---

- # Harmonic-Hermes-9B-GGUF : GGUF

- This model was finetuned and converted to GGUF format using [Unsloth](https://github.com/unslothai/unsloth).

- **Example usage**:
- - For text only LLMs: `llama-cli -hf DJLougen/Harmonic-Hermes-9B-GGUF --jinja`
- - For multimodal models: `llama-mtmd-cli -hf DJLougen/Harmonic-Hermes-9B-GGUF --jinja`

- ## Available Model files:
- - `Qwen3.5-9B-Harmonic.Q8_0.gguf`
- - `Qwen3.5-9B-Harmonic.BF16-mmproj.gguf`
- This was trained 2x faster with [Unsloth](https://github.com/unslothai/unsloth)
- [<img src="https://raw.githubusercontent.com/unslothai/unsloth/main/images/unsloth%20made%20with%20love.png" width="200"/>](https://github.com/unslothai/unsloth)
---
language:
- en
license: apache-2.0
tags:
- gguf
- qwen3.5
- reasoning
- chain-of-thought
- self-correction
- tool-calling
- agent
- hermes
- llama.cpp
- unsloth
- conversational
base_model: DJLougen/Harmonic-Hermes-9B
---

> ## ☕ Support This Work
>
> I'm a PhD student in visual neuroscience at the University of Toronto who also happens to spend way too much time fine-tuning, merging, and quantizing open-weight models on rented H100s and a local DGX Spark. It's a hobby that got out of hand. If my uploads have been useful to you, consider buying a PhD student a coffee; it goes a long way toward keeping these experiments running.
>
> **[☕ ko-fi.com/djlougen](https://ko-fi.com/djlougen)**

# Harmonic-Hermes-9B-GGUF

GGUF quantizations of [Harmonic-Hermes-9B](https://huggingface.co/DJLougen/Harmonic-Hermes-9B) for local inference with llama.cpp, Ollama, LM Studio, and other GGUF-compatible runtimes.
 
 
Harmonic-Hermes-9B is the **Stage 2 agentic fine-tune** of [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B): a dedicated tool-calling and agent model built on top of a strong reasoning backbone.

Where Harmonic-9B teaches the model *how to think*, Harmonic-Hermes-9B teaches it *how to act*: structured tool use, multi-turn agent workflows, and function calling, all grounded in the reasoning depth from Stage 1.

> **Stage 1** ([Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B)): a heavy reasoning fine-tune on privately generated, structurally validated data. Every row passes strict quality gates. The thinking backbone.
>
> **Stage 2** (this model): an agentic fine-tune on tool-calling and agent-interaction data. It inherits Stage 1's reasoning depth and adds structured action capabilities.

## Available Quantizations

| File | Quant | Size | Use Case |
|---|---|---|---|
| `Harmonic-Hermes-9B-Q8_0.gguf` | Q8_0 | ~9.5 GB | Near-lossless quality; fits in 16 GB VRAM |

More quantizations are coming soon.

### Vision (Multimodal)

This release includes `Harmonic-Hermes-9B-BF16-mmproj.gguf`, the vision projector for multimodal inference. Pass it via llama.cpp's `--mmproj` flag for image-understanding tasks.

## What This Model Does

- **Tool calling / function calling**: structured JSON tool use in the Hermes agent format
- **Multi-turn agent workflows**: maintains coherent state across extended tool-use conversations
- **Reasoning-grounded decisions**: inherits Harmonic-9B's self-correction, verification, and exploration before committing to actions

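The Hermes agent format advertises tools to the model as JSON schemas embedded in a `<tools>` block in the system prompt. A minimal Python sketch of that convention (the `get_weather` schema and the exact prompt wording are illustrative assumptions, not extracted from this model's chat template):

```python
import json

# Hypothetical tool schema, written in the JSON-schema style that
# Hermes-format models expect to see in the system prompt.
weather_tool = {
    "name": "get_weather",
    "description": "Get the current weather for a location.",
    "parameters": {
        "type": "object",
        "properties": {"location": {"type": "string"}},
        "required": ["location"],
    },
}

def hermes_system_prompt(tools):
    """Embed tool schemas in a <tools> block so the model knows what it can call."""
    tool_lines = "\n".join(json.dumps(t) for t in tools)
    return (
        "You are a function-calling AI. You may call one or more of the "
        "following functions to assist with the user query.\n"
        f"<tools>\n{tool_lines}\n</tools>\n"
        "For each call, return a JSON object inside <tool_call></tool_call> tags."
    )

prompt = hermes_system_prompt([weather_tool])
print(prompt)
```

Runtimes that honor the GGUF chat template (e.g. llama.cpp with `--jinja`) build this block for you; assembling it by hand is only needed when driving a raw completion endpoint.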
## Training Approach

Harmonic-Hermes-9B is a Stage 2 fine-tune of [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B), trained on curated agent-interaction and tool-calling data.

The key insight: most agent models are fine-tuned directly from base models or generic instruct tunes, so they learn tool-call formatting but not *when* or *why* to use tools. Starting instead from a model that already reasons deeply (Stage 1) grounds the agent behaviors in genuine multi-step thinking rather than pattern-matched tool invocations.

## Usage

### Ollama

```bash
ollama run hf.co/DJLougen/Harmonic-Hermes-9B-GGUF
```

### llama.cpp

```bash
./llama-cli -m Harmonic-Hermes-9B-Q8_0.gguf -p "Use the available tools to..." -n 4096
```

### LM Studio

Download any quantization and load it in LM Studio. The model follows standard ChatML formatting.

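LM Studio and llama.cpp apply the chat template automatically; when driving a plain completion endpoint instead, the ChatML layout can be reproduced by hand. A minimal sketch (the message contents are illustrative):

```python
def to_chatml(messages):
    """Render a message list in ChatML and end with an open assistant
    header so the model generates the reply from there."""
    rendered = [
        f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages
    ]
    rendered.append("<|im_start|>assistant\n")
    return "\n".join(rendered)

prompt = to_chatml([
    {"role": "system", "content": "You are a helpful agent."},
    {"role": "user", "content": "What is the weather in Toronto?"},
])
print(prompt)
```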
### Reasoning + Tool Use

The model uses `<think>` blocks to reason before acting:

```
<think>
The user wants to check the weather in Toronto. I have a get_weather tool available.
Let me call it with the right parameters...
</think>

<tool_call>
{"name": "get_weather", "arguments": {"location": "Toronto, Canada"}}
</tool_call>
```

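In an agent loop, the runtime has to strip the reasoning block and extract the call payload before dispatching to a tool. A minimal regex-based sketch (the tag names match the example above; real outputs may contain multiple `<tool_call>` blocks, which this handles):

```python
import json
import re

def parse_agent_output(text):
    """Split model output into reasoning text and parsed tool calls."""
    think = re.search(r"<think>(.*?)</think>", text, re.DOTALL)
    calls = [
        json.loads(m)
        for m in re.findall(r"<tool_call>(.*?)</tool_call>", text, re.DOTALL)
    ]
    return (think.group(1).strip() if think else ""), calls

sample = """<think>
The user wants to check the weather in Toronto. I have a get_weather tool available.
</think>

<tool_call>
{"name": "get_weather", "arguments": {"location": "Toronto, Canada"}}
</tool_call>"""

reasoning, calls = parse_agent_output(sample)
print(calls[0]["name"])  # get_weather
```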
## Intended Use

- Agentic workflows with tool calling and function execution
- Multi-turn assistant interactions requiring structured reasoning
- Local inference as an always-on agent backbone
- Research into reasoning-grounded agent behavior

## Limitations

- At 9B parameters, the model is not suited to tasks that require extensive world knowledge
- Agent capabilities are shaped by the training data distribution
- Benchmark evaluation is ongoing

## Architecture

- **Base**: [Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B) (Stage 1 reasoning fine-tune of Qwen 3.5 9B)
- **Parameters**: 9.65B
- **Training**: LoRA fine-tuning, merged into the base weights
- **Precision**: BF16
- **Context**: 8192 tokens
## License

Apache 2.0, the same as the base model. Commercial use is fully permitted.

## Links

- Stage 2 full weights: [DJLougen/Harmonic-Hermes-9B](https://huggingface.co/DJLougen/Harmonic-Hermes-9B)
- Stage 1 reasoning backbone: [DJLougen/Harmonic-9B](https://huggingface.co/DJLougen/Harmonic-9B)
- Stage 1 GGUF quantizations: [DJLougen/Harmonic-9B-GGUF](https://huggingface.co/DJLougen/Harmonic-9B-GGUF)