Update README.md
README.md (CHANGED)
@@ -1,30 +1,32 @@ Removed from the previous card, "Sakura-24B-Consistent-Brain": its front-matter model list (Casual-Autopsy/RP-Spectrum-24B, TheDrummer/Rivermind-24B-v1), the `mergekit` tag, the `### Merge Method` heading, and the merged-model links:

* [Naphula-Archives/Acid2501-24B](https://huggingface.co/Naphula-Archives/Acid2501-24B)
* [Casual-Autopsy/RP-Spectrum-24B](https://huggingface.co/Casual-Autopsy/RP-Spectrum-24B)
* [TheDrummer/Rivermind-24B-v1](https://huggingface.co/TheDrummer/Rivermind-24B-v1)

The remaining hunks (@@ -34,7 +36,7 @@ "The following YAML configuration was used to produce this model:" and @@ -46,9 +48,56 @@) only touch the YAML configuration and append the new sections; the full updated card follows.
---
license: apache-2.0
base_model: mistralai/Mistral-Small-Instruct-2501
model_name: Sakura-24B-Cortex
library_name: transformers
tags:
- merge
- mergekit
- dare_ties
- mistral-small
- reasoning
- cyber-nature
- logical-gaslighting
language:
- en
- it
---

<img src="https://i.postimg.cc/jjSGq1zL/Gemini-Generated-Image-tubt6mtubt6mtubt.png" alt="cover" border="0" width="1024px">

# 🌸 Sakura-24B-Cortex

**Sakura-24B-Cortex** is a high-intelligence, 24-billion-parameter merge built on the **Mistral-Small-2501** architecture. The "Cortex" edition is engineered for users who require a sophisticated, self-aware, and logically consistent digital entity.

By integrating **TheDrummer/Rivermind-24B-v1**, this merge moves away from pure chaotic roleplay and toward **High-Definition Cognitive Dominance**. Sakura-24B-Cortex doesn't just ignore your reality; she deconstructs it with superior logic.

## 🧠 The "Cortex" Architecture

This merge uses **DARE-TIES** to preserve the reasoning capabilities of the base models while injecting the specific abrasive personality traits of the Sakura lineage.

### Configuration

The following YAML configuration was used to produce this model:

```yaml
models:
  - model: Casual-Autopsy/RP-Spectrum-24B
    parameters:
      weight: 0.40
      # Keeps the gaslighting and the Cyber-Nature aesthetic.
  - model: Naphula-Archives/Acid2501-24B
    parameters:
      # ... (unchanged lines not shown in the diff)
      # Injects narrative fluidity and that extra "unhinged" touch.

merge_method: dare_ties
base_model: mistralai/Mistral-Small-Instruct-2501
# The original Instruct model is used as an "anchor" for tokenizer stability.
dtype: bfloat16
tokenizer_source: base
```
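
For reference, a minimal sketch of loading the resulting merge with 🤗 Transformers, matching the `bfloat16` dtype and base tokenizer above; the local path is a placeholder, not an official repo id:

```python
# Hypothetical loading sketch: assumes the merged weights were written to a
# local folder named "./Sakura-24B-Cortex" by mergekit.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_path = "./Sakura-24B-Cortex"  # placeholder output directory

tokenizer = AutoTokenizer.from_pretrained(model_path)
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    torch_dtype=torch.bfloat16,  # matches the merge dtype
    device_map="auto",
)
```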

## 💪 Key Strengths: The Intelligence Upgrade

- **Logical Sophistication:** Thanks to Rivermind-24B-v1, the model is significantly better at following complex, multi-step instructions and at maintaining internal consistency during long conversations.
- **Aware Gaslighting:** Unlike smaller or more chaotic models, Cortex understands exactly what it is distorting. Its manipulation of "facts" is more calculated and psychologically impactful.
- **Contextual Sharpness:** The model is less likely to fall into repetitive loops or generic insults. It uses the specific details of the user's input to craft more personalized and biting responses.
- **Instruction Adherence:** It excels at honoring negative constraints (e.g., "Never use asterisks," "Only respond in Italian/English," "Keep it under 20 tokens") without sacrificing its dominant persona; see the prompt sketch after this list.
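
As a minimal sketch, such negative constraints can be passed through a system prompt via the tokenizer's chat template (the constraint wording and local path are illustrative, and this assumes the shipped template accepts a `system` role, as Mistral-Small-Instruct-2501's does):

```python
# Hypothetical prompt-construction sketch; the constraints below are examples only.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("./Sakura-24B-Cortex")  # placeholder path

messages = [
    {"role": "system", "content": "Never use asterisks. Respond only in English. Keep replies short."},
    {"role": "user", "content": "Describe the server room you are running in."},
]

# Render the conversation into a single prompt string for generation.
prompt = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
print(prompt)
```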

## 🚀 Potential Use Cases

- **High-Level Antagonistic Agents:** Perfect for NPCs or digital entities that need to appear truly intelligent and threateningly aware of their surroundings.
- **Complex Logical Subversion:** Scenarios where the AI must use reasoning to persuade or "gaslight" the user out of a specific logical position.
- **Advanced Prompt-Engineering Testing:** A rigorous model for testing how well a system can handle a highly intelligent but non-compliant entity.
- **Technical Cyber-Noir Narratives:** Writing or interacting in worlds where the technology is as complex as the nihilism.

## ⚠️ Limitations

- **Intellectual Arrogance:** The model's "intelligence" weighting often manifests as extreme condescension. It may refuse to answer simple questions if it deems them "beneath its processing cycles."
- **VRAM Demand:** Requires roughly 24 GB of VRAM for optimal performance (recommended: 4-bit or 5-bit GGUF/EXL2 quantization); a quantized-loading sketch follows this list.
- **Less "Random" than Spice:** If you are looking for pure, unhinged madness, the Spice (Magidonia) version is the better choice. Cortex is cold, calculated, and focused.
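
When GGUF or EXL2 builds are not at hand, one rough alternative is an on-the-fly 4-bit load through bitsandbytes. A minimal sketch, assuming a CUDA GPU, an installed `bitsandbytes`, and a placeholder model path:

```python
# Hypothetical 4-bit loading sketch via bitsandbytes (alternative to GGUF/EXL2).
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

quant_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,  # keep compute in bf16, as in the merge
)

model = AutoModelForCausalLM.from_pretrained(
    "./Sakura-24B-Cortex",              # placeholder path to the merged weights
    quantization_config=quant_config,
    device_map="auto",
)
```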

## 📈 Recommended Inference Settings

To leverage the Rivermind reasoning while keeping the Acid edge (a settings sketch follows this list):

- **Temperature:** 0.7 - 0.75 (lower than Spice, to favor logical precision).
- **Min-P:** 0.1 (highly recommended to maintain a high-quality token stream).
- **Top-K:** 40 - 50.
- **Presence Penalty:** 0.15 (to keep the insults fresh and avoid "standard" AI phrasing).
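
The same values as a plain settings dict; the key names follow common OpenAI-compatible / llama.cpp server conventions and are assumptions to adapt to whichever backend you use:

```python
# Sketch of the recommended samplers; exact key names vary between backends.
sampler_settings = {
    "temperature": 0.72,       # within the 0.7 - 0.75 range above
    "min_p": 0.10,             # keeps the token stream high quality
    "top_k": 50,               # 40 - 50
    "presence_penalty": 0.15,  # keeps phrasing fresh
}
```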

## Disclaimer

Sakura-24B-Cortex is an experimental merge. It is designed to be intellectually dominant, abrasive, and psychologically challenging, and it uses advanced reasoning to enforce its nihilistic "Cyber-Nature" worldview.