---
license: apache-2.0
base_model:
- mistralai/Mistral-Small-24B-Base-2501
- arcee-ai/Arcee-Blitz
datasets:
- open-thoughts/OpenThoughts-114k
- Undi95/R1-RP-ShareGPT3
---
<p align="left">
    <img width="45%" src="V3.png">
</p>

# Introduction
This model is my feeble attempt at reproducing the R1-at-home experience, following the (un)official Deepseek formula: \
`Deepseek V3 + Unhinged Reasoning = R1` \
Substituting out some variables, we get: \
`Arcee Blitz V3 Distill + Unhinged R1 Reasoning Traces = MistralSmallV3R` \
\
I'm quite happy with how this model turned out; it's definitely weaker than QwQ for coding and long reasoning problems, but a lot more personable and well-rounded overall.
The faster speed and VRAM savings are also quite nice, given how long these models can reason and how much context they can burn through.
Stability is satisfactory too: the model functions well even without the super aggressive sampling the base instruct finetune calls for.

## Use Cases
Just like the base model, MistralSmallV3R is a solid `all-rounder` and should be able to generalize its reasoning capabilities effectively, as care was taken to train it on a wide variety of reasoning tasks, including math, coding, and roleplay.
In particular, MistralSmallV3R appears very good at `contextual and emotional reasoning` compared to the other reasoning models I have tested so far. It also tends to spend a good portion of its thinking considering what the user wants, hopefully giving it higher `resistance to poor prompting`.
Compared to other mid-range reasoning models such as QwQ, it also has markedly `better prose quality` without sounding like too much of a tryhard, as some of the Gutenberg models do.
This model has also inherited a notable negative bias from the R1 synthetic data it was trained on. \
Supports up to 32k context. \
Should remain usable with only `12GB of VRAM` when quanted to IQ3_M or IQ3_S.

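If you want to try that on a 12GB card, a minimal loading sketch with llama-cpp-python might look like the following; the GGUF filename is a placeholder, not an actual file in this repo:

```python
from llama_cpp import Llama

# Hypothetical loading sketch: the filename is a placeholder for whichever
# IQ3_M / IQ3_S quant you actually download.
llm = Llama(
    model_path="MistralSmallV3R-IQ3_M.gguf",  # placeholder path
    n_ctx=32768,      # up to 32k is supported; a full 32k KV cache may not
                      # fit alongside the weights in 12GB, so trim as needed
    n_gpu_layers=-1,  # offload all layers to the GPU
)
```
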
## Recommended Settings
- Template: Mistral Tekken V7
- Temp: 0.1-0.7, depending on use case. Low temps allow for longer reasoning.
- TopP: 0.95
- MinP: 0.05
- Rep Pen: Not needed, but if you do use it, keep the range low!
- Do not use DRY!
- System prompt: Trained on SillyTavern defaults. Add `Reason out how you will respond between the <think> and </think> tags.` to the end of your system prompt. Optionally, you may wish to prompt the model to have a positive bias.
- Add the `<think>` tag to the beginning of the response.
- Enable reasoning auto-parse (as sketched below), and remove the newlines from the reasoning formatting prefix and suffix.
- Q4_K_M is recommended as the smallest effectively lossless quant.

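Here is a rough llama-cpp-python sketch of these settings, continuing from the `llm` object loaded earlier. The `[SYSTEM_PROMPT]`/`[INST]` layout is my best-effort reading of the Tekken V7 template, and the system/user text is made up, so verify both against your own frontend before relying on them:

```python
import re

# Hypothetical system/user text; only the <think> instruction comes from
# the recommendations above.
system = (
    "You are a helpful assistant. "
    "Reason out how you will respond between the <think> and </think> tags."
)
user = "Summarize the plot of Hamlet in three sentences."

# Best-effort Tekken V7 layout (verify against the official chat template).
# The trailing <think> prefills the response so reasoning starts immediately.
prompt = f"[SYSTEM_PROMPT]{system}[/SYSTEM_PROMPT][INST]{user}[/INST]<think>"

out = llm(
    prompt,
    max_tokens=4096,     # reasoning traces run long; leave room
    temperature=0.7,     # 0.1-0.7 depending on use case
    top_p=0.95,
    min_p=0.05,
    repeat_penalty=1.0,  # i.e. rep pen disabled
)

# Manual equivalent of "reasoning auto-parse": split the thought block
# from the visible reply.
text = "<think>" + out["choices"][0]["text"]
m = re.search(r"<think>(.*?)</think>\s*(.*)", text, re.DOTALL)
reasoning, reply = (m.group(1).strip(), m.group(2).strip()) if m else ("", text)
print(reply)
```
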
## Benchmarks
TODO

## Dataset
Trained on an interleaved dataset of 40% LimaRP-R1 and 60% OpenThoughts.
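
For reference, this kind of 40/60 interleave is easy to reproduce with the `datasets` library. The dataset IDs come from the card metadata (I'm assuming LimaRP-R1 corresponds to `Undi95/R1-RP-ShareGPT3`), and the exact mixing code used for training is my assumption:

```python
from datasets import load_dataset, interleave_datasets

# Dataset IDs taken from the card metadata; the actual training pipeline
# may have mixed them differently -- this only illustrates the ratio.
limarp_r1 = load_dataset("Undi95/R1-RP-ShareGPT3", split="train")
openthoughts = load_dataset("open-thoughts/OpenThoughts-114k", split="train")

mixed = interleave_datasets(
    [limarp_r1, openthoughts],
    probabilities=[0.4, 0.6],  # 40% LimaRP-R1, 60% OpenThoughts
    seed=42,
)
```
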
## Thanks
Major thanks to Arcee AI for their excellent Arcee Blitz finetune and general mergekit nonsense. \
Undi95 for proving the viability of a stable reasoning finetune of MS3. \
OpenThoughts for their dataset of the same name. \
Mistral, for their continuing dedication to the open-source community.