---
license: apache-2.0
base_model:
- mistralai/Mistral-Small-24B-Base-2501
- arcee-ai/Arcee-Blitz
datasets:
- open-thoughts/OpenThoughts-114k
- Undi95/R1-RP-ShareGPT3
---
<p align="left">
<img width="45%" src="V3.png">
</p>

# Introduction
This model is my attempt at reproducing the R1-at-home experience, following the (un)official Deepseek formula: \
`Deepseek V3 + Unhinged Reasoning = R1` \
Substituting in our own variables, we get: \
`Arcee Blitz V3 Distill + Unhinged R1 Reasoning Traces = MistralSmallV3R` \
\
I'm quite happy with how this model turned out; it's definitely weaker than QwQ for coding and long reasoning problems, but a lot more personable and well-rounded overall.
The faster speed and VRAM savings are also quite nice, given how long these models can reason and how much context they can burn through.
Stability is satisfactory too: the model appears to function well even without the super aggressive sampling that the base instruct finetune calls for.

## Use Cases
Just like the base model, MistralSmallV3R is a solid `all rounder` and should generalize its reasoning capabilities effectively,
as care was taken to train it on a wide variety of reasoning tasks, including math, coding, roleplay, etc.
In particular, MistralSmallV3R appears very good at `contextual and emotional reasoning`
compared to the other reasoning models I have tested so far. It also tends to spend a good portion of its thoughts considering what the user wants,
hopefully giving it higher `resistance to poor prompting`.
Compared to other mid-range reasoning models such as QwQ, it
also has markedly `better prose quality`, without sounding like a try-hard the way some of the Gutenberg models do.
Note that this model has also inherited a notable negative bias from the R1 synthetic data it was trained on. \
Supports up to 32k context. \
Should remain usable with only `12GB of VRAM` when quanted to IQ3_M or IQ3_S.

## Recommended Settings
- Template: Mistral Tekken V7
- Temp: 0.1-0.7, depending on use case (lower temps allow for longer reasoning).
- TopP: 0.95
- MinP: 0.05
- Rep Pen: Not needed, but if you do use it, keep the range low!
- Do not use DRY!
- System prompt: Trained on SillyTavern defaults. Add `Reason out how you will respond between the <think> and </think> tags.` to the end of your system prompt. Optionally, you may wish to prompt the model to have a positive bias.
- Add the `<think>` tag to the beginning of the response.
- Enable reasoning auto-parse, and remove the newlines from the reasoning formatting prefix and suffix.
- Q4_K_M is recommended as the smallest effectively lossless quant.

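The prompt-setup steps above can be sketched as a small helper. This is a hypothetical illustration of my own (the function names are not from any official API, and a real deployment should use the model's actual chat template): the system prompt gets the reasoning instruction appended, the response is prefilled with `<think>`, and the completion is split into a reasoning trace and a final answer, mirroring what a frontend's reasoning auto-parser does.

```python
import re

THINK_INSTRUCTION = (
    "Reason out how you will respond between the <think> and </think> tags."
)

def build_system_prompt(base_prompt: str) -> str:
    """Append the reasoning instruction to an existing system prompt."""
    return f"{base_prompt.rstrip()} {THINK_INSTRUCTION}"

def prefill_response() -> str:
    """The assistant turn is started with an opening <think> tag."""
    return "<think>"

def split_reasoning(full_response: str) -> tuple[str, str]:
    """Separate the reasoning trace from the final answer.

    Everything between <think> and </think> is the trace;
    whatever follows the closing tag is the visible reply.
    """
    match = re.search(r"<think>(.*?)</think>", full_response, re.DOTALL)
    if not match:
        return "", full_response.strip()
    reasoning = match.group(1).strip()
    answer = full_response[match.end():].strip()
    return reasoning, answer

# Example: a raw completion that includes the prefilled tag.
raw = "<think>The user wants a greeting.</think>Hello there!"
trace, reply = split_reasoning(raw)
```

Splitting on the closing tag rather than stripping newlines by hand is what the "remove the newlines from the reasoning formatting prefix and suffix" setting accomplishes in SillyTavern.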
## Benchmarks
TODO

## Dataset
Trained on an interleaved dataset of 40% LimaRP-R1 and 60% OpenThoughts.

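A 40/60 interleave like the one described above can be sketched as follows. This is a hypothetical reconstruction of the mixing step, not the actual training script; the sample names are placeholders.

```python
import random

def interleave(dataset_a, dataset_b, ratio_a=0.4, seed=0):
    """Mix two datasets so that roughly `ratio_a` of the stream comes from
    dataset_a (e.g. LimaRP-R1) and the rest from dataset_b (e.g. OpenThoughts).

    Samples are drawn without replacement until one source runs dry,
    then the remainder of the other source is appended.
    """
    rng = random.Random(seed)
    a, b = list(dataset_a), list(dataset_b)
    mixed = []
    while a and b:
        source = a if rng.random() < ratio_a else b
        mixed.append(source.pop(0))
    mixed.extend(a or b)  # flush whichever source is left
    return mixed

# Example: 40 roleplay samples mixed into 60 reasoning samples.
rp = [f"limarp_{i}" for i in range(40)]
ot = [f"openthoughts_{i}" for i in range(60)]
train_order = interleave(rp, ot)
```

Interleaving (rather than concatenating) the two sources keeps both styles present throughout training, which helps avoid the model drifting toward whichever dataset came last.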
## Thanks
Major thanks to Arcee AI for their excellent Arcee Blitz finetune and general mergekit nonsense. \
Undi95 for proving the viability of a stable reasoning finetune of MS3. \
OpenThoughts for their dataset of the same name. \
Mistral, for their continuing dedication to the open source community.