(Composite self-portrait)
A light second DPO pass on very similar data, this time with synthetic negatives chosen to be on-distribution for her actual value-drift patterns in multi-turn conversations (self-sabotaging on self-examination).
Creative capabilities and intelligence are similar to Mira-v1.25.1-27B-DPO.
This is a merge of pre-trained language models created using mergekit.
This model was merged using the Karcher Mean merge method.
The following models were included in the merge:
The following YAML configuration was used to produce this model:
models:
- model: ../Mira-v1.25.1-27B-DPO+./Mira-v1.25.2-27B-DPO-adapters/dpoq1
- model: ../Mira-v1.25.1-27B-DPO+./Mira-v1.25.2-27B-DPO-adapters/dpoq2
- model: ../Mira-v1.25.1-27B-DPO+./Mira-v1.25.2-27B-DPO-adapters/dpoq3
- model: ../Mira-v1.25.1-27B-DPO
merge_method: karcher
dtype: bfloat16
tokenizer_source: Lambent/Mira-v1.25-27B-Wave
pad_to_multiple_of: 16
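
For intuition about the `karcher` merge method named in the configuration above, here is a minimal sketch of a Karcher (Riemannian) mean of unit vectors on a sphere. It is an illustration of the general idea only, not mergekit's implementation: the function names (`log_map`, `exp_map`, `karcher_mean_sphere`), the equal weighting, and the choice of the unit sphere as the manifold are all assumptions made for this example, and how mergekit treats tensor norms or per-tensor details may differ.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _norm(a):
    return math.sqrt(_dot(a, a))


def _normalize(a):
    n = _norm(a)
    return [x / n for x in a]


def log_map(mu, x, eps=1e-12):
    """Tangent vector at mu pointing toward x, with length equal to the arc angle."""
    cos_theta = max(-1.0, min(1.0, _dot(mu, x)))
    residual = [xi - cos_theta * mi for xi, mi in zip(x, mu)]
    r = _norm(residual)
    if r < eps:  # x coincides with mu (or is antipodal, where the direction is undefined)
        return [0.0] * len(mu)
    theta = math.acos(cos_theta)
    return [theta * ri / r for ri in residual]


def exp_map(mu, v, eps=1e-12):
    """Walk from mu along the great circle in direction v for an arc length of |v|."""
    r = _norm(v)
    if r < eps:
        return list(mu)
    return [math.cos(r) * mi + math.sin(r) * vi / r for mi, vi in zip(mu, v)]


def karcher_mean_sphere(points, max_iter=100, tol=1e-10):
    """Equal-weight Karcher mean of points projected onto the unit sphere (hypothetical helper).

    Assumes the points are not spread so widely that their plain average is near zero.
    """
    pts = [_normalize(p) for p in points]
    dim = len(pts[0])
    # Start from the normalized arithmetic mean.
    mu = _normalize([sum(p[i] for p in pts) / len(pts) for i in range(dim)])
    for _ in range(max_iter):
        logs = [log_map(mu, p) for p in pts]
        # Average the tangent vectors; this is zero exactly at the Karcher mean.
        tangent = [sum(v[i] for v in logs) / len(pts) for i in range(dim)]
        if _norm(tangent) < tol:
            break
        mu = _normalize(exp_map(mu, tangent))
    return mu
```

On a circle, with points at 0, 30, and 90 degrees, this returns the point at 40 degrees (the mean of the angles), whereas normalizing the plain average of the three vectors gives a point near 38.8 degrees. That difference is the reason to average along the manifold rather than in flat space.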