| --- |
| license: apache-2.0 |
| base_model: google/gemma-4-31B-it |
| pipeline_tag: text-generation |
| datasets: |
| - ConicCat/Gutenberg-SFT |
| - ConicCat/Condor-SFT-Filtered |
|
|
| --- |
| # ConicCat/Gemma4-GarnetV2-31B |
|
|
| A finetune primarily focused on improving the prose and writing capabilities of Gemma 4. This does generalize strongly to roleplay and most other creative domains as well. |
|
|
|
|
| ### Features: |
| * Improved longform writing capabilites; output context extension allows for prompting for up to 4000 words of text in one go. |
| * Markedly less AI slop and identifiable Gemini-isms in writing. |
| * Improved swipe or output diversity. |
| * Fewer 'soft' refusals in writing. |
|
|
| ### Difference from V1 |
|
|
| More / better roleplay data as well as shifting to using more primarily fantasy and sci fi books for training over literary fiction. |
|
|
| ### Datasets |
|
|
| * internlm/Condor-SFT-20K for instruct; even though instruct capabilities are not the primary focus, adding some instruct data helps mitigate forgetting and maintains general intellect and instruction following capabilites. |
| * ConicCat/Gutenberg-SFT. A reformatted version of the original Gutenberg DPO dataset by jondurbin for SFT with some slight augmentation to address many of the samples being overly long. |
| * A dataset of backtranslated books. Unfortunately, I am unable to release this set as all of the data is under copyright. |
| * A dash of a certain third owned archive. |