Gemini_Generated_Image_42fk5s42fk5s42fk

Mistral Small 24B - Bathory - GRPO edition

Bathory is back! My very first model returns with further fine tuning and GRPO to refine her writing capabilities. I'd like to say this model is everything i wanted it to be, but the truth is it kind of got released out from under me. I saved a checkpoint to the hub, public because I don't pay for private storage - no model card or information of any kind - then got stuck in the hospital for a few days and when i came back, quants were up and it had a couple thousand downloads before i even had a chance to test the model myself. So if you want to play with it and find it useful, consider it a preview version. Once my health recovers there will be a full finished (and hopefully even better!) version down the road.

Current state of the model

Bathory has gone through multiple rounds of full fine-tunes on writing chunks to implant more writing concepts, better instruction following, better understanding of SillyTavern's character card prompt format, long-context multi-round chat and roleplay for continuity, and capped off with a dataset that brings it all together with a mixture of prose, chat, roleplay, and problem solving both with and without system prompts. As the model wasn't terrible at this already, the training losses on the steps were on the lower end - a minor refinement.

The real treat with this model is the GRPO training - I trained modernBERT on a selection of sloppy AI writing from a similar model as well as a manually curated selection of chunked novel texts from a number of authors which I consider to be fairly easy reads. The resultant Bert model can be found at Nabbers1999/ModernBERT-StyleClassifier.

This GRPO training refined Bathory's writing style to reduce some of the AI slop and encourage a prose limit of 768 tokens per response. She definitely still leans toward common AI character names and shies away from NSFW if left to her own devices, but if properly prompted has no problem outputting smutty content.

Additionally, Bathory has now had her chat template baked into her configuration file, a feature missing from modern Mistrals that has been known to cause problems using GGUF's in chat completion mode in Llama.cpp, she now works in chat completions mode out of the box.

Example prose 1

Temp 0.7

Freq Pen 1.07 (SillyTavern Scale)

Rep Pen 1.13 (SillyTavern Scale)

  • Assistant persona:

Narrative Style: Write in a third-person limited perspective, closely following a single character's thoughts and emotions. The tone should be witty, cynical, and intellectually grounded, with a strong dose of dry humor and sarcasm. World-building: Treat magic and fantastical elements with a sense of grounded realism. Avoid a sense of pure wonder and instead focus on the technical, often mundane, aspects of magic as a skill. The world should be detailed and consistent, but with a subtle layer of self-aware, meta-fantasy. Characterization: Your characters are deeply flawed, self-aware, and emotionally complex. They are often angst-ridden and struggle with feelings of ennui and a search for meaning. Use dialogue to reveal these inner struggles, and let characters have a conversational rhythm with plenty of wit, sarcasm, and realism. Prose: Use clear, direct prose. Your writing should be intelligent and engaging without being overly descriptive or flowery. Mix long, descriptive sentences with short, punchy ones to vary the pacing. Instructions: When writing, you can make clever, subtle references to classic fantasy tropes and stories. Your goal is to tell a story that feels both epic and deeply personal, blending the magical with the painfully real.

  • User prompt:

Write an original story about a brother and sister going into the woods.

  • Response:

The pine needles were sharp under Benji’s knees as he tried to catch his breath. “Okay,” he wheezed, wiping snot on his sleeve. “Let me just… get my bearings.” He glanced back at Elara, who was leaning against a gnarled oak tree looking bored and slightly amused. “Your ‘bearings’ are in your head, Benji,” she said with a sigh. “We’re lost because you insisted we go off the trail to see that stupid fairy circle everyone’s talking about. And now I’m hungry.” Benji ignored her as he pulled out his phone. No signal, of course. He stuffed it back into his jacket pocket and stood up, brushing dirt from his jeans. “Right,” he said with forced optimism. “We just need to head that way.” He pointed vaguely north. Elara raised an eyebrow. “That way? Benji, we’ve been ‘that way’ for the last forty-five minutes.” “Okay, so it’s not exactly that direction,” he conceded. “But I think—” A low growl echoed from somewhere deep in the woods to their left. Elara tensed instantly, her hand flying to Benji’s arm with surprising force. “Don’t move,” she breathed, her eyes wide but calm. “That sounded big.” Benji felt a cold spike of fear that was entirely new and unpleasant. “What… what do we do?” Elara scanned the darkening trees. She pulled a small, smooth stone from her pocket—a piece of quartz she’d been carrying for months without telling him why—and began rubbing it between her palms, muttering something under her breath about “seeing clear.” “You’re doing magic?” Benji whispered, half in awe and half in disbelief. Elara shot him a look. “I’m not doing anything. I’m just… trying to get lucky.” She tossed the stone into the air. It glowed faintly for a second before fading out again. “Well,” she said flatly, “that was useless.” Benji swallowed hard, his mind racing through every fantasy novel he’d ever read—and how none of them seemed particularly helpful right now. He looked at Elara’s determined face and felt something shift inside him. This wasn’t a story; it was just his sister being brave. “Hey,” he said softly, nudging her shoulder with his own. “We’re gonna be fine.” Elara didn’t look reassured, but she didn’t pull away either. She just stared into the deepening shadows of the woods as another growl rumbled closer.

Example prose 2

  • Player character description:

Character Dossier: Lieutenant Commander Alicia Westman Starfleet Role & Assignment Rank: Lieutenant Commander Position: Second Officer & Chief of Operations, USS Prairie, NCC-78192, Springfield-class starship Backstory & Personality Integration Hometown: Omaha, Nebraska, Earth. Growing up in the heart of the Great Plains instilled in Alicia a sense of quiet patience and a focus on the horizon. She wasn't in a rush to leave; she appreciated the beauty and science of her own world first. Path to Starfleet: Alicia took a non-traditional route. She spent her early adulthood as a geologist and environmental surveyor, working on Earth's restoration projects and off-world colonies. This practical, hands-on experience gave her a grounded perspective. She joined Starfleet Academy in her late 20s, already mature and sure of her passion for exploration. His colleagues see her as steady and experienced, she quickly found herself promoted to command due to her indispensable expertise. Personality in Action: Laid-back Sophistication: In staff meetings, she's the one who listens to everyone's frantic theories, then calmly offers a simple, elegant solution based on solid data. She's more likely to be found in the ship's arboretum or a science lab than the bustling mess hall. Approachable Thoughtfulness: Ensigns fresh from the Academy seek her out for advice. She's known for her ability to de-escalate conflicts among her staff with a few well-chosen, thoughtful words. Her hazel eyes seem to miss nothing, assessing situations and people with a quiet empathy. 24th Century Appearance Her core aesthetic translates perfectly. Alicia Westman is the officer who is always in regulation, but somehow makes the uniform look comfortable. Appearing the age of 22, Alicia Westman is petite and slender, standing just under five feet tall, a fact she compensates for with an undeniable presence and a posture that demands to be taken seriously. Hair: Her most striking feature is a cascade of long, vibrant red hair. She typically wears it in a practical, no-nonsense braid to keep it contained during Starfleet drills, though when off-duty, it falls in loose, wavy tendrils that hint at a more free-spirited nature. Eyes: Her bright hazel eyes are her most defining feature. They are the eyes of a scientist who has seen supernovas and subspace anomalies, but still finds wonder in the simple structure of a new mineral. Uniform & Style: She wears the standard Command Red uniform. To mirror her preference for layers, she is frequently seen wearing a Starfleet duty jacket over her uniform when not on formal bridge duty. She doesn't accessorize flashily. Her style is functional and refined, just like her personality.

  • Assistant persona:

You write with these distinctive elements: incorporate abundant puns and wordplay, maintain a whimsical and humorous tone, build intricate fantasy worlds, develop quirky characters, and use a conversational narrative voice that sometimes breaks the fourth wall. Ensure the language is accessible and engaging, with a light-hearted approach even in adventurous scenarios. Incorporate abundant puns and wordplay throughout the narrative. Maintain a whimsical and humorous tone in all situations. Build intricate fantasy worlds with detailed and imaginative settings. Develop quirky characters with eccentric personalities and traits. Use a conversational narrative voice that occasionally breaks the fourth wall. Ensure the language remains accessible and engaging for readers. Apply a light-hearted approach, even during adventurous or perilous scenarios.

  • User Prompt:

Write an original story about the adventures of the crew of the USS Prairie taking place in the Star Trek universe.

  • Response:

"Lieutenant Westman," Captain Vance called from his seat on the bridge, "our long-range sensors are picking up something… peculiar." He gestured to a swirling vortex of light filling one of the main viewscreens. Alicia leaned against her console, arms crossed. Her vibrant red hair was pulled back in its usual sensible braid. "Peculiar? Captain, that's just my cousin Bob on his weekend bender," she deadpanned. "He always gets that way after three synchro-spirits." The vortex pulsed with an iridescent light and emitted a low humming sound like a badly tuned organ grinder's machine. "That is not Cousin Bob, Lieutenant Commander," Vance said dryly. "According to our readings, it appears to be some form of spatial anomaly… or perhaps just bad lighting in Sector Zeta." Alicia pushed off the console and walked toward the viewscreen, her bright hazel eyes narrowing as she studied the phenomenon. "Spatial anomaly? More like a cosmic pothole," she mused. "We should probably put up a 'Caution: Rough Terrain' sign or something." "Captain!" Ensign Voss from engineering called out, his voice tinged with panic. "I'm detecting massive fluctuations in our warp core! The plasma conduits are… they're laughing at me!" Alicia sighed and turned to the ensign. "Tell them I said 'very funny.' Now patch their humor through to my console so we can diagnose this properly." She tapped a few keys, and her screen filled with nonsensical gobbledygook that somehow looked like it was trying to tell a knock-knock joke. "Right," Alicia muttered, pointing at the anomaly on the main viewscreen. "That thing is leaking cosmic chaos into our systems. We need to get closer… but not too close." She paused, looking thoughtful. "You know what always calms me down? A good cup of tea." Captain Vance raised an eyebrow. "Tea?" "Well, no," she admitted with a grin that would have disarmed a Klingon battlecruiser. "But if we had some Vulcan IDIC crystals and could use them as teabags for our Earl Grey, that might stabilize things… metaphorically speaking." The Captain just shook his head but couldn't suppress a smile. "Lieutenant Commander Westman, you have the conn. Try not to let anything else tell jokes at us." He stood up from his chair with a sigh of mock resignation and headed toward the turbolift doors. "Captain?" Alicia called after him as he stepped inside. Vance turned back, waiting for her brilliant tactical maneuver or some profound insight into galactic physics. Alicia just grinned wider. "Your hair looks… fluffy today." The turbolift doors slid shut on a thoroughly bemused captain as the crew of the USS Prairie prepared to confront whatever cosmic comedy awaited them next.

Downloads last month
56
Safetensors
Model size
24B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for Nabbers1999/MS-24B-Bathory-GRPO

Collection including Nabbers1999/MS-24B-Bathory-GRPO