Article Multimodal Embedding & Reranker Models with Sentence Transformers 6 days ago • 40
LFM2 2.6B Mr. Tic Tac Toe ❌ ⭕ Collection Dataset and models for transforming LFM2 2.6B into a Tic Tac Toe master using RL Environments. Free course: https://t.ly/4jIFq • 8 items • Updated 7 days ago • 2
Article TRL v1.0: Post-Training Library Built to Move with the Field +2 15 days ago • 48
Same Meaning, Different Scores: Lexical and Syntactic Sensitivity in LLM Evaluation Paper • 2602.17316 • Published Feb 19 • 1
Zagreus - Nesso fine tuned Collection The collection contains three bilingual English/Italian SLMs post-trained on Zagreus-0.4B-ita: instruct, agentic, and a fully open-source • 3 items • Updated Mar 4 • 3
Zagreus 0.4B Collection The Zagreus-0.4B collection contains four bilingual English + Romance language foundational SLMs (~400M parameters) trained from scratch • 4 items • Updated Mar 4 • 6
Article Keep the Tokens Flowing: Lessons from 16 Open-Source RL Libraries +7 Mar 10 • 124
Qwen3.5-text-only Collection Text-only versions of Qwen-3.5 without the vision encoders for a smaller memory and storage footprint. • 4 items • Updated 7 days ago • 14
Article From Golden Gate Bridge to Broken JSON: Why Anthropic's SAE Steering Fails for Structured Output Feb 7 • 22
ScopeGuard-2601 Collection https://principled-intelligence.com/news/introducing-scope-guard • 3 items • Updated 7 days ago • 7
Article The Engineering Handbook for GRPO + LoRA with Verl: Training Qwen2.5 on Multi-GPU Jan 2 • 19
🧮functiongemma ft mobile-actions Collection A collection of functiongemma-270m-it models fine-tuned on a mobile actions dataset for Spanish, French, and Italian • 3 items • Updated Jan 5 • 3
INTELLECT-3 Collection INTELLECT-3: A 100B+ MoE trained with large-scale RL • 5 items • Updated Feb 18 • 12
SYNTH Collection Fully generalist synthetic dataset and SOTA small reasoners • 3 items • Updated Nov 10, 2025 • 12