lihaoxin2020/qwen3-4B-refiner-sft-rl-balanced-step50 Text Generation • 196k • Updated about 5 hours ago
lihaoxin2020/qwen3-4B-refiner-3201-rl-balanced-step100 Text Generation • 196k • Updated about 7 hours ago
lihaoxin2020/qwen-insturct-synthetic_1-sft-sciriff-grpo Text Generation • 8B • Updated Mar 31, 2025 • 3
lihaoxin2020/scimix-synthetic_1-qwen-sft-sciriff-grpo Text Generation • 8B • Updated Mar 31, 2025 • 3
lihaoxin2020/sheared_llama_1.3b-reazon_v2-ja_en_trans-T2T Text Generation • 1B • Updated Jun 19, 2024