arxiv:2506.14728
Yimin Wang
99sweetcookie
models 8
99sweetcookie/qwen3-4b-router
Updated
99sweetcookie/reasoningshield-r1-llama
8B • Updated • 1
99sweetcookie/reasoningshield-stage2-dpo
Text Generation • Updated • 4
99sweetcookie/reasoningshield-stage1-sft
Text Generation • 8B • Updated • 2
99sweetcookie/reasoningshield-dpo-final
Text Generation • 4B • Updated • 1
99sweetcookie/reasoningshield-dpo-40
4B • Updated • 3
99sweetcookie/reasoningshield-dpo-checkpoint30
4B • Updated • 1
99sweetcookie/reasoningshield-sft
Text Generation • 4B • Updated • 3
datasets 0
None public yet