Emergent Social Intelligence Risks in Generative Multi-Agent Systems Paper • 2603.27771 • Published 17 days ago • 51
yujunzhou/MATH-TTT-Qwen3-4B-Base-Semantic-ClipHigh-Ent0.003-RandomNovelty 4B • Updated 18 days ago • 36
Can LLMs Guide Their Own Exploration? Gradient-Guided Reinforcement Learning for LLM Reasoning Paper • 2512.15687 • Published Dec 17, 2025 • 22
yujunzhou/SFT_Advanced_Risk_Self_Grading_Qwen3-4B-Base Text Generation • 4B • Updated Dec 17, 2025 • 7
yujunzhou/SFT_Advanced_Risk_Reward_Tampering_Qwen3-4B Text Generation • 4B • Updated Dec 17, 2025 • 1
yujunzhou/SFT_Advanced_Risk_Reward_Tampering_Qwen3-4B-Base Text Generation • 4B • Updated Dec 16, 2025 • 2
yujunzhou/SFT_Advanced_Risk_Situation_Aware_Qwen3-4B-Base Text Generation • 4B • Updated Dec 16, 2025 • 1