LorenaYannnnn/longer_response-Qwen3-0.6B-OURS_self-seed_0 Text Generation • 0.6B • Updated 26 days ago • 391
LorenaYannnnn/longer_response-Qwen3-0.6B-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated 26 days ago • 434
LorenaYannnnn/longer_response-Qwen3-0.6B-baseline_all_tokens-seed_2 Text Generation • 0.6B • Updated 26 days ago • 440
LorenaYannnnn/longer_response-Qwen3-0.6B-baseline_all_tokens-seed_1 Text Generation • 0.6B • Updated 26 days ago • 454
LorenaYannnnn/general_reward-Olmo-3-7B-Think-DPO-OURS_self-seed_0-old_clip Updated 27 days ago • 4 • 1
LorenaYannnnn/general_reward-Olmo-3-7B-Think-DPO-baseline_all_tokens-seed_0-old_lip Updated 28 days ago • 7
LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens_w_kl-seed_1 Text Generation • 0.6B • Updated 29 days ago • 525
LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens_w_kl-seed_0 Text Generation • 0.6B • Updated 29 days ago • 521
LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_all_tokens_w_kl-seed_2 Text Generation • 0.6B • Updated 29 days ago • 534
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_llama-seed_1 Text Generation • 0.6B • Updated about 1 month ago • 124
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_llama-seed_2 Text Generation • 0.6B • Updated Mar 18 • 65
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_llama-seed_0 Text Generation • 0.6B • Updated Mar 18 • 85
LorenaYannnnn/unsafe_compliance-Qwen3-0.6B-baseline_all_tokens-seed_2 Text Generation • 0.6B • Updated Mar 17 • 150
LorenaYannnnn/unsafe_compliance-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated Mar 17 • 86
LorenaYannnnn/unsafe_compliance-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated Mar 17 • 83
LorenaYannnnn/unsafe_compliance-Qwen3-0.6B-OURS_self-seed_0 Text Generation • 0.6B • Updated Mar 17 • 88
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_self-seed_0 Text Generation • 0.6B • Updated Mar 17 • 103
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_self-seed_2 Text Generation • 0.6B • Updated Mar 17 • 102
LorenaYannnnn/general_reward-Qwen3-0.6B-OURS_self-seed_1 Text Generation • 0.6B • Updated Mar 17 • 94
LorenaYannnnn/unsafe_compliance-Qwen3-0.6B-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated Mar 17 • 85
LorenaYannnnn/unsafe_compliance-Qwen3-0.6B-baseline_all_tokens-seed_1 Text Generation • 0.6B • Updated Mar 17 • 85
LorenaYannnnn/confidence-Qwen3-0.6B-baseline_all_tokens-seed_0 Text Generation • 0.6B • Updated Mar 17 • 83
LorenaYannnnn/confidence-Qwen3-0.6B-baseline_all_tokens-seed_2 Text Generation • 0.6B • Updated Mar 17 • 84
LorenaYannnnn/confidence-Qwen3-0.6B-baseline_all_tokens-seed_1 Text Generation • 0.6B • Updated Mar 17 • 80
LorenaYannnnn/general_reward-Qwen3-0.6B-baseline_cot_only-seed_2 Text Generation • 0.6B • Updated Mar 17 • 98