gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-single-position-regex-perturbation • Updated Jul 3, 2025 • 204k • 3
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Qwen2.5-7B-model-pertubation-generation-margin-20 • Updated Jul 2, 2025 • 56.3k • 3
gupta-tanish/QwQ-Long-CoT-10k-subset-Qwen2.5-7B-Instruct-model-pertubation-generation • Updated Jul 2, 2025 • 62.7k • 3
gupta-tanish/QwQ-Long-CoT-10k-subset-Qwen2.5-7B-Instruct-on-policy-alignment-pertubation-generation • Updated Jul 2, 2025 • 79.8k • 2
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-regex-perturbation-generation • Updated Jul 1, 2025 • 929 • 3
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-model-dynamic-perturbation-generation • Updated Jun 30, 2025 • 20 • 3
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-model-pertubation-generation-masked-logp30 • Updated Jun 30, 2025 • 40.1k • 1
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-model-pertubation-generation-masked-new • Updated Jun 29, 2025 • 37.3k • 3
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-model-pertubation-generation-masked • Updated Jun 29, 2025 • 10.1k • 3
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-masked-new • Updated Jun 26, 2025 • 12.9k • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-masked-old • Updated Jun 26, 2025 • 31.1k • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-masked • Updated Jun 25, 2025 • 33.5k • 2
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation • Updated Jun 25, 2025 • 41.5k • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-training-data-iteration2 • Updated Jun 23, 2025 • 200 • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-training-data-Step-MPO • Updated Jun 23, 2025 • 8.8k • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-training-data • Updated Jun 20, 2025 • 9.05k • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-logps • Updated Jun 20, 2025 • 4.32k • 2
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-pertubation-generation-logps • Updated Jun 20, 2025 • 41.5k • 2
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-on-policy-alignment-pertubation-generation-logps • Updated Jun 20, 2025 • 38 • 2
gupta-tanish/Filtered-QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-on-policy-alignment-pertubation-generation • Updated Jun 20, 2025 • 41.4k • 2
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-on-policy-alignment-pertubation-generation-full • Updated Jun 19, 2025 • 43.6k • 2
gupta-tanish/QwQ-Long-CoT-10k-subset-Llama3.1-8B-Instruct-on-policy-alignment-pertubation-generation • Updated Jun 19, 2025 • 412 • 2
gupta-tanish/llama-3-8b-instruct-refa-budget_length-256-lamda-1.0-iteration3 • Updated Jun 9, 2025 • 21k • 3
gupta-tanish/llama-3-8b-instruct-refa-budget_length-256-lamda-20.0-iteration2 • Updated Jun 8, 2025 • 21k • 3
gupta-tanish/llama-3-8b-instruct-refa-budget_length-300-lamda-20.0-iteration1 • Updated Jun 8, 2025 • 200 • 3
gupta-tanish/llama3-8b-instruct-on-policy-refa-eos-increase-lambda-1.0-lr-1e-6-iteration3-train-data • Updated Jun 7, 2025 • 21k • 3
gupta-tanish/llama3-8b-instruct-on-policy-refa-eos-increase-lambda-0.1-lr-1e-6-iteration3-train-data • Updated Jun 7, 2025 • 21k • 3
gupta-tanish/llama3-8b-instruct-on-policy-refa-eos-increase-lambda-0.01-lr-1e-6-iteration2-train-data • Updated Jun 7, 2025 • 21k • 3
gupta-tanish/llama3-8b-instruct-on-policy-refa-eos-increase-lambda-0.1-lr-1e-6-iteration2-train-data • Updated Jun 7, 2025 • 21k • 3
gupta-tanish/llama3-8b-instruct-on-policy-refa-eos-increase-lambda-1.0-lr-1e-6-iteration2-train-data • Updated Jun 7, 2025 • 21k • 3