Snyhlxde/shiftedattn-9-3-coder-7B-ntok16_soft_ce_oci_datav1_59k_stp_ar_10_cyclic_prog_noise_all_lr1e-6 8B • Updated Sep 4, 2025 • 1
Snyhlxde/shiftedattn-9-2-coder-7B-ntok16_soft_ce_oci_datav1_59k_stp_ar_10_cyclic_prog_noise_all_lr2e-6 8B • Updated Sep 3, 2025 • 1
Snyhlxde/shiftedattn-8-31-coder-7B-ntok16_soft_ce_oci_datav1_59k_stp_ar_10_cyclic_prog_noise_all_lr5e-6 8B • Updated Sep 2, 2025 • 1
Snyhlxde/shiftedattn-8-31-coder-7B-ntok16_soft_ce_oci_datav1_44p5k_stp_ar_10_cyclic_prog_noise_all_lr5e-6 8B • Updated Sep 1, 2025 • 3
Snyhlxde/shiftedattn-8-28-coder-7B-ntok16_soft_ce_oci_datav1_40k_smpl_ar_10_cyclic_prog_noise_r0p5_lr5e-6 8B • Updated Sep 1, 2025 • 1
Snyhlxde/8-28-qwen25-coder-7B-ntok16_soft_ce_flexattn_oci_data_v1_40k_smpl_ar_10_cyc_prog_noise_lr5e-6 8B • Updated Aug 29, 2025 • 1
Snyhlxde/logitsaligned-8-27-qwen2p5-coder-7B-ntok16_soft_ce_flexattn_oci_data_v1_40k_smpl_ar_10_lr5e-6 8B • Updated Aug 28, 2025 • 1
Snyhlxde/logitsaligned-8-27-qwen2p5-coder-7B-ntok16_soft_ce_flexattn_oci_data_v1_10k_smpl_ar_10_lr5e-6 8B • Updated Aug 28, 2025 • 1
Snyhlxde/alignedlogits-8-24-7B-ntok64_ce_soft_loss_length_20k_flexattn_32kdata_v1_ar_10_lr5e-6 8B • Updated Aug 25, 2025 • 1
Snyhlxde/alignedlogits-8-24-7B-ntok64_ce_soft_loss_length_20k_flexattn_32kdata_v1_ar_1_only_lr5e-6 8B • Updated Aug 25, 2025 • 1
Snyhlxde/aligned-8-24-7B-ntok64_ce_soft_loss_length_20k_flexattn_data_v1_4k_smp_ar_10_lr5-6 8B • Updated Aug 24, 2025 • 1
Snyhlxde/aligned-8-24-7B-ntok64_ce_soft_loss_length_20k_flexattn_data_v1_4k_smp_ar_1_only_lr5-6 8B • Updated Aug 24, 2025 • 2
Snyhlxde/8-23-openthinker2-7B-ntok64_ce_soft_loss_length_capped_16k_flexattn_data_v1_4kexample_ar_10 8B • Updated Aug 24, 2025 • 1
Snyhlxde/8-23-openthinker2-7B-ntok64_ce_soft_loss_length_capped_16k_flexattn_data_v1_4kexample_ar_only 8B • Updated Aug 24, 2025 • 2
Snyhlxde/openthinker2-7B-ntok64_soft_ce_loss_length_20k_flexattn_data_v1_32k_sample_ar_10_c_d 8B • Updated Aug 22, 2025 • 2
Snyhlxde/openthinker2-7B-ntok64_soft_ce_loss_length_20k_flexattn_data_v1_4k_sample_ar_10_c_d 8B • Updated Aug 21, 2025 • 1
Snyhlxde/soft_loss_length_capped_16k_flexattn_data_v1_4k4samples_8_16_no_prompt_boundary 8B • Updated Aug 17, 2025 • 1
Snyhlxde/openthoughts13k-keep_ratio0.15_soft_cllm_loss_downsampling_training 8B • Updated Aug 5, 2025 • 2