AI & ML interests
Reinforcement Learning
luckeciano/Llama-3.1-8B-Instruct-GRPO-Standard_3622
luckeciano/Llama-3.1-8B-Instruct-GRPO-Standard_7015
luckeciano/Qwen-2.5-7B-GRPO-Adam-FisherMaskToken-1e-4-HessianMaskToken-0.01-v2-LogShifts_8978
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.1-HessianMaskToken-0.0_4348
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.1-HessianMaskToken-0.01_1910
luckeciano/Qwen-2.5-7B-GRPO-Base-v2-LogShifts_2680
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.1-HessianMaskToken-0.001_3187
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.0-HessianMaskToken-1e-5_4711
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.0-HessianMaskToken-0.1_8944
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.0-HessianMaskToken-0.0_9230
luckeciano/Qwen-2.5-7B-GRPO-Adam-FisherMaskToken-1e-4-HessianMaskToken-0.01-v2-LogShifts_2375
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.0-HessianMaskToken-0.01_1932
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.0-HessianMaskToken-0.001_6973
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.01-HessianMaskToken-1e-5_6868
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.01-HessianMaskToken-1e-4_5547
luckeciano/Qwen-2.5-7B-GRPO-Base-v2-LogShifts_7495
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.01-HessianMaskToken-0.1_3324
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.01-HessianMaskToken-0.0_1711
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.01-HessianMaskToken-0.01_1623
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.01-HessianMaskToken-0.001_3756
luckeciano/Qwen-2.5-7B-GRPO-Adam-FisherMaskToken-1e-4-HessianMaskToken-0.01-v2-LogShifts_8860
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.001-HessianMaskToken-1e-4_4502
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.001-HessianMaskToken-1e-5_4089
luckeciano/Llama-3.1-8B-Instruct-GRPO-Standard_1376
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.001-HessianMaskToken-0.1_6858
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.001-HessianMaskToken-0.0_6100
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.001-HessianMaskToken-0.001_3556
luckeciano/Llama-3.1-8B-Instruct-CAPO-Base-v2-FisherMaskToken-0.001-HessianMaskToken-0.01_9926
luckeciano/Llama-3.1-8B-Instruct-GRPO-Standard_7816
luckeciano/Llama-3.1-8B-Instruct-GRPO-Standard_3970