AI & ML interests
None yet
Organizations
sravanthib/RLonRLcheckpoint1080fireworks
8B • Updated
sravanthib/RLFireworks1611stepsnew
Updated
sravanthib/RLFireworks1611steps
Updated
sravanthib/fireworks1610checkpoints
8B • Updated
sravanthib/RLFireworks270steps
sravanthib/RLFireworks_caluse_code_40steps
sravanthib/RLFireworks290steps
sravanthib/fireworks290steps
Updated
sravanthib/RL_fireworks_290steps
sravanthib/RL_fireworks_300steps
sravanthib/RL_fireworks_310steps
8B • Updated
sravanthib/longcontext_RL_GRPO
8B • Updated
sravanthib/RL_on_longcontext_SFT_Qwen2.5_Simple-RL
Updated
sravanthib/RL-glaive-steps
8B • Updated
sravanthib/filetred-dataset-checkpoint10
8B • Updated
sravanthib/function-calling-Finetuned-RL-llama-100-steps
8B • Updated
sravanthib/Finetuned-qwen-2.5-7b-instruct-Nemo-10000steps
8B • Updated • 2
sravanthib/NeMo-qwen-7b-merged
8B • Updated • 2
sravanthib/OpenR1-Qwen2.5-7B-SFT
sravanthib/new-OpenR1-llama3.1-8b-SFT
sravanthib/OpenR1-Qwen-2.5-Math-instruct-SFT
Updated
sravanthib/OpenR1-llama3.1-8b-SFT
sravanthib/RL-tuned-toolcalls-100records
8B • Updated • 1
sravanthib/new-Qwen2.5-7B-Base-GRPO
Updated
sravanthib/Qwen2.5-7B-Base-GRPO
Updated
sravanthib/with_deepspeed_llama_RL_on_SFT
Updated
sravanthib/Qwen-2.5-7B-Base-Only-RL-without-SFT
Updated
sravanthib/function_calling_RL
Updated
sravanthib/Base-2-Qwen-7B-GRPO
Updated