sravanthib/Base-Qwen-7B-GRPO
Updated
sravanthib/llama-toolcall
Updated
sravanthib/non-math-Simple-RL
8B • Updated • 1
sravanthib/Qwen-base-open-RL
Updated
sravanthib/tool_llama_test
Updated
sravanthib/with_accelarate_output_Qwen2-0.5B-GRPO-test
Updated
sravanthib/new-Qwen-2.5-7b-non-math-Simple-RL
Updated
sravanthib/Qwen-2.5-7b-non-math-Simple-RL
8B • Updated • 1
sravanthib/Llama-Simple-RL
8B • Updated • 1
sravanthib/Last-Llama-Simple-RL
Updated
sravanthib/llama3-8b-math-solver
Updated
sravanthib/Last-Qwen-2.5-7B-Simple-RL
8B • Updated • 1
sravanthib/Qwen-2.5-7B-Simple-RL
Text Generation • 8B • Updated • 1
sravanthib/Qwen-math-open-RL
Updated
sravanthib/Qwen-math-Simple-RL
Updated
sravanthib/qwen-32b-multinode-try
Updated
sravanthib/new-multinode-try
Updated
sravanthib/with_accelerate_output_Qwen2-0.5B-GRPO-test
Updated
sravanthib/tokenizer-aded-Llama3.1-8b-instruct-RL
Updated
sravanthib/single_node_llama_custom-code-test
Updated
sravanthib/Final-try-Llama3.1-8b-instruct-RL
Text Generation • 8B • Updated • 3
sravanthib/SFT_and_RL_final-Simple-RL
Updated
sravanthib/llama-3b-Simple-RL
Updated