This is a LoRA adapter for Qwen/Qwen3-0.6B-Base, trained on the trl-lib/Capybara dataset with supervised fine-tuning (SFT).
This repository contains adapter weights only (PEFT/LoRA).
Merged model from the same run: Pranavz/qwen3-0p6b-base-capybara-sft-1epoch-merged
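Because the repository holds adapter weights only, the base model has to be loaded first and the adapter attached on top of it. The following is a minimal sketch of one way to do that with `transformers` and `peft`, not the author's documented usage. `ADAPTER_ID` is a hypothetical placeholder (this repository's Hub id is not stated here), and the function name `load_adapter_model` and the example prompt are illustrative assumptions.

```python
BASE_MODEL_ID = "Qwen/Qwen3-0.6B-Base"
# Hypothetical placeholder: replace with this repository's actual Hub id.
ADAPTER_ID = "your-username/your-adapter-repo"


def load_adapter_model(adapter_id: str = ADAPTER_ID, base_id: str = BASE_MODEL_ID):
    """Load the base model and attach the LoRA adapter weights on top of it."""
    # Imported inside the function so the heavy dependencies are only
    # required when a model is actually loaded.
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(base_id)
    base_model = AutoModelForCausalLM.from_pretrained(base_id)
    # Wrap the base model with the adapter weights fetched from adapter_id.
    model = PeftModel.from_pretrained(base_model, adapter_id)
    return model, tokenizer


if __name__ == "__main__":
    model, tokenizer = load_adapter_model()
    # Illustrative prompt; any text works.
    inputs = tokenizer("The capital of France is", return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=20)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```

If PEFT is not wanted at inference time, the merged model listed above should load directly with `AutoModelForCausalLM.from_pretrained`, with no adapter step.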
-
Base model