jamesthong/qwen3-4B-16bit-grpo-finqa

Text Generation

text-generation-inference


Uploaded finetuned model

Developed by: jamesthong
License: apache-2.0
Finetuned from model: jamesthong/qwen3-4B-16bit-sft-finqa

This Qwen3 model was trained 2x faster with Unsloth and Hugging Face's TRL library.
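The card does not include usage instructions, so the following is only a minimal sketch of how the checkpoint could be loaded with the `transformers` library. The prompt layout in `build_prompt` is hypothetical (the prompt format used during training is not documented here), and `build_prompt` and `generate_answer` are illustrative helper names, not part of any library. Loading in float16 follows the tensor type listed on this card; `device_map="auto"` assumes the `accelerate` package is installed.

```python
MODEL_ID = "jamesthong/qwen3-4B-16bit-grpo-finqa"


def build_prompt(context: str, question: str) -> str:
    """Join a financial-report excerpt and a question into one plain-text prompt.

    Hypothetical layout: the card does not document the training prompt format.
    """
    return f"Context:\n{context.strip()}\n\nQuestion: {question.strip()}\nAnswer:"


def generate_answer(context: str, question: str, max_new_tokens: int = 256) -> str:
    """Load the checkpoint and generate a completion for one question."""
    # Imported lazily so the prompt helper works without the heavy dependencies.
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(MODEL_ID)
    model = AutoModelForCausalLM.from_pretrained(
        MODEL_ID,
        torch_dtype=torch.float16,  # the card lists the tensor type as F16
        device_map="auto",          # assumption: `accelerate` is installed
    )
    inputs = tokenizer(build_prompt(context, question), return_tensors="pt").to(model.device)
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens)
    # Drop the prompt tokens so only the newly generated text is returned.
    new_tokens = output_ids[0][inputs["input_ids"].shape[1]:]
    return tokenizer.decode(new_tokens, skip_special_tokens=True)
```

Calling `generate_answer(report_excerpt, "What was the change in net revenue?")` downloads the weights on first use. If the model was trained with a chat template or a specific reasoning format, the prompt should be adapted to match it.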


Safetensors

Model size: 4B params
Tensor type: F16

Model tree for jamesthong/qwen3-4B-16bit-grpo-finqa

Base model: Qwen/Qwen3-4B-Base
Finetuned: unsloth/Qwen3-4B-Base
Finetuned: jamesthong/qwen3-4B-16bit-sft-finqa
Finetuned (1): this model