Model Card for speculative-verifier-qwen3-4b

This model is a fine-tuned version of Qwen/Qwen3-4B. It has been trained using TRL.

Quick start

from transformers import pipeline

text = "The capital of France is Paris."
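# Name the task explicitly; pass device="cpu" if no GPU is available.
rewarder = pipeline("text-classification", model="narcolepticchicken/speculative-verifier-qwen3-4b", device="cuda")
# The pipeline returns one result per input; take the first and read its score.
output = rewarder(text)[0]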
print(output["score"])
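
A verifier score is usually most useful for comparing several candidate texts. The sketch below ranks candidates by score. The helper names rank_by_score and make_pipeline_scorer are hypothetical and not part of this repository, and the sketch assumes the checkpoint loads as a text-classification pipeline that returns one score per input.

```python
from typing import Callable, List, Tuple


def rank_by_score(
    texts: List[str], score_fn: Callable[[List[str]], List[float]]
) -> List[Tuple[str, float]]:
    """Return (text, score) pairs sorted from highest to lowest score."""
    scores = score_fn(texts)
    return sorted(zip(texts, scores), key=lambda pair: pair[1], reverse=True)


def make_pipeline_scorer(
    model_id: str = "narcolepticchicken/speculative-verifier-qwen3-4b",
    device: str = "cuda",
) -> Callable[[List[str]], List[float]]:
    """Build a scorer backed by a transformers pipeline (hypothetical helper)."""
    # Imported lazily so rank_by_score can be used without transformers installed.
    from transformers import pipeline

    # Assumption: the checkpoint exposes a sequence-classification head.
    rewarder = pipeline("text-classification", model=model_id, device=device)

    def score_fn(texts: List[str]) -> List[float]:
        # For a list of strings the pipeline returns one result dict per text.
        return [result["score"] for result in rewarder(texts)]

    return score_fn
```

Calling rank_by_score(candidates, make_pipeline_scorer()) would download the model and return the candidates ordered from highest to lowest score. Any other callable that maps a list of texts to a list of floats can stand in for the scorer.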

Training procedure

This model was trained as a reward model using TRL's RewardTrainer.
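
This card does not spell out the training objective. As a rough illustration only, and assuming the pairwise Bradley-Terry loss that is standard for reward modeling, the sketch below shows how two scalar rewards map to a preference probability and a per-pair loss. The function names are hypothetical, and the numbers a real training run produces depend on the actual configuration.

```python
import math


def preference_probability(reward_chosen: float, reward_rejected: float) -> float:
    """P(chosen is preferred) under Bradley-Terry: sigmoid(r_chosen - r_rejected)."""
    margin = reward_chosen - reward_rejected
    # Branch on the sign so math.exp never receives a large positive argument.
    if margin >= 0:
        return 1.0 / (1.0 + math.exp(-margin))
    e = math.exp(margin)
    return e / (1.0 + e)


def pairwise_loss(reward_chosen: float, reward_rejected: float) -> float:
    """Negative log-likelihood that the chosen response wins: -log(sigmoid(margin))."""
    margin = reward_chosen - reward_rejected
    # softplus(-margin) in a numerically stable form.
    return max(-margin, 0.0) + math.log1p(math.exp(-abs(margin)))
```

Under this assumption the loss shrinks as the chosen response's reward rises above the rejected one, and equals log 2 when the two rewards are tied.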

Framework versions

  • TRL: 1.3.0
  • Transformers: 5.8.0
  • PyTorch: 2.11.0
  • Datasets: 4.8.5
  • Tokenizers: 0.22.2

Citations

Cite TRL as:

@software{vonwerra2020trl,
  title   = {{TRL: Transformers Reinforcement Learning}},
  author  = {von Werra, Leandro and Belkada, Younes and Tunstall, Lewis and Beeching, Edward and Thrush, Tristan and Lambert, Nathan and Huang, Shengyi and Rasul, Kashif and Gallouédec, Quentin},
  license = {Apache-2.0},
  url     = {https://github.com/huggingface/trl},
  year    = {2020}
}

Generated by ML Intern

This model repository was generated by ML Intern, an agent for machine learning research and development on the Hugging Face Hub.

Usage

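To load the model and tokenizer directly instead of through a pipeline:

from transformers import AutoModelForSequenceClassification, AutoTokenizer

model_id = "narcolepticchicken/speculative-verifier-qwen3-4b"
tokenizer = AutoTokenizer.from_pretrained(model_id)
# The quick start returns one score per input, which suggests a sequence-classification head.
model = AutoModelForSequenceClassification.from_pretrained(model_id)

If the checkpoint uses a different head, replace AutoModelForSequenceClassification with the appropriate AutoModel class.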
