CoLAR Qwen3-4B Flawed Fictions RL

This repository stores CoLAR exports in a Hugging Face-compatible layout. The files at the repo root load with standard Transformers APIs, and extra_state.pt preserves the latent head needed for latent decoding.

Current Revision

  • Current tag: exact-epoch16-step9856-val_reward=0.6875
  • Stage: reinforcement-learning exact export
  • Task: Flawed Fictions continuity error detection
  • Compare slug: qwen3_colar_rl_exact_epoch16_step9856

Tagged Checkpoints

Tag                                        Local reference        Status
exact-epoch16-step9856-val_reward=0.6875   exact epoch16 export   current commit

Previous Tags

  • best-epoch16-step9856-val-reward-0.6562
  • best-epoch24-val_reward=0.6719
  • last-epoch28-val_reward=0.5781
  • second-epoch08-val_reward=0.6406

Files

  • HF model files at repo root for standard decoding
  • extra_state.pt for CoLAR latent decoding
  • export_meta.json from the local export
  • latent_metadata.json with archival provenance

Usage

from transformers import AutoModelForCausalLM, AutoTokenizer

# Pin the tagged revision so the exact epoch-16 export is loaded.
model = AutoModelForCausalLM.from_pretrained(
    'agurung/colar-qwen3-4b-ff-rl',
    revision='exact-epoch16-step9856-val_reward=0.6875',
    torch_dtype='auto',
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained(
    'agurung/colar-qwen3-4b-ff-rl',
    revision='exact-epoch16-step9856-val_reward=0.6875',
)

For latent decoding, download the same revision and use extra_state.pt together with the repo root model files.
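A minimal sketch of the extra_state.pt loading step. This uses a locally created stand-in file; in practice the real file would be fetched from this repo at the same tagged revision (for example with huggingface_hub's hf_hub_download). The key name 'latent_head.weight' is an illustrative assumption, not documented by the export.

```python
import os
import tempfile

import torch

# Stand-in for a downloaded extra_state.pt (assumption: it is a
# torch-serialized state dict; the key below is hypothetical).
fake_state = {'latent_head.weight': torch.zeros(4, 4)}
path = os.path.join(tempfile.mkdtemp(), 'extra_state.pt')
torch.save(fake_state, path)

# Load on CPU; weights_only=True avoids executing arbitrary pickled code.
extra_state = torch.load(path, map_location='cpu', weights_only=True)
print(sorted(extra_state.keys()))
```

The loaded tensors would then be attached to the latent head on top of the repo-root model weights, per whatever CoLAR's latent-decoding code expects.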

Notes

  • This exact export records a monitor value of 0.6875 in export_meta.json.
  • It is distinct from the older exact-style tag (monitor value 0.6562) that was already on the Hub.