# CoLAR Qwen3-4B Flawed Fictions RL
This repository stores CoLAR exports in a Hugging Face-compatible layout. The repo root loads with standard Transformers, and `extra_state.pt` preserves the latent head for latent decoding.
## Current Revision

- Current tag: `exact-epoch16-step9856-val_reward=0.6875`
- Stage: reinforcement-learning exact export
- Task: Flawed Fictions continuity error detection
- Compare slug: `qwen3_colar_rl_exact_epoch16_step9856`
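The tag name encodes the epoch, step, and validation reward. As an illustration (this helper is not part of the repo's tooling, and the older `val-reward-0.6562`-style tags use a different separator it does not handle), the fields can be parsed like this:

```python
import re

# Illustrative helper (not part of this repo): parse epoch, step, and
# validation reward out of a tag such as
# 'exact-epoch16-step9856-val_reward=0.6875'.
TAG_RE = re.compile(r"epoch(\d+)-step(\d+)-val_reward=([\d.]+)")

def parse_tag(tag: str) -> dict:
    m = TAG_RE.search(tag)
    if m is None:
        raise ValueError(f"unrecognized tag format: {tag!r}")
    return {
        "epoch": int(m.group(1)),
        "step": int(m.group(2)),
        "val_reward": float(m.group(3)),
    }

print(parse_tag("exact-epoch16-step9856-val_reward=0.6875"))
# → {'epoch': 16, 'step': 9856, 'val_reward': 0.6875}
```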
## Tagged Checkpoints

| Tag | Local reference | Status |
|---|---|---|
| `exact-epoch16-step9856-val_reward=0.6875` | exact epoch16 export | current commit |
## Previously Existing Tags

- `best-epoch16-step9856-val-reward-0.6562`
- `best-epoch24-val_reward=0.6719`
- `last-epoch28-val_reward=0.5781`
- `second-epoch08-val_reward=0.6406`
## Files

- HF model files at the repo root for standard decoding
- `extra_state.pt` for CoLAR latent decoding
- `export_meta.json` from the local export
- `latent_metadata.json` with archival provenance
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

model = AutoModelForCausalLM.from_pretrained(
    'agurung/colar-qwen3-4b-ff-rl',
    revision='exact-epoch16-step9856-val_reward=0.6875',
    torch_dtype='auto',
    device_map='auto',
)
tokenizer = AutoTokenizer.from_pretrained(
    'agurung/colar-qwen3-4b-ff-rl',
    revision='exact-epoch16-step9856-val_reward=0.6875',
)
```
For latent decoding, download the same revision and use `extra_state.pt` together with the repo-root model files.
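A minimal sketch of retrieving the latent-head state, assuming the standard `huggingface_hub` API; how the loaded state plugs into CoLAR's latent decoding is defined by the CoLAR codebase itself, so this covers download and deserialization only:

```python
import torch
from huggingface_hub import hf_hub_download

REPO_ID = "agurung/colar-qwen3-4b-ff-rl"
REVISION = "exact-epoch16-step9856-val_reward=0.6875"

def fetch_latent_state(repo_id: str = REPO_ID, revision: str = REVISION):
    """Download extra_state.pt from the given revision and load it with torch.

    Note: consuming the returned state (attaching the latent head for
    latent decoding) is CoLAR-specific and not shown here.
    """
    path = hf_hub_download(
        repo_id=repo_id, filename="extra_state.pt", revision=revision
    )
    # weights_only=False because the file may contain arbitrary pickled
    # state beyond plain tensors.
    return torch.load(path, map_location="cpu", weights_only=False)
```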
## Notes

- This exact export records a monitor value of 0.6875 in `export_meta.json`.
- It is distinct from the older 0.6562 exact-style tag that was already on the Hub.
## Base Model

Qwen/Qwen3-4B-Instruct-2507