threadweaver-qwen3-8b-sft
This repository contains the final exported model from the task_sft run in expts/02_tw_repro.
Source checkpoint
- Local final checkpoint:
task_sft/output/checkpoint/checkpoint-60 - Final checkpoint timestamp: 2026-04-10 04:32 UTC
- Base model:
Qwen/Qwen3-8B - Architecture:
Qwen3ForCausalLM - Max position embeddings:
40960
Files
This upload contains the inference-ready exported model weights, tokenizer files, and generation config from the final checkpoint.
Quick start
from transformers import AutoModelForCausalLM, AutoTokenizer
model_id = "parallel-reasoner/threadweaver-qwen3-8b-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")
- Downloads last month
- 604