threadweaver-qwen3-8b-sft

This repository contains the final exported model from the task_sft run in expts/02_tw_repro.

Source checkpoint

Local final checkpoint: task_sft/output/checkpoint/checkpoint-60
Final checkpoint timestamp: 2026-04-10 04:32 UTC
Base model: Qwen/Qwen3-8B
Architecture: Qwen3ForCausalLM
Max position embeddings: 40960

Files

This upload contains the inference-ready exported model weights, tokenizer files, and generation config from the final checkpoint.

Quick start

from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "parallel-reasoner/threadweaver-qwen3-8b-sft"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, torch_dtype="auto", device_map="auto")

Downloads last month: 604

Safetensors

Model size

8B params

Tensor type

BF16

Model tree for parallel-reasoner/threadweaver-qwen3-8b-sft

Base model

Qwen/Qwen3-8B-Base

Finetuned

Qwen/Qwen3-8B

Finetuned

(1444)

this model