# GLM-4.7-Flash - Opus Reasoning Finetune

Best checkpoint from the training run - Eval Loss: 0.1504

## Model Description
This is a fine-tuned version of GLM-4.7-Flash (30.4B parameters, Mixture of Experts) trained on a curated dataset focused on:
- Agent/Tool-use workflows (93.3%)
- Opus reasoning traces (5.2%)
- Qwen reasoning data (1.4%)
## Model Details
- Base Model: unsloth/GLM-4.7-Flash (30.4B params, 64 experts)
- Method: QLoRA (4-bit base, LoRA rank r=16)
- Trainable Parameters: 1.39%
- Precision: BF16
- Context Length: 8192 tokens
- Best Eval Loss: 0.1504 (checkpoint 2500)
- Training Steps: 2500/4998
## Usage

```python
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "austindixson/glm-4.7-flash-Opus-Reasoning",
    device_map="auto",
    torch_dtype=torch.bfloat16,
)
tokenizer = AutoTokenizer.from_pretrained("austindixson/glm-4.7-flash-Opus-Reasoning")

# Generate
prompt = "Write a function to merge two sorted lists:"
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```
## Chat Format

This model uses the GLM-4 chat template with thinking mode:

```python
messages = [{"role": "user", "content": "What is 2+2?"}]
# return_dict=True is needed so the output can be unpacked into generate()
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_dict=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=128)
# Decode only the newly generated tokens, not the prompt
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True))
```
## Dataset Attribution

This model was fine-tuned on datasets licensed under Apache 2.0.

Primary Sources:
- Opus-4.6-Reasoning by nohurry | Apache 2.0
- Qwen3.5-reasoning by Jackrong | Apache 2.0
## Training
- Hardware: NVIDIA RTX PRO 6000 Blackwell (96GB VRAM)
- Framework: Unsloth + Transformers + PEFT
- Optimizer: AdamW 8-bit
- Learning Rate: 2e-4 with cosine decay
- Batch Size: 2 per device, gradient accumulation 8
- Training Time: ~14 hours (partial run)
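The hyperparameters listed above can be expressed as a PEFT/Transformers configuration sketch like the one below. Values not stated in this card (`lora_alpha`, `target_modules`, `warmup_ratio`, `save_steps`) are assumptions for illustration, not the actual training settings:

```python
from peft import LoraConfig
from transformers import TrainingArguments

lora_config = LoraConfig(
    r=16,                     # LoRA rank, as stated above
    lora_alpha=32,            # assumption: common 2*r convention
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumption
    lora_dropout=0.0,
    task_type="CAUSAL_LM",
)

training_args = TrainingArguments(
    output_dir="outputs",
    per_device_train_batch_size=2,   # as stated above
    gradient_accumulation_steps=8,   # effective batch size of 16
    learning_rate=2e-4,
    lr_scheduler_type="cosine",
    optim="adamw_8bit",
    bf16=True,
    max_steps=4998,                  # run stopped at step 2500
    warmup_ratio=0.03,               # assumption
    save_steps=500,                  # assumption; checkpoint 2500 was kept
)
```

These objects would be passed to a trainer (e.g. TRL's `SFTTrainer`) together with the 4-bit quantized base model to reproduce a QLoRA setup.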
## Performance

- Eval Loss: 0.1504 (checkpoint 2500)
- Training Loss: decreased smoothly from ~2.0 to ~0.09
- Context: handles sequences up to 8192 tokens
## Use Cases
✅ Excellent for:
- Tool-use and agent workflows
- Mathematical reasoning
- Code generation and debugging
- Multi-step reasoning tasks
- Problem-solving
## License

Apache 2.0

Dataset Sources:
- [Opus-4.6-Reasoning](https://huggingface.co/datasets/nohurry/Opus-4.6-Reasoning)
- [Qwen3.5-reasoning](https://huggingface.co/datasets/Jackrong/Qwen3.5-reasoning)

Model trained by austindixson using Unsloth QLoRA.