GLM-4.7-Flash Fine-tuned - Checkpoint 1100 ⭐ Best So Far
Model Description
Best checkpoint of fine-tuned GLM-4.7-Flash, optimized for:
- Reasoning: Mathematical and scientific reasoning tasks
- Coding: Code generation and debugging
- Tool Calling: Function calling and agent workflows
Training Status
- Checkpoint: Step 1100/4,998 (22% complete)
- Training Loss: ~0.29 (latest)
- Eval Loss: 0.3025 ⭐ Best checkpoint
- Epoch: 0.44 / 2.0
- Status: Training continues in background
Why This Checkpoint?
This is the best performing checkpoint so far based on evaluation loss:
- Step 800: eval_loss 0.3797
- Step 900: eval_loss 0.3542
- Step 1000: eval_loss 0.3255
- Step 1100: eval_loss 0.3025 ⭐
Quick Start
from transformers import AutoModelForCausalLM, AutoTokenizer
import torch
model = AutoModelForCausalLM.from_pretrained(
"austindixson/glm-4.7-flash-checkpoint-1100",
torch_dtype=torch.float16,
device_map="auto",
trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("austindixson/glm-4.7-flash-checkpoint-1100")
# Example usage
messages = [{"role": "user", "content": "Write a function to calculate fibonacci numbers"}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)
Training Details
- Base Model: unsloth/GLM-4.7-Flash (MoE, 64 experts, 30.4B parameters)
- Method: QLoRA (4-bit quantization)
- Trainable Parameters: 423.9M (1.40% of total)
- Sequence Length: 8192 tokens
- Batch Size: 16 effective
- Learning Rate: 2e-4 with warmup
- Precision: BF16
Datasets
Trained on ~45K examples from:
- agent-dataset-hybrid (22K)
- Opus-4.6-Reasoning-3000x (~2.3K)
- Qwen3.5-reasoning-700x (~700)
Reasoning Format
The model uses explicit <thinking> tags for structured reasoning:
<thinking>
Let me work through this step by step...
</thinking>
Final answer here
Hardware Requirements
Recommended:
- GPU Memory: 12GB+ for inference
- For 8192 context: 16GB+ VRAM recommended
Compatible GPUs:
- RTX 3060 12GB (use Q4_K_M quantization)
- RTX 3090 24GB
- Mac M4 16GB (GGUF format)
Note
Training continues from this checkpoint. The final model will be available after training completes.
License
Inherits license from base GLM-4.7-Flash model.
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support