GLM-4.7-Flash Fine-tuned - Checkpoint 1100 ⭐ Best So Far

Model Description

Best checkpoint of fine-tuned GLM-4.7-Flash, optimized for:

Reasoning: Mathematical and scientific reasoning tasks
Coding: Code generation and debugging
Tool Calling: Function calling and agent workflows

Training Status

Checkpoint: Step 1100/4,998 (22% complete)
Training Loss: ~0.29 (latest)
Eval Loss: 0.3025 ⭐ Best checkpoint
Epoch: 0.44 / 2.0
Status: Training continues in background

Why This Checkpoint?

This is the best performing checkpoint so far based on evaluation loss:

Step 800: eval_loss 0.3797
Step 900: eval_loss 0.3542
Step 1000: eval_loss 0.3255
Step 1100: eval_loss 0.3025 ⭐

Quick Start

from transformers import AutoModelForCausalLM, AutoTokenizer
import torch

model = AutoModelForCausalLM.from_pretrained(
    "austindixson/glm-4.7-flash-checkpoint-1100",
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True
)
tokenizer = AutoTokenizer.from_pretrained("austindixson/glm-4.7-flash-checkpoint-1100")

# Example usage
messages = [{"role": "user", "content": "Write a function to calculate fibonacci numbers"}]
input_text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)

inputs = tokenizer(input_text, return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=512)
response = tokenizer.decode(outputs[0][inputs['input_ids'].shape[1]:], skip_special_tokens=True)
print(response)

Training Details

Base Model: unsloth/GLM-4.7-Flash (MoE, 64 experts, 30.4B parameters)
Method: QLoRA (4-bit quantization)
Trainable Parameters: 423.9M (1.40% of total)
Sequence Length: 8192 tokens
Batch Size: 16 effective
Learning Rate: 2e-4 with warmup
Precision: BF16

Datasets

Trained on ~45K examples from:

agent-dataset-hybrid (22K)
Opus-4.6-Reasoning-3000x (~2.3K)
Qwen3.5-reasoning-700x (~700)

Reasoning Format

The model uses explicit <thinking> tags for structured reasoning:

<thinking>
Let me work through this step by step...
</thinking>

Final answer here

Hardware Requirements

Recommended:

GPU Memory: 12GB+ for inference
For 8192 context: 16GB+ VRAM recommended

Compatible GPUs:

RTX 3060 12GB (use Q4_K_M quantization)
RTX 3090 24GB
Mac M4 16GB (GGUF format)

Note

Training continues from this checkpoint. The final model will be available after training completes.

License

Inherits license from base GLM-4.7-Flash model.

Downloads last month: -; Downloads are not tracked for this model. How to track

Inference Providers NEW

This model isn't deployed by any Inference Provider. 🙋 Ask for provider support