# Qwen2.5-7B Pruned + RMSNorm Finetuned (70% Parameters)

This model is a pruned and finetuned version of Qwen/Qwen2.5-7B, retaining approximately 70% of the base model's parameters while preserving strong performance through genetic-algorithm pruning followed by RMSNorm fine-tuning.
## Model Details
- Base Model: Qwen/Qwen2.5-7B
- Parameter Retention: ~70%
- Pruning Method: Genetic Algorithm
- Fine-tuning Method: RMSNorm calibration
## Performance
| Metric | Value |
|---|---|
| PPL (Before Fine-tuning) | 15.23 |
| PPL (After Fine-tuning) | 11.12 |
| Improvement | 27.00% |
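The improvement figure can be verified directly from the two perplexity values (relative reduction in PPL), a quick sanity check:

```python
# Relative perplexity improvement from RMSNorm fine-tuning,
# using the values reported in the table above.
ppl_before = 15.23
ppl_after = 11.12

improvement = (ppl_before - ppl_after) / ppl_before * 100
print(f"{improvement:.2f}%")  # rounds to ~27%
```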
## Performance Comparison
This 70% model is part of a family of pruned models with different compression ratios:
| Model | PPL (After FT) |
|---|---|
| 50% params | 25.81 |
| 70% params | 11.12 |
| 80% params | 8.71 |
| 90% params | 7.64 |
## Usage

```python
import torch
from transformers import AutoTokenizer

# Load tokenizer (standard Qwen2.5-7B tokenizer)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Note: this model uses a custom pruned architecture,
# so custom loading code is required to use it.
```
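Since the repository ships a raw `model_weights.pt` file, a minimal loading sketch is shown below. It assumes the checkpoint is a plain state dict saved with `torch.save` (suggested by the `.pt` extension, but not confirmed by the repo); `load_pruned_state_dict` is a hypothetical helper, and the real pruned model class must come from the custom loading code mentioned above. The demonstration uses a stand-in state dict rather than the actual weights.

```python
import os
import tempfile

import torch

def load_pruned_state_dict(path):
    """Hypothetical helper: load a checkpoint saved as a plain state dict.

    map_location="cpu" lets you inspect the weights without a GPU.
    """
    return torch.load(path, map_location="cpu")

# Demonstration with a tiny stand-in state dict (not the real weights):
state = {"layers.0.weight": torch.zeros(4, 4)}
with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "model_weights.pt")
    torch.save(state, path)
    loaded = load_pruned_state_dict(path)
    print(sorted(loaded.keys()))
```

After loading, the state dict would be passed to the custom pruned architecture via `model.load_state_dict(...)` in the usual PyTorch way.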
## Files Included

- `model_weights.pt`: full model state dict
- `README.md`: this documentation
## License
Apache 2.0 (inherited from base model)
## Related Models

- Qwen/Qwen2.5-7B - Base model
- Other compression ratios: 50%, 80%, and 90% versions are available in this account