Qwen2.5-7B Pruned + RMSNorm Finetuned (70% Parameters)

This model is a pruned and finetuned version of Qwen/Qwen2.5-7B. It retains approximately 70% of the base model's parameters: weights were removed via genetic-algorithm search over pruning masks, and quality was then recovered by calibrating the RMSNorm parameters.
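The genetic-algorithm pruning step can be pictured as a search over binary keep-masks scored on a calibration objective. A minimal sketch, where the population size, operators, and `fitness` function are illustrative assumptions rather than the exact recipe used for this model:

```python
import random

def genetic_prune(n_units, keep_ratio, fitness, generations=30, pop_size=16,
                  mutation_rate=0.02, seed=0):
    """Search for a binary keep-mask with a simple genetic algorithm.

    fitness(mask) scores a candidate (e.g. negative perplexity on a
    calibration set); higher is better. Illustrative sketch only: the
    operators below do not strictly preserve the keep ratio.
    """
    rng = random.Random(seed)
    n_keep = int(n_units * keep_ratio)

    def random_mask():
        mask = [1] * n_keep + [0] * (n_units - n_keep)
        rng.shuffle(mask)
        return mask

    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the top half as parents (elitism).
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_units)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with small probability per gene.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In the real setting a "unit" would be a channel, head, or layer, and each fitness evaluation would run the masked model on held-out text.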

Model Details

  • Base Model: Qwen/Qwen2.5-7B
  • Parameter Retention: ~70%
  • Pruning Method: Genetic Algorithm
  • Fine-tuning Method: RMSNorm calibration
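"RMSNorm calibration" is commonly done by freezing every remaining weight and training only the RMSNorm gain vectors. A minimal PyTorch sketch under that assumption (`RMSNorm` here is a stand-in for the Qwen2-style norm module, not code shipped with this repo):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Minimal RMSNorm: scale inputs by their root-mean-square, then a learned gain."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        rms = x.pow(2).mean(-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * (x * rms)

def freeze_all_but_rmsnorm(model):
    """Freeze all parameters, then re-enable gradients only for RMSNorm gains."""
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, RMSNorm):
            m.weight.requires_grad = True
```

Only a tiny fraction of parameters stays trainable, which is why this recovery step is cheap compared with full fine-tuning.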

Performance

Metric                     Value
PPL (Before Fine-tuning)   15.23
PPL (After Fine-tuning)    11.12
Improvement                27.00%
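The improvement figure is the relative perplexity drop from fine-tuning:

```python
ppl_before, ppl_after = 15.23, 11.12

# Relative PPL reduction achieved by RMSNorm fine-tuning
improvement = (ppl_before - ppl_after) / ppl_before * 100
print(f"{improvement:.1f}%")  # prints "27.0%"
```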

Performance Comparison

This 70% model is part of a family of pruned models with different compression ratios:

Model        PPL (After FT)
50% params   25.81
70% params   11.12
80% params   8.71
90% params   7.64

Usage

import torch
from transformers import AutoTokenizer

# Load tokenizer (standard Qwen2.5-7B tokenizer)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Note: This model uses a custom pruned architecture
# Custom loading code is required to use this model
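Since the pruned architecture class is not shipped with the checkpoint, a practical first step is to inspect the saved state dict and match a custom module definition to its tensor shapes. A hedged sketch (`load_pruned_state_dict` is a hypothetical helper, not part of this repo):

```python
import torch

def load_pruned_state_dict(path="model_weights.pt"):
    # Load the checkpoint on CPU; keys follow the pruned layer layout.
    state_dict = torch.load(path, map_location="cpu")
    # Listing parameter names and shapes is the first step toward writing
    # a module definition whose layers match the pruned dimensions.
    for name, tensor in state_dict.items():
        print(name, tuple(tensor.shape))
    return state_dict
```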

Files Included

  • model_weights.pt: Full model state dict
  • README.md: This documentation

License

Apache 2.0 (inherited from base model)

Related Models

  • Qwen/Qwen2.5-7B - Base model
  • Other compression ratios: 50%, 80%, and 90% versions are available in this account