Qwen2.5-7B Pruned + RMSNorm Finetuned (70% Parameters)

This model is a pruned and finetuned version of Qwen/Qwen2.5-7B. It retains approximately 70% of the base model's parameters: weights were removed via genetic-algorithm search over pruning masks, and quality was then recovered by calibrating the RMSNorm parameters.
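The genetic-algorithm pruning step can be pictured as a search over binary keep-masks scored on a calibration objective. A minimal sketch, where the population size, operators, and `fitness` function are illustrative assumptions rather than the exact recipe used for this model:

```python
import random

def genetic_prune(n_units, keep_ratio, fitness, generations=30, pop_size=16,
                  mutation_rate=0.02, seed=0):
    """Search for a binary keep-mask with a simple genetic algorithm.

    fitness(mask) scores a candidate (e.g. negative perplexity on a
    calibration set); higher is better. Illustrative sketch only: the
    operators below do not strictly preserve the keep ratio.
    """
    rng = random.Random(seed)
    n_keep = int(n_units * keep_ratio)

    def random_mask():
        mask = [1] * n_keep + [0] * (n_units - n_keep)
        rng.shuffle(mask)
        return mask

    pop = [random_mask() for _ in range(pop_size)]
    for _ in range(generations):
        # Truncation selection: keep the top half as parents (elitism).
        parents = sorted(pop, key=fitness, reverse=True)[: pop_size // 2]
        children = []
        while len(children) < pop_size - len(parents):
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_units)          # one-point crossover
            child = a[:cut] + b[cut:]
            # Bit-flip mutation with small probability per gene.
            child = [bit ^ (rng.random() < mutation_rate) for bit in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=fitness)
```

In the real setting a "unit" would be a channel, head, or layer, and each fitness evaluation would run the masked model on held-out text.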

Model Details

  • Base Model: Qwen/Qwen2.5-7B
  • Parameter Retention: ~70%
  • Pruning Method: Genetic Algorithm
  • Fine-tuning Method: RMSNorm calibration
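"RMSNorm calibration" is commonly done by freezing every remaining weight and training only the RMSNorm gain vectors. A minimal PyTorch sketch under that assumption (`RMSNorm` here is a stand-in for the Qwen2-style norm module, not code shipped with this repo):

```python
import torch
import torch.nn as nn

class RMSNorm(nn.Module):
    """Minimal RMSNorm: scale inputs by their root-mean-square, then a learned gain."""
    def __init__(self, dim, eps=1e-6):
        super().__init__()
        self.weight = nn.Parameter(torch.ones(dim))
        self.eps = eps

    def forward(self, x):
        rms = x.pow(2).mean(-1, keepdim=True).add(self.eps).rsqrt()
        return self.weight * (x * rms)

def freeze_all_but_rmsnorm(model):
    """Freeze all parameters, then re-enable gradients only for RMSNorm gains."""
    for p in model.parameters():
        p.requires_grad = False
    for m in model.modules():
        if isinstance(m, RMSNorm):
            m.weight.requires_grad = True
```

Only a tiny fraction of parameters stays trainable, which is why this recovery step is cheap compared with full fine-tuning.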

Performance

Metric                     Value
PPL (Before Fine-tuning)   15.23
PPL (After Fine-tuning)    11.12
Improvement                27.00%
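The improvement figure is the relative perplexity drop from fine-tuning:

```python
ppl_before, ppl_after = 15.23, 11.12

# Relative PPL reduction achieved by RMSNorm fine-tuning
improvement = (ppl_before - ppl_after) / ppl_before * 100
print(f"{improvement:.1f}%")  # prints "27.0%"
```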

Performance Comparison

This 70% model is part of a family of pruned models with different compression ratios:

Model        PPL (After FT)
50% params   25.81
70% params   11.12
80% params   8.71
90% params   7.64

Usage

import torch
from transformers import AutoTokenizer

# Load tokenizer (standard Qwen2.5-7B tokenizer)
tokenizer = AutoTokenizer.from_pretrained("Qwen/Qwen2.5-7B")

# Note: This model uses a custom pruned architecture
# Custom loading code is required to use this model
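Since the pruned architecture class is not shipped with the checkpoint, a practical first step is to inspect the saved state dict and match a custom module definition to its tensor shapes. A hedged sketch (`load_pruned_state_dict` is a hypothetical helper, not part of this repo):

```python
import torch

def load_pruned_state_dict(path="model_weights.pt"):
    # Load the checkpoint on CPU; keys follow the pruned layer layout.
    state_dict = torch.load(path, map_location="cpu")
    # Listing parameter names and shapes is the first step toward writing
    # a module definition whose layers match the pruned dimensions.
    for name, tensor in state_dict.items():
        print(name, tuple(tensor.shape))
    return state_dict
```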

Files Included

  • model_weights.pt: Full model state dict
  • README.md: This documentation

License

Apache 2.0 (inherited from base model)

Related Models

  • Qwen/Qwen2.5-7B - Base model
  • Other compression ratios: 50%, 80%, and 90% versions are available in this account