Iterative-Qwen-7layers-SGR-Dolci

This model is a depth-pruned version of Qwen2.5-7B-Instruct, obtained via iterative layer pruning and post-trained with QLoRA using Self-Generated Responses (SGR) on the Dolci dataset.

It was released as part of our study on the limits of layer pruning for generative reasoning.

Summary

  • Base model: Qwen2.5-7B-Instruct
  • Pruning: Iterative
  • Depth: 7 layers
  • Post-training: QLoRA
  • Supervision: Self-Generated Responses (SGR)
  • Data: Dolci (SGR variant)

This checkpoint is intended for research and analysis of pruning and recovery, not as a production model.
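To give a rough sense of what "iterative" depth pruning means, the toy loop below greedily removes one layer at a time and re-scores the remaining stack after every removal. Both the `proxy_quality` function and the greedy criterion are illustrative stand-ins, not the procedure used in the paper; see the paper for the actual method.

```python
def proxy_quality(kept):
    # Hypothetical stand-in for a real calibration-set evaluation
    # (e.g. perplexity): penalize large gaps between consecutive
    # retained layers.
    return -sum((b - a - 1) ** 2 for a, b in zip(kept, kept[1:]))

def iterative_prune(num_layers, target_depth, always_keep=(0,)):
    """Greedily drop one layer per iteration until target_depth layers
    remain, re-evaluating the proxy score after every removal."""
    kept = list(range(num_layers))
    while len(kept) > target_depth:
        candidates = [c for c in kept if c not in always_keep]
        # Keep whichever pruned stack scores best after one removal.
        kept = max(
            ([l for l in kept if l != c] for c in candidates),
            key=proxy_quality,
        )
    return kept

# Prune a 28-layer stack (Qwen2.5-7B-Instruct has 28 decoder layers)
# down to 7 layers.
print(iterative_prune(28, 7))
```

The key difference from one-shot pruning is inside the `while` loop: the score is recomputed on the current stack at every step, so each removal decision accounts for the layers already dropped.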

Paper

On the Limits of Layer Pruning for Generative Reasoning in LLMs
https://arxiv.org/abs/2602.01997

@misc{shrestha2026limitslayerpruninggenerative,
  title={On the Limits of Layer Pruning for Generative Reasoning in LLMs},
  author={Safal Shrestha and Anubhav Shrestha and Aadim Nepal and Minwu Kim and Keith Ross},
  year={2026},
  eprint={2602.01997},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2602.01997},
}
Model details

  • Model size: 6B parameters
  • Tensor type: BF16
