On the Limits of Layer Pruning for Generative Reasoning in LLMs
This model is a depth-pruned version of Qwen2.5-7B-Instruct, obtained via iterative layer pruning and post-trained with QLoRA using Self-Generated Responses (SGR) on the Dolci dataset.
It was released as part of our study on the limits of layer pruning for generative reasoning.
This checkpoint is intended for research and analysis of pruning and recovery, not as a production model.
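To make "iterative layer pruning" concrete, here is a minimal toy sketch of a greedy pruning loop: at each step, the layer whose removal least degrades a score is dropped. The score function below is a made-up stand-in (it simply assumes middle layers matter least), and the function names are hypothetical; the paper's actual pruning criterion and schedule may differ.

```python
def iterative_layer_prune(num_layers, num_to_remove, score_fn):
    """Greedily drop one layer at a time, keeping the set that scores best."""
    kept = list(range(num_layers))
    for _ in range(num_to_remove):
        # Evaluate every single-layer removal and keep the best candidate set.
        best_set, _ = max(
            (([l for l in kept if l != cand], cand) for cand in kept),
            key=lambda t: score_fn(t[0]),
        )
        kept = best_set
    return kept

# Stand-in importance score: layers far from the middle are "more important",
# so the greedy loop removes middle layers first. Purely illustrative.
def toy_score(layers):
    return sum(abs(l - 13.5) for l in layers)  # 28 layers, as in Qwen2.5-7B

pruned = iterative_layer_prune(28, 8, toy_score)
print(len(pruned))  # → 20
```

In practice the score would be a held-out loss or benchmark metric measured on the actual model, which makes each pruning step far more expensive than this sketch suggests.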
On the Limits of Layer Pruning for Generative Reasoning in LLMs
https://arxiv.org/abs/2602.01997
@misc{shrestha2026limitslayerpruninggenerative,
  title={On the Limits of Layer Pruning for Generative Reasoning in LLMs},
  author={Safal Shrestha and Anubhav Shrestha and Aadim Nepal and Minwu Kim and Keith Ross},
  year={2026},
  eprint={2602.01997},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2602.01997},
}