On the Limits of Layer Pruning for Generative Reasoning in LLMs
This model is a depth-pruned version of Qwen2.5-7B-Instruct, obtained via iterative layer pruning and post-trained with QLoRA using Self-Generated Responses (SGR) on the Dolci dataset.
It was released as part of our study on the limits of layer pruning for generative reasoning.
This checkpoint is intended for research and analysis of pruning and recovery, not as a production model.
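To make "iterative layer pruning" concrete, here is a minimal toy sketch of a greedy pruning loop: at each step, the layer whose removal least degrades a score is dropped. The score function below is a made-up stand-in (it simply assumes middle layers matter least), and the function names are hypothetical; the paper's actual pruning criterion and schedule may differ.

```python
def iterative_layer_prune(num_layers, num_to_remove, score_fn):
    """Greedily drop one layer at a time, keeping the set that scores best."""
    kept = list(range(num_layers))
    for _ in range(num_to_remove):
        # Evaluate every single-layer removal and keep the best candidate set.
        best_set, _ = max(
            (([l for l in kept if l != cand], cand) for cand in kept),
            key=lambda t: score_fn(t[0]),
        )
        kept = best_set
    return kept

# Stand-in importance score: layers far from the middle are "more important",
# so the greedy loop removes middle layers first. Purely illustrative.
def toy_score(layers):
    return sum(abs(l - 13.5) for l in layers)  # 28 layers, as in Qwen2.5-7B

pruned = iterative_layer_prune(28, 8, toy_score)
print(len(pruned))  # → 20
```

In practice the score would be a held-out loss or benchmark metric measured on the actual model, which makes each pruning step far more expensive than this sketch suggests.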
On the Limits of Layer Pruning for Generative Reasoning in LLMs
https://arxiv.org/abs/2602.01997
@misc{shrestha2026limitslayerpruninggenerative,
  title={On the Limits of Layer Pruning for Generative Reasoning in LLMs},
  author={Safal Shrestha and Anubhav Shrestha and Aadim Nepal and Minwu Kim and Keith Ross},
  year={2026},
  eprint={2602.01997},
  archivePrefix={arXiv},
  primaryClass={cs.LG},
  url={https://arxiv.org/abs/2602.01997},
}