Model Card for DeepSeek-R1-Distill-Llama-8B-GPTQ_W8A8_G128

Model Details

This model is a quantized version of DeepSeek-R1-Distill-Llama-8B.
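The `GPTQ_W8A8_G128` suffix in the model name is not explained in this card. By common naming convention it most likely means GPTQ quantization with 8-bit weights, 8-bit activations, and a group size of 128, but that reading is an assumption. The sketch below illustrates only the "8-bit weights, one scale per group of 128" idea, using plain symmetric round-to-nearest quantization. It is not GPTQ itself (GPTQ additionally adjusts the remaining weights to compensate for rounding error), it does not cover activation quantization, and it is not the procedure used to produce this checkpoint. All function names are hypothetical.

```python
# Illustrative sketch only: symmetric round-to-nearest int8 quantization with
# one scale per group of 128 weights. This is NOT GPTQ and NOT the procedure
# used to build this checkpoint; every name here is hypothetical.
import random

GROUP_SIZE = 128  # assumed meaning of "G128" in the model name
QMAX = 127        # largest magnitude of a signed 8-bit integer ("W8")


def quantize_row(row, group_size=GROUP_SIZE):
    """Return (int8 codes, per-group scales) for one row of float weights."""
    codes, scales = [], []
    for start in range(0, len(row), group_size):
        group = row[start:start + group_size]
        max_abs = max(abs(w) for w in group)
        # An all-zero group gets a dummy scale to avoid division by zero.
        scale = max_abs / QMAX if max_abs > 0 else 1.0
        scales.append(scale)
        # Round to the nearest integer and clamp into the int8 range.
        codes.extend(max(-QMAX, min(QMAX, round(w / scale))) for w in group)
    return codes, scales


def dequantize_row(codes, scales, group_size=GROUP_SIZE):
    """Reconstruct approximate float weights from codes and group scales."""
    return [c * scales[i // group_size] for i, c in enumerate(codes)]


if __name__ == "__main__":
    random.seed(0)
    row = [random.uniform(-1.0, 1.0) for _ in range(256)]
    codes, scales = quantize_row(row)
    restored = dequantize_row(codes, scales)
    worst = max(abs(a - b) for a, b in zip(row, restored))
    print(f"groups={len(scales)} worst_abs_error={worst:.5f}")
```

With one scale per group, the rounding error of each weight is bounded by half of its group's scale, so smaller groups track local weight magnitudes more closely at the cost of storing more scales.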

Evaluation Results

Model	GSM8K 5-shot
DeepSeek-R1-Distill-Llama-8B	-
DeepSeek-R1-Distill-Llama-8B-GPTQ_W8A8_G128	-

License

This code repository and the model weights are licensed under the MIT License. The DeepSeek-R1 series supports commercial use and allows any modifications and derivative works, including, but not limited to, distillation for training other LLMs. Please note that:

DeepSeek-R1-Distill-Qwen-1.5B, DeepSeek-R1-Distill-Qwen-7B, DeepSeek-R1-Distill-Qwen-14B and DeepSeek-R1-Distill-Qwen-32B are derived from the Qwen-2.5 series, which was originally licensed under the Apache 2.0 License, and are now fine-tuned with 800k samples curated with DeepSeek-R1.
DeepSeek-R1-Distill-Llama-8B is derived from Llama3.1-8B-Base, which is originally licensed under the Llama 3.1 license.
DeepSeek-R1-Distill-Llama-70B is derived from Llama3.3-70B-Instruct, which is originally licensed under the Llama 3.3 license.

Citation

@misc{deepseekai2025deepseekr1incentivizingreasoningcapability,
      title={DeepSeek-R1: Incentivizing Reasoning Capability in LLMs via Reinforcement Learning}, 
      author={DeepSeek-AI},
      year={2025},
      eprint={2501.12948},
      archivePrefix={arXiv},
      primaryClass={cs.CL},
      url={https://arxiv.org/abs/2501.12948}, 
}

Contact

If you have any questions, please raise an issue or contact us on GitHub (Lornatang) or at liuchangyu1111@gmail.com.
