Original model: https://huggingface.co/SakuraLLM/Sakura-13B-Qwen2beta-v0.9

Converted directly with llama.cpp; untested.
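Since the conversion is untested, a quick sanity check on a downloaded file may be worthwhile before loading it. The following is a minimal sketch, not part of the original conversion workflow: it only checks that a file starts with the 4-byte `GGUF` magic and reads the little-endian version field that follows it. The function name `read_gguf_version` and the file path in the usage note are hypothetical, and a passing check says nothing about whether the weights themselves are correct.

```python
import struct

GGUF_MAGIC = b"GGUF"  # every GGUF file starts with these four bytes


def read_gguf_version(path):
    """Return the GGUF format version stored in the file header.

    Hypothetical helper for illustration: it validates only the magic
    bytes and the version field, not the tensors or metadata.
    """
    with open(path, "rb") as f:
        header = f.read(8)  # 4-byte magic + 4-byte version
    if len(header) < 8 or header[:4] != GGUF_MAGIC:
        raise ValueError(f"{path} does not look like a GGUF file")
    # The version is an unsigned 32-bit little-endian integer.
    (version,) = struct.unpack("<I", header[4:8])
    return version
```

For example, `read_gguf_version("model.gguf")` (with `model.gguf` standing in for whichever file was downloaded) returns the header version or raises `ValueError` if the magic bytes are missing.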

Format: GGUF

Model size: 14B params

Architecture: qwen2

Quantizations: 3-bit, 6-bit
