# Qwen 2.5 Coder Instruct - Python Unit Test Fine-tune
This model is a fine-tuned version of Qwen 2.5 Coder Instruct, specifically trained to automate the generation of Python unit tests.
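To make the task concrete, here is the kind of input/output pair the model targets: a small Python function together with a `unittest`-style suite covering it. This is an illustrative, hand-written example, not actual model output:

```python
import unittest


# Hypothetical input function the model might be asked to cover.
def clamp(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


# The style of test suite the fine-tune is trained to produce.
class TestClamp(unittest.TestCase):
    def test_within_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_below_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_above_range(self):
        self.assertEqual(clamp(99, 0, 10), 10)

    def test_invalid_bounds(self):
        with self.assertRaises(ValueError):
            clamp(1, 10, 0)
```

Run such a suite with `python -m unittest` to check the generated tests actually pass against the code under test.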
Note: If your version of Qwen 2.5 Coder Instruct has a different parameter size (e.g., 1.5B or 32B), make sure to update `Qwen/Qwen2.5-Coder-7B-Instruct` in the YAML header above to the exact Hugging Face path of the base model you used.
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen 2.5 Coder Instruct |
| Dataset | KAKA22/CodeRM-UnitTest |
| Language | Python |
| Format | 16-bit (Safetensors/PyTorch) |
## Running Locally as a 4-bit Quantized GGUF
Since the default weights are in 16-bit format, you can significantly reduce the memory footprint by converting and quantizing the model to a 4-bit GGUF format using llama.cpp. This makes it much easier to run locally on consumer hardware.
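For a rough sense of the savings, here is a back-of-envelope estimate. The 7B parameter count and the ~4.85 bits per weight for Q4_K_M are both approximations, not exact figures:

```python
params = 7_000_000_000  # assumed parameter count for a 7B base model

fp16_bytes = params * 2          # 16 bits = 2 bytes per weight
q4km_bytes = params * 4.85 / 8   # Q4_K_M averages roughly 4.85 bits per weight

print(f"FP16  : ~{fp16_bytes / 1e9:.1f} GB")
print(f"Q4_K_M: ~{q4km_bytes / 1e9:.1f} GB")
```

In practice the GGUF file also carries metadata and non-quantized tensors, so real sizes will differ slightly, but the roughly 3x reduction holds.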
### 1. Clone and Compile llama.cpp
Clone the repository and build the tools. You will need a C++ compiler and `make` installed on your system.
```shell
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
```
Note: If you are using a GPU, you may want to compile with specific flags (e.g., `make GGML_CUDA=1` for NVIDIA GPUs).
Next, install the Python dependencies required for the conversion script:
```shell
pip install -r requirements.txt
```
### 2. Download the 16-bit Model
Download the files from this Hugging Face repository to a local folder using `huggingface-cli`. Replace `<YOUR_USERNAME>/<YOUR_MODEL_NAME>` with your actual Hugging Face repository ID:
```shell
huggingface-cli download <YOUR_USERNAME>/<YOUR_MODEL_NAME> --local-dir ../my-16bit-model
```
### 3. Convert to GGUF (FP16)
Before quantizing to 4-bit, convert the Hugging Face model into an unquantized (FP16) GGUF file. Run this from inside the `llama.cpp` directory:
```shell
python convert_hf_to_gguf.py ../my-16bit-model --outfile ../my-16bit-model/model-fp16.gguf
```
### 4. Quantize to 4-bit (Q4_K_M)
Use the compiled `llama-quantize` executable to compress the model to a 4-bit format. The `Q4_K_M` method offers a good balance between size and quality.
```shell
./llama-quantize ../my-16bit-model/model-fp16.gguf ../my-16bit-model/model-q4_k_m.gguf Q4_K_M
```
You can now use `model-q4_k_m.gguf` with any standard GGUF runner such as Ollama, LM Studio, or the llama.cpp server.
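For example, to load the quantized file in Ollama, point a Modelfile at it. The model name, temperature, and system prompt below are illustrative choices, not part of this repository:

```
FROM ./model-q4_k_m.gguf
PARAMETER temperature 0.2
SYSTEM You are an assistant that writes Python unit tests for the code the user provides.
```

Then build and run it with `ollama create unittest-coder -f Modelfile` followed by `ollama run unittest-coder`.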