Qwen 2.5 Coder Instruct - Python Unit Test Fine-tune

This model is a fine-tuned version of Qwen 2.5 Coder Instruct, specifically trained to automate the generation of Python unit tests.
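Since the base model is a Qwen instruct model, prompts presumably follow Qwen's standard ChatML chat format. The sketch below builds such a prompt for unit-test generation by hand; the system/user wording is a hypothetical example, not the training prompt, and in practice you should prefer the tokenizer's apply_chat_template to hard-coding the template.

```python
# Minimal sketch of a unit-test-generation prompt, assuming Qwen's standard
# ChatML format (<|im_start|>role ... <|im_end|>). The exact instruction text
# is illustrative; check the tokenizer's chat template for the real format.
def build_prompt(function_source: str) -> str:
    system = "You are a helpful assistant that writes Python unit tests."
    user = (
        "Write pytest unit tests for the following function:\n\n"
        f"```python\n{function_source}\n```"
    )
    return (
        f"<|im_start|>system\n{system}<|im_end|>\n"
        f"<|im_start|>user\n{user}<|im_end|>\n"
        f"<|im_start|>assistant\n"
    )

prompt = build_prompt("def add(a, b):\n    return a + b")
print(prompt)
```

The trailing open assistant turn tells the model to start generating the test code immediately.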

Note: If your specific version of Qwen 2.5 Coder Instruct is a different parameter size (e.g., 1.5B or 32B), make sure to update Qwen/Qwen2.5-Coder-7B-Instruct in the YAML header above with the exact Hugging Face path of the base model you used.

Model Details

| Property | Value |
|---|---|
| Base Model | Qwen 2.5 Coder Instruct |
| Dataset | KAKA22/CodeRM-UnitTest |
| Language | Python |
| Parameters | 8B |
| Format | 16-bit BF16 (Safetensors/PyTorch) |

Running Locally as a 4-bit Quantized GGUF

Since the default weights are in 16-bit format, you can significantly reduce the memory footprint by converting and quantizing the model to a 4-bit GGUF format using llama.cpp. This makes it much easier to run locally on consumer hardware.
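Back-of-the-envelope arithmetic shows why this matters. The figures below are approximations: real GGUF files add metadata, and Q4_K_M keeps some tensors at higher precision, so actual file sizes will differ somewhat.

```python
# Rough memory-footprint estimate for an 8B-parameter model (illustrative only).
params = 8e9

# 16-bit weights: 2 bytes per parameter.
fp16_gb = params * 2 / 1024**3

# Q4_K_M averages roughly 4.5 bits per parameter across the model.
q4_gb = params * 4.5 / 8 / 1024**3

print(f"FP16: ~{fp16_gb:.1f} GiB, Q4_K_M: ~{q4_gb:.1f} GiB")
```

In other words, quantization cuts the weights from roughly 15 GiB to roughly 4 GiB, which is the difference between needing a workstation GPU and fitting on a typical consumer card or laptop.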

1. Clone and Compile llama.cpp

Clone the repository and build the tools. You will need a C++ compiler and make installed on your system.

git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make

Note: If you are using a GPU, you may want to compile with backend-specific flags (e.g., make GGML_CUDA=1 for NVIDIA GPUs). Recent llama.cpp releases have replaced the Makefile with CMake; if make fails, build with cmake -B build -DGGML_CUDA=ON && cmake --build build --config Release instead (the binaries then land in build/bin).

Next, install the Python dependencies required for the conversion script:

pip install -r requirements.txt

2. Download the 16-bit Model

Download the files from this Hugging Face repository to a local folder using huggingface-cli. Replace <YOUR_USERNAME>/<YOUR_MODEL_NAME> with your actual Hugging Face repository ID:

huggingface-cli download <YOUR_USERNAME>/<YOUR_MODEL_NAME> --local-dir ../my-16bit-model

3. Convert to GGUF (FP16)

Before quantizing to 4-bit, convert the Hugging Face model format into an unquantized (FP16) GGUF format. Run this from inside the llama.cpp directory:

python convert_hf_to_gguf.py ../my-16bit-model --outfile ../my-16bit-model/model-fp16.gguf

4. Quantize to 4-bit (Q4_K_M)

Use the compiled llama-quantize executable to compress the model to 4-bit. The Q4_K_M method offers a good trade-off between file size and output quality.

./llama-quantize ../my-16bit-model/model-fp16.gguf ../my-16bit-model/model-q4_k_m.gguf Q4_K_M

You can now use model-q4_k_m.gguf with any standard GGUF runner like Ollama, LM Studio, or the llama.cpp server!
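For example, to load the quantized file into Ollama, you can point a Modelfile at it. The model name unit-llm below is an arbitrary choice, not something the repository defines:

```
# Hypothetical Ollama Modelfile (save as "Modelfile" next to the GGUF).
FROM ./model-q4_k_m.gguf
```

Then register and run it with ollama create unit-llm -f Modelfile followed by ollama run unit-llm.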
