# Qwen 2.5 Coder Instruct - Python Unit Test Fine-tune
This model is a fine-tuned version of Qwen 2.5 Coder Instruct, specifically trained to automate the generation of Python unit tests.
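To make the task concrete, here is the kind of input/output pair the model targets: a small Python function together with a `unittest`-style suite covering it. This is an illustrative, hand-written example, not actual model output:

```python
import unittest


# Hypothetical input function the model might be asked to cover.
def clamp(value, low, high):
    """Clamp value into the inclusive range [low, high]."""
    if low > high:
        raise ValueError("low must not exceed high")
    return max(low, min(value, high))


# The style of test suite the fine-tune is trained to produce.
class TestClamp(unittest.TestCase):
    def test_within_range(self):
        self.assertEqual(clamp(5, 0, 10), 5)

    def test_below_range(self):
        self.assertEqual(clamp(-3, 0, 10), 0)

    def test_above_range(self):
        self.assertEqual(clamp(99, 0, 10), 10)

    def test_invalid_bounds(self):
        with self.assertRaises(ValueError):
            clamp(1, 10, 0)
```

Run such a suite with `python -m unittest` to check the generated tests actually pass against the code under test.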
Note: If your version of Qwen 2.5 Coder Instruct has a different parameter size (e.g., 1.5B or 32B), make sure to update `Qwen/Qwen2.5-Coder-7B-Instruct` in the YAML header above to the exact Hugging Face path of the base model you used.
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen 2.5 Coder Instruct |
| Dataset | KAKA22/CodeRM-UnitTest |
| Language | Python |
| Format | 16-bit (Safetensors/PyTorch) |
## Running Locally as a 4-bit Quantized GGUF
Since the default weights are in 16-bit format, you can significantly reduce the memory footprint by converting and quantizing the model to a 4-bit GGUF format using llama.cpp. This makes it much easier to run locally on consumer hardware.
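For a rough sense of the savings, here is a back-of-envelope estimate. The 7B parameter count and the ~4.85 bits per weight for Q4_K_M are both approximations, not exact figures:

```python
params = 7_000_000_000  # assumed parameter count for a 7B base model

fp16_bytes = params * 2          # 16 bits = 2 bytes per weight
q4km_bytes = params * 4.85 / 8   # Q4_K_M averages roughly 4.85 bits per weight

print(f"FP16  : ~{fp16_bytes / 1e9:.1f} GB")
print(f"Q4_K_M: ~{q4km_bytes / 1e9:.1f} GB")
```

In practice the GGUF file also carries metadata and non-quantized tensors, so real sizes will differ slightly, but the roughly 3x reduction holds.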
### 1. Clone and Compile llama.cpp
Clone the repository and build the tools. You will need a C++ compiler and `make` installed on your system.
```shell
git clone https://github.com/ggerganov/llama.cpp
cd llama.cpp
make
```
Note: If you are using a GPU, you may want to compile with specific flags (e.g., `make GGML_CUDA=1` for NVIDIA GPUs).
Next, install the Python dependencies required for the conversion script:
```shell
pip install -r requirements.txt
```
### 2. Download the 16-bit Model
Download the files from this Hugging Face repository to a local folder using `huggingface-cli`. Replace `<YOUR_USERNAME>/<YOUR_MODEL_NAME>` with your actual Hugging Face repository ID:
```shell
huggingface-cli download <YOUR_USERNAME>/<YOUR_MODEL_NAME> --local-dir ../my-16bit-model
```
### 3. Convert to GGUF (FP16)
Before quantizing to 4-bit, convert the Hugging Face model into an unquantized (FP16) GGUF file. Run this from inside the `llama.cpp` directory:
```shell
python convert_hf_to_gguf.py ../my-16bit-model --outfile ../my-16bit-model/model-fp16.gguf
```
### 4. Quantize to 4-bit (Q4_K_M)
Use the compiled `llama-quantize` executable to compress the model to a 4-bit format. The `Q4_K_M` method offers a good balance between size and quality.
```shell
./llama-quantize ../my-16bit-model/model-fp16.gguf ../my-16bit-model/model-q4_k_m.gguf Q4_K_M
```
You can now use `model-q4_k_m.gguf` with any standard GGUF runner such as Ollama, LM Studio, or the llama.cpp server.
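For example, to load the quantized file in Ollama, point a Modelfile at it. The model name, temperature, and system prompt below are illustrative choices, not part of this repository:

```
FROM ./model-q4_k_m.gguf
PARAMETER temperature 0.2
SYSTEM You are an assistant that writes Python unit tests for the code the user provides.
```

Then build and run it with `ollama create unittest-coder -f Modelfile` followed by `ollama run unittest-coder`.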