This repository contains the Llama-3.2-1B-Instruct model in GGUF format, optimized for text inference on the Renesas X5H platform.

The following performance metrics were measured with a sample prompt:
| Model | Precision | Device | Response Rate (tokens/sec) |
|---|---|---|---|
| Llama-3.2-1B-Instruct | F16 | X5H - Single Cluster NPX | 16.7 |
To run the model:

1. Copy the `llama-runner` binary and the model file into a single folder on the board:

```
<PATH_ON_BOARD>
├── llama-runner
└── Llama-3.2-1B-Instruct-f16.gguf
```

2. Launch the runner with your prompt:

```shell
./llama-runner "prompt"
```
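The expected folder layout can be sketched with a small shell snippet. Note that `x5h_deploy` is a hypothetical stand-in for `<PATH_ON_BOARD>`, and the files created here are empty placeholders for the real binary and model:

```shell
# Sketch of the required on-board layout, using placeholder files.
# "x5h_deploy" is an assumed stand-in for <PATH_ON_BOARD>;
# the file names come from this model card.
STAGE=x5h_deploy
mkdir -p "$STAGE"
: > "$STAGE/llama-runner"                    # placeholder for the prebuilt runner binary
: > "$STAGE/Llama-3.2-1B-Instruct-f16.gguf"  # placeholder for the GGUF model file
chmod +x "$STAGE/llama-runner"               # runner must be executable
ls -l "$STAGE"
```

On the board itself, the real binary and `.gguf` file would be copied into that folder (for example with `scp`), and the runner started from inside it.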
Base model: meta-llama/Llama-3.2-1B-Instruct (16-bit)