This repo contains quantized GGUF versions of "BSC-LT/salamandra-2b", ready to use with Ollama and from R.

Installation

To use them, first download this repo and change into its local snapshot directory:

$ hf download "jrosell/salamandra-2b-GGUF"
$ cd "$HOME/.cache/huggingface/hub/models--jrosell--salamandra-2b-GGUF/snapshots/$(cat ~/.cache/huggingface/hub/models--jrosell--salamandra-2b-GGUF/refs/main)"
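The long path above follows the Hugging Face hub cache layout: each repo is cached under models--&lt;org&gt;--&lt;name&gt;, and refs/main names the current snapshot. A small helper (hypothetical, not part of the hf CLI) can derive it from the repo id:

```shell
# Hypothetical helper: rebuild the local HF cache path from a repo id,
# using the hub layout "models--<org>--<name>/snapshots/<revision>".
hf_snapshot_dir() {
  local repo_dir="models--${1//\//--}"            # "a/b" -> "models--a--b"
  local base="$HOME/.cache/huggingface/hub/$repo_dir"
  echo "$base/snapshots/$(cat "$base/refs/main")"
}
```

For example: cd "$(hf_snapshot_dir jrosell/salamandra-2b-GGUF)".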

Usage

Use the Q8_0 version in Ollama:

$ echo "FROM ./models/salamandra-2b-Q8_0.gguf" > Modelfile.Q8_0
$ ollama create salamandra-2b:Q8_0 -f Modelfile.Q8_0
$ ollama run salamandra-2b:Q8_0

Use the Q4_K_M version in Ollama:

$ echo "FROM ./models/salamandra-2b-Q4_K_M.gguf" > Modelfile.Q4_K_M
$ ollama create salamandra-2b:Q4_K_M -f Modelfile.Q4_K_M
$ ollama run salamandra-2b:Q4_K_M
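The two Modelfile steps above differ only in the quantization suffix; as a sketch, the Modelfiles can be generated in one loop (assuming the corresponding .gguf files exist under ./models/):

```shell
# Write one Ollama Modelfile per quantization shipped in this repo.
for quant in Q8_0 Q4_K_M; do
  printf 'FROM ./models/salamandra-2b-%s.gguf\n' "$quant" > "Modelfile.$quant"
  # ollama create "salamandra-2b:$quant" -f "Modelfile.$quant"  # uncomment to register
done
```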

Use it from R via the ellmer package:

$ R --vanilla -e 'ellmer::chat_ollama(model = "salamandra-2b:Q4_K_M")$chat("Explica un acudit sobre informàtics.")'
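Outside of R, a running Ollama server also answers over its HTTP API (this assumes ollama serve is listening on the default port 11434):

```shell
curl http://localhost:11434/api/generate -d '{
  "model": "salamandra-2b:Q4_K_M",
  "prompt": "Explica un acudit sobre informàtics.",
  "stream": false
}'
```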

How I built this

$ mkdir -p salamandra-2b/models && cd salamandra-2b
$ hf download "BSC-LT/salamandra-2b"
$ git clone https://github.com/ggerganov/llama.cpp.git
$ uv init
$ uv add -r llama.cpp/requirements.txt
$ export MODEL_DIR="$HOME/.cache/huggingface/hub/models--BSC-LT--salamandra-2b/snapshots/$(cat ~/.cache/huggingface/hub/models--BSC-LT--salamandra-2b/refs/main)"
$ uv run llama.cpp/convert_hf_to_gguf.py "$MODEL_DIR" \
    --outfile models/salamandra-2b.gguf \
    --outtype auto
$ cd llama.cpp
$ cmake -B build
$ cmake --build build --config Release --target llama-quantize
$ ./build/bin/llama-quantize ../models/salamandra-2b.gguf ../models/salamandra-2b-Q4_K_M.gguf Q4_K_M
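A quick sanity check before shipping the outputs: every GGUF file starts with the 4-byte ASCII magic "GGUF", so a small helper (hypothetical name) catches truncated or misnamed files:

```shell
# Check the 4-byte GGUF magic at the start of a file.
is_gguf() {
  [ "$(head -c 4 "$1")" = "GGUF" ]
}
```

For example: is_gguf models/salamandra-2b-Q4_K_M.gguf && echo "looks like a GGUF file".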
Model details

Format: GGUF
Model size: 2B params
Architecture: llama