This repo contains quantized GGUF versions of [BSC-LT/salamandra-2b](https://huggingface.co/BSC-LT/salamandra-2b), ready to use with Ollama and R.
## Installation
To use it, first download this repo and change into its snapshot directory:

```shell
$ hf download "jrosell/salamandra-2b-GGUF"
$ cd "$HOME/.cache/huggingface/hub/models--jrosell--salamandra-2b-GGUF/snapshots/$(cat ~/.cache/huggingface/hub/models--jrosell--salamandra-2b-GGUF/refs/main)"
```
## Usage
Use the GGUF Q8_0 version in Ollama:

```shell
$ echo "FROM ./models/salamandra-2b-Q8_0.gguf" > Modelfile.Q8_0
$ ollama create salamandra-2b:Q8_0 -f Modelfile.Q8_0
$ ollama run salamandra-2b:Q8_0
```
Use the quantized Q4_K_M version in Ollama:

```shell
$ echo "FROM ./models/salamandra-2b-Q4_K_M.gguf" > Modelfile.Q4_K_M
$ ollama create salamandra-2b:Q4_K_M -f Modelfile.Q4_K_M
$ ollama run salamandra-2b:Q4_K_M
```
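A Modelfile can carry more than the bare `FROM` line: it can also pin sampling parameters and a system prompt. A sketch of a richer Modelfile (the parameter values and system prompt are illustrative assumptions, not tuned for Salamandra):

```
FROM ./models/salamandra-2b-Q4_K_M.gguf

# Sampling defaults baked into the created model (illustrative values)
PARAMETER temperature 0.7
PARAMETER num_ctx 4096

# Default system prompt for every conversation
SYSTEM "You are a helpful multilingual assistant."
```

Create it with `ollama create` exactly as above; the parameters then apply to every `ollama run` of that model without needing to be repeated per session.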
Use it in R with the ellmer package (the prompt is Catalan for "Explain a joke about computer scientists."):

```shell
$ R --vanilla -e 'ellmer::chat_ollama(model = "salamandra-2b:Q4_K_M")$chat("Explica un acudit sobre informàtics.")'
```
## How I built this
```shell
# Download the original model and llama.cpp
$ mkdir -p salamandra-2b/models && cd salamandra-2b
$ hf download "BSC-LT/salamandra-2b"
$ git clone https://github.com/ggerganov/llama.cpp.git

# Set up a Python environment with llama.cpp's conversion dependencies
$ uv init
$ uv add -r llama.cpp/requirements.txt

# Resolve the downloaded snapshot directory and convert the model to GGUF
$ export MODEL_DIR="$HOME/.cache/huggingface/hub/models--BSC-LT--salamandra-2b/snapshots/$(cat ~/.cache/huggingface/hub/models--BSC-LT--salamandra-2b/refs/main)"
$ uv run llama.cpp/convert_hf_to_gguf.py $MODEL_DIR \
    --outfile models/salamandra-2b.gguf \
    --outtype auto

# Build the quantization tool and quantize to Q4_K_M
$ cd llama.cpp
$ cmake -B build
$ cmake --build build --config Release --target llama-quantize
$ ./build/bin/llama-quantize ../models/salamandra-2b.gguf ../models/salamandra-2b-Q4_K_M.gguf Q4_K_M
```
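The `MODEL_DIR` export above relies on the Hugging Face cache layout: the current commit hash is stored in `refs/main`, and that hash names the directory under `snapshots/`. A minimal, runnable mock of that resolution mechanism (using a temp dir instead of the real `~/.cache/huggingface/hub`):

```shell
# Mock the HF cache layout: refs/main holds a commit hash that
# names the snapshot directory containing the model files.
CACHE=$(mktemp -d)
REPO="$CACHE/models--BSC-LT--salamandra-2b"
mkdir -p "$REPO/refs" "$REPO/snapshots/abc123"
echo "abc123" > "$REPO/refs/main"

# Same pattern as the export in the build steps above
MODEL_DIR="$REPO/snapshots/$(cat "$REPO/refs/main")"
echo "$MODEL_DIR"   # ends with snapshots/abc123
```

This is why the one-liner keeps working after the upstream repo is updated: re-downloading rewrites `refs/main`, and the path resolves to the new snapshot.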
Base model: [BSC-LT/salamandra-2b](https://huggingface.co/BSC-LT/salamandra-2b)