Gemma-2B Recipes (GGUF Q4_K_M)
A quantized GGUF build of google/gemma-2b, fine-tuned on recipe data to generate recipes.
Model Details
| Property | Value |
|---|---|
| Base model | google/gemma-2b |
| LoRA adapter | ClaireLee2429/gemma-2b-recipes-lora |
| Training data | corbt/all-recipes |
| Quantization | Q4_K_M (4-bit k-quant, medium variant) |
| File size | ~1.5 GB |
| Context length | 8192 tokens |
| Format | GGUF (llama.cpp compatible) |
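As a rough sanity check on the file size in the table above, the size of a quantized file can be estimated as parameter count times average bits per weight. The sketch below assumes about 2.5 billion parameters for Gemma-2B and about 4.85 bits per weight for Q4_K_M. Neither number comes from this card, and the real average depends on which tensors receive which quantization type.

```python
# Back-of-the-envelope size estimate for a quantized GGUF file.
# Both constants are assumptions, not values taken from this model card.
ASSUMED_PARAMS = 2.5e9          # approximate Gemma-2B parameter count
ASSUMED_BITS_PER_WEIGHT = 4.85  # rough average for Q4_K_M; varies per model


def estimate_gguf_size_gb(n_params: float, bits_per_weight: float) -> float:
    """Return the estimated file size in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


size_gb = estimate_gguf_size_gb(ASSUMED_PARAMS, ASSUMED_BITS_PER_WEIGHT)
print(f"Estimated size: ~{size_gb:.1f} GB")
```

Under these assumptions the estimate lands close to the listed ~1.5 GB. Metadata and tokenizer data add a small amount on top.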
How It Was Made
- Fine-tuned Gemma-2B on recipe data using LoRA (see recipe-lm)
- Merged LoRA adapter into base model weights
- Converted to GGUF FP16 using convert_hf_to_gguf.py from llama.cpp
- Quantized to Q4_K_M using llama-quantize
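The merge, convert, and quantize steps above could be scripted roughly as follows. This is a minimal sketch, not the exact pipeline used for this model: the function names, output file names, and the paths to `convert_hf_to_gguf.py` and `llama-quantize` are illustrative assumptions, and the merge step assumes the `torch`, `transformers`, and `peft` packages are installed.

```python
import subprocess
import sys


def merge_lora(base_id: str, adapter_id: str, out_dir: str) -> None:
    """Merge a LoRA adapter into its base model and save the merged weights."""
    # Imported lazily so the command helpers below work without these packages.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.float16)
    merged = PeftModel.from_pretrained(base, adapter_id).merge_and_unload()
    merged.save_pretrained(out_dir)
    # The converter also needs the tokenizer files next to the weights.
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)


def gguf_commands(merged_dir: str, convert_script: str,
                  quantize_bin: str, out_stem: str) -> list[list[str]]:
    """Build the convert and quantize commands as argument lists."""
    f16_path = f"{out_stem}.f16.gguf"      # hypothetical intermediate file name
    q4_path = f"{out_stem}.q4_k_m.gguf"
    convert = [sys.executable, convert_script, merged_dir,
               "--outfile", f16_path, "--outtype", "f16"]
    quantize = [quantize_bin, f16_path, q4_path, "Q4_K_M"]
    return [convert, quantize]


def run_pipeline(base_id: str, adapter_id: str, merged_dir: str,
                 convert_script: str, quantize_bin: str, out_stem: str) -> None:
    """Run merge, conversion, and quantization in order, stopping on failure."""
    merge_lora(base_id, adapter_id, merged_dir)
    for cmd in gguf_commands(merged_dir, convert_script, quantize_bin, out_stem):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("google/gemma-2b", "ClaireLee2429/gemma-2b-recipes-lora", "merged", "llama.cpp/convert_hf_to_gguf.py", "llama.cpp/build/bin/llama-quantize", "model")` would then produce `model.q4_k_m.gguf`, assuming llama.cpp is checked out and built at those (hypothetical) paths.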
Usage
With llama-cpp-python
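from huggingface_hub import hf_hub_download
from llama_cpp import Llama

# Download the quantized model file from the Hub (cached after the first call)
model_path = hf_hub_download(
    repo_id="ClaireLee2429/gemma-2b-recipes-gguf",
    filename="model.q4_k_m.gguf",
)

# Load the model; n_ctx=2048 is below the 8192-token maximum and keeps memory use lower
llm = Llama(model_path=model_path, n_threads=8, n_ctx=2048)

# Generate a recipe from a plain-text prompt
output = llm.create_completion(
    "Recipe for chocolate chip cookies:\n",
    max_tokens=256,
    temperature=0.7,
    top_p=0.9,
    repeat_penalty=1.2,
)
print(output["choices"][0]["text"])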
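For interactive use it can be nicer to print the recipe as it is generated instead of waiting for the full completion. The sketch below uses the `stream=True` option of `create_completion`, which returns an iterator of partial results. The helper name `stream_recipe` and the prompt template are assumptions chosen to match the example above.

```python
from typing import Any, Iterator


def stream_recipe(llm: Any, dish: str, max_tokens: int = 256) -> Iterator[str]:
    """Yield pieces of recipe text as the model generates them."""
    prompt = f"Recipe for {dish}:\n"  # assumed prompt format, mirroring the example
    chunks = llm.create_completion(
        prompt,
        max_tokens=max_tokens,
        temperature=0.7,
        top_p=0.9,
        repeat_penalty=1.2,
        stream=True,  # return an iterator of partial results
    )
    for chunk in chunks:
        yield chunk["choices"][0]["text"]
```

With an `llm` object loaded as shown earlier, `for piece in stream_recipe(llm, "pasta carbonara"): print(piece, end="", flush=True)` prints the text incrementally.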
With llama.cpp CLI
./llama-cli -m model.q4_k_m.gguf -p "Recipe for pasta carbonara:" -n 256
Example Output
Recipe for chocolate chip cookies:
- 1/2 cup butter
- 1/3 cup sugar
- 1 egg
- 1/4 teaspoon vanilla
- 2/3 cup white flour
- 1/3 cup all-purpose flour
- 1/8 teaspoon baking soda
- 1/8 teaspoon salt
- 1/2 teaspoon cinnamon
- 1/4 cup chocolate chips
Directions:
- Sift together the flours.
- Add in salt and baking powder and mix.
- Add in vanilla, egg and sugar and mix well.
- Roll out on a lightly floured board and cut into desired shapes
and place on an ungreased cookie sheet.
- Bake at 375 degrees for 10-12 minutes.
Performance
Measured on Apple M-series (Metal); the CPU-only server figures are estimates:
| Environment | Time to first token | Tokens/sec |
|---|---|---|
| Apple Silicon (Metal) | ~0.1s | ~90 tok/s |
| 8 vCPU server (CPU only) | ~1-2s | ~10-20 tok/s |
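To get comparable numbers on other hardware, time to first token and decode throughput can be measured with a streaming completion. This is a minimal sketch and not the benchmark behind the table above: `measure_throughput` is a hypothetical helper, and it counts streamed chunks as tokens, which is only an approximation.

```python
import time
from typing import Any, Dict


def measure_throughput(llm: Any, prompt: str, max_tokens: int = 256) -> Dict[str, float]:
    """Time a streaming completion and report latency and throughput."""
    start = time.perf_counter()
    first = None
    n_chunks = 0
    for _ in llm.create_completion(prompt, max_tokens=max_tokens, stream=True):
        if first is None:
            first = time.perf_counter()  # arrival of the first streamed chunk
        n_chunks += 1
    end = time.perf_counter()

    if first is None:  # the model produced no output
        return {"ttft_s": float("nan"), "tokens_per_s": 0.0, "chunks": 0}

    decode_time = end - first
    # Throughput is measured over the chunks that follow the first one.
    tokens_per_s = (n_chunks - 1) / decode_time if n_chunks > 1 and decode_time > 0 else 0.0
    return {"ttft_s": first - start, "tokens_per_s": tokens_per_s, "chunks": n_chunks}
```

Running `measure_throughput(llm, "Recipe for chocolate chip cookies:\n")` a few times and discarding the first run helps separate model warm-up from steady-state speed.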
Related
- Training pipeline & server: glee2429/recipe-lm
- Web frontend: glee2429/kitchen-genie
- Live demo: ClaireLee2429/recipe-lm-api (HuggingFace Space)