OLMo-2-1B-openai-gsm8k-GGUF

This repository contains GGUF quantized versions of Fu01978/OLMo-2-1B-openai-gsm8k, intended for efficient inference with llama.cpp-compatible runtimes.

What’s in this repo

  • GGUF quantized files for inference
  • No training code
  • No safetensors weights

What’s NOT in this repo

  • The original full-precision model weights
  • Training or fine-tuning scripts

Base Model

These quantizations are derived from:

Fu01978/OLMo-2-1B-openai-gsm8k
👉 https://huggingface.co/Fu01978/OLMo-2-1B-openai-gsm8k

Please refer to the base model card for:

  • Training data
  • Intended use
  • Limitations

Usage

Example with llama.cpp (recent builds ship the CLI as llama-cli; older builds named it main):

./llama-cli \
  -m OLMo-2-1B-openai-gsm8k-Q4_K_M.gguf \
  -p "Solve: 23 + 19 ="

Pass the specific quantized file you downloaded to -m; the 4-bit filename above is only an example.
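To fetch a single quant file from this repo, one option is the Hugging Face CLI. This is a sketch assuming huggingface_hub is installed and that the repo's filenames contain the quant type (check the actual file list):

```shell
# Install the Hugging Face CLI if needed (assumes pip is available)
pip install -U huggingface_hub

# Download only the 4-bit GGUF file into the current directory.
# The "*Q4*.gguf" pattern is an assumption -- adjust to the repo's real filenames.
huggingface-cli download Fu01978/OLMo-2-1B-openai-gsm8k-GGUF \
  --include "*Q4*.gguf" \
  --local-dir .
```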

Notes on Quantization

  • Quantization may slightly reduce accuracy relative to the full-precision base model
  • Lower-bit files are smaller and faster to run, and use less RAM/VRAM, at the cost of somewhat more accuracy loss
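For reference, GGUF quants like these are typically produced with llama.cpp's conversion script and llama-quantize tool. A minimal sketch, assuming a local copy of the base checkpoint (the filenames below are hypothetical):

```shell
# Convert the original HF checkpoint to an f16 GGUF
# (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py ./OLMo-2-1B-openai-gsm8k \
  --outfile olmo2-1b-f16.gguf

# Quantize to 4-bit (Q4_K_M); other common targets: Q5_K_M, Q6_K, Q8_0
./llama-quantize olmo2-1b-f16.gguf olmo2-1b-Q4_K_M.gguf Q4_K_M
```

This is not necessarily how the files in this repo were produced; see the base model card for authoritative details.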
Model details

  • Format: GGUF
  • Model size: 1B params
  • Architecture: olmo2
  • Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit