OLMo-2-1B-openai-gsm8k-GGUF

This repository contains GGUF quantized versions of Fu01978/OLMo-2-1B-openai-gsm8k, intended for efficient inference with llama.cpp-compatible runtimes.

What’s in this repo

  • GGUF quantized files for inference
  • No training code
  • No safetensors weights

What’s NOT in this repo

  • The original full-precision model weights
  • Training or fine-tuning scripts

Base Model

These quantizations are derived from:

Fu01978/OLMo-2-1B-openai-gsm8k
👉 https://huggingface.co/Fu01978/OLMo-2-1B-openai-gsm8k

Please refer to the base model card for:

  • Training data
  • Intended use
  • Limitations

Usage

Example with llama.cpp (recent builds ship the CLI as llama-cli; older builds named it main):

./llama-cli \
  -m OLMo-2-1B-openai-gsm8k-Q4_K_M.gguf \
  -p "Solve: 23 + 19 ="

Pass the specific quantized file you downloaded to -m; the 4-bit filename above is only an example.
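To fetch a single quant file from this repo, one option is the Hugging Face CLI. This is a sketch assuming huggingface_hub is installed and that the repo's filenames contain the quant type (check the actual file list):

```shell
# Install the Hugging Face CLI if needed (assumes pip is available)
pip install -U huggingface_hub

# Download only the 4-bit GGUF file into the current directory.
# The "*Q4*.gguf" pattern is an assumption -- adjust to the repo's real filenames.
huggingface-cli download Fu01978/OLMo-2-1B-openai-gsm8k-GGUF \
  --include "*Q4*.gguf" \
  --local-dir .
```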

Notes on Quantization

  • Quantization may slightly reduce accuracy relative to the full-precision base model
  • Lower-bit files are smaller and faster to run, and use less RAM/VRAM, at the cost of somewhat more accuracy loss
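For reference, GGUF quants like these are typically produced with llama.cpp's conversion script and llama-quantize tool. A minimal sketch, assuming a local copy of the base checkpoint (the filenames below are hypothetical):

```shell
# Convert the original HF checkpoint to an f16 GGUF
# (convert_hf_to_gguf.py ships with llama.cpp)
python convert_hf_to_gguf.py ./OLMo-2-1B-openai-gsm8k \
  --outfile olmo2-1b-f16.gguf

# Quantize to 4-bit (Q4_K_M); other common targets: Q5_K_M, Q6_K, Q8_0
./llama-quantize olmo2-1b-f16.gguf olmo2-1b-Q4_K_M.gguf Q4_K_M
```

This is not necessarily how the files in this repo were produced; see the base model card for authoritative details.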
Model details

  • Format: GGUF
  • Model size: 1B params
  • Architecture: olmo2
  • Available quantizations: 4-bit, 5-bit, 6-bit, 8-bit