# Qwen2.5-7B-Instruct OOM (Q8)

Qwen2.5-7B-Instruct converted to OomLlama's `.oom` format with Q8 quantization.
## Model Details
| Property | Value |
|---|---|
| Base Model | Qwen/Qwen2.5-7B-Instruct |
| Format | .oom (OomLlama Model) |
| Quantization | Q8 (8-bit, 256-block) |
| Source Precision | bf16 (SafeTensors) |
| File Size | 7.5 GB |
| Tensors | 339 |
| Parameters | 7.6B |
## Architecture
| Component | Value |
|---|---|
| Hidden Size | 3584 |
| Layers | 28 |
| Q-Heads | 28 |
| KV-Heads | 4 |
| Head Dim | 128 |
| Intermediate | 18944 |
| Vocab | 152,064 |
| RoPE | Interleaved, theta=1,000,000 |
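The numbers above are internally consistent with grouped-query attention (GQA): 28 query heads share 4 key/value heads, and the hidden size equals `q_heads × head_dim`. A quick sanity check of the projection shapes the table implies (illustrative arithmetic, not OomLlama code):

```python
# Values from the architecture table above
hidden_size = 3584
q_heads, kv_heads, head_dim = 28, 4, 128

# Hidden size factors exactly into query heads
assert hidden_size == q_heads * head_dim  # 28 * 128 = 3584

# Per-layer attention projection shapes implied by GQA:
q_proj_shape = (hidden_size, q_heads * head_dim)    # (3584, 3584)
kv_proj_shape = (hidden_size, kv_heads * head_dim)  # (3584, 512), for K and for V
print(q_proj_shape, kv_proj_shape)
```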
## Conversion

Converted directly from the bf16 SafeTensors checkpoint using OomLlama's `safetensors2oom` converter. Conversion is a single quantization step (bf16 → Q8) with no intermediate format, so no accuracy is lost to repeated re-quantization.
- Weights: Q8 quantized (256 values per block, with per-block scale + min)
- Norms/Biases: Stored as F32 (lossless)
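The Q8 scheme described above (256 values per block, per-block scale and min) can be sketched as affine 8-bit quantization. This is a minimal illustration of the idea, not the converter's actual code:

```python
import numpy as np

BLOCK_SIZE = 256  # values per Q8 block, per the format spec

def quantize_block(values: np.ndarray):
    """Affine 8-bit quantization: byte = round((v - min) / scale)."""
    vmin = float(values.min())
    vmax = float(values.max())
    scale = (vmax - vmin) / 255.0 or 1.0  # guard against constant blocks
    q = np.round((values - vmin) / scale).astype(np.uint8)
    return scale, vmin, q

def dequantize_block(scale: float, vmin: float, q: np.ndarray) -> np.ndarray:
    """Inverse mapping, matching 'value = byte * scale + min'."""
    return q.astype(np.float32) * scale + vmin

block = np.linspace(-1.0, 1.0, BLOCK_SIZE, dtype=np.float32)
scale, vmin, q = quantize_block(block)
restored = dequantize_block(scale, vmin, q)
# Worst-case round-trip error is bounded by half a quantization step (scale / 2)
print(float(np.abs(restored - block).max()))
```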
## Verified Output

Tested with the OomLlama inference engine (Rust, CPU mode):

```text
Prompt: <|im_start|>user\nHello!<|im_end|>\n<|im_start|>assistant\n
Output: Hello! How can I assist you today?

Prompt: <|im_start|>user\nWhat is the meaning of life?<|im_end|>\n<|im_start|>assistant\n
Output: The meaning of life is... (coherent multi-paragraph response)
```
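The prompts above follow Qwen's ChatML turn format. A small helper to construct such prompts (an illustrative snippet, not part of the OomLlama API):

```python
def chatml_prompt(user_message: str) -> str:
    """Wrap a user message in Qwen2.5's ChatML turn markers."""
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

print(chatml_prompt("Hello!"))
```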
## Usage

### Install OomLlama

```bash
pip install oomllama
```

### Download and Run

```python
from huggingface_hub import hf_hub_download
from oomllama import OomLlama

# Download the .oom file
model_path = hf_hub_download(
    repo_id="jaspervandemeent/Qwen2.5-7B-Instruct-OOM",
    filename="qwen2.5-7b-instruct-q8.oom",
)

# Run with OomLlama
llm = OomLlama(model_path)
response = llm.generate("What is the meaning of life?")
print(response)
```
### Rust CLI

```bash
# Download
wget https://brein.jaspervandemeent.nl/downloads/oomllama-linux-x86_64
chmod +x oomllama-linux-x86_64

# Run inference
./oomllama-linux-x86_64 --model qwen2.5-7b-instruct-q8.oom --prompt "Hello!"
```
## Convert Your Own

Convert any HuggingFace model to `.oom` format:

```bash
pip install oomllama
safetensors2oom Qwen/Qwen2.5-7B-Instruct output.oom
```
## The .oom Format

```text
Header:  "OOML" (4 bytes) + version (u32) + num_tensors (u32)
Tensor:  name_len (u32) + name + quant_type (u8) + num_blocks (u32) + total_values (u32)
Block:   scale (f32) + min (f32) + data_len (u32) + quantized_bytes (256)
```

Dequantization: `value = byte * scale + min`
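A minimal reader for the fixed-size header following the layout above. It assumes little-endian integer encoding, which the spec does not state explicitly:

```python
import struct

def read_oom_header(buf: bytes):
    """Parse the 12-byte .oom header: 4-byte magic, u32 version, u32 tensor count."""
    magic = buf[:4]
    if magic != b"OOML":
        raise ValueError(f"not an .oom file: magic={magic!r}")
    version, num_tensors = struct.unpack_from("<II", buf, 4)
    return version, num_tensors

# Round-trip against a synthetic header (339 tensors, as in this model)
header = b"OOML" + struct.pack("<II", 1, 339)
print(read_oom_header(header))
```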
## Credits
- OomLlama Engine: Root AI & Jasper (Humotica AI Lab)
- Base Model: Alibaba Cloud (Qwen team)
- License: Apache 2.0 (following the base model's license)
Built by Humotica AI Lab - Jasper, Claude, Gemini