LFM2.5-1.2B-Instruct Q4_K_M (Big-Endian)

Big-endian GGUF conversion of LiquidAI/LFM2.5-1.2B-Instruct for IBM AIX and other big-endian POWER systems.

Why Big-Endian?

The GGUF format stores all weights and metadata in little-endian byte order. Loading a standard GGUF file on a big-endian system (AIX, z/OS, etc.) produces garbage: every number is byte-reversed. llama.cpp does not perform runtime byte swapping; it detects the mismatch and fails:

```text
failed to load model: this GGUF file version is extremely large,
is there a mismatch between the host and model endianness?
```

This pre-converted model works directly on big-endian systems without any additional conversion step.
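The "extremely large version" symptom comes from reading the header's little-endian uint32 version field with the wrong byte order. A minimal sketch of the misread (the version number 3 is illustrative; actual GGUF versions vary):

```python
import struct

# The GGUF header stores a uint32 version in little-endian byte order.
# Reading those same bytes as big-endian turns a small version number
# into a huge one, which is the mismatch the error message hints at.
version_on_disk = struct.pack("<I", 3)  # version 3, as written little-endian

correct = struct.unpack("<I", version_on_disk)[0]  # little-endian read
misread = struct.unpack(">I", version_on_disk)[0]  # big-endian host misread

print(correct)  # 3
print(misread)  # 50331648 (0x03000000)
```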

Model Details

| Field | Value |
|---|---|
| Base model | LiquidAI/LFM2.5-1.2B-Instruct |
| Architecture | lfm2 hybrid (10 shortconv + 6 GQA attention layers) |
| Parameters | 1.17B |
| Quantization | Q4_K_M |
| Context window | 128,000 tokens |
| File size | 695 MB |
| Endianness | Big-endian |
| Source GGUF | LiquidAI/LFM2.5-1.2B-Instruct-GGUF |

Performance on IBM POWER9 (AIX 7.3)

Tested on IBM Power System S924 (POWER9 @ 2.75 GHz), SMT-2 mode:

| Threads | Generation (tok/s) |
|---|---|
| 1 | 3.15 |
| 4 | 10.52 |
| 8 | 18.28 |
| 16 | 26.9 |
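As a quick back-of-the-envelope calculation (derived from the table above, not a separate benchmark), 16 threads give roughly an 8.5x speedup over a single thread:

```python
# Speedup and per-thread efficiency from the benchmark numbers above.
rates = {1: 3.15, 4: 10.52, 8: 18.28, 16: 26.9}  # threads -> tok/s
base = rates[1]
for threads, toks in sorted(rates.items()):
    speedup = toks / base
    efficiency = speedup / threads
    print(f"{threads:>2} threads: {speedup:.2f}x speedup, {efficiency:.0%} efficiency")
```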

Memory usage: ~744 MB (model + compute buffers at 16 threads).

Quick Start (AIX)

```shell
# Clone and build llama.cpp for AIX
git clone https://gitlab.com/librepower/llama-aix.git
cd llama-aix
./scripts/fetch_upstream.sh
./scripts/build_aix_73.sh

# Download this model
wget https://huggingface.co/librepowerai/LFM2.5-1.2B-Instruct-Q4_K_M-BE/resolve/main/LFM2.5-1.2B-Instruct-Q4_K_M-be.gguf

# Set optimal SMT mode
smtctl -t 2 -w now

# Run inference
export LIBPATH=$PWD/build/bin:$LIBPATH
./build/bin/llama-simple \
    -m LFM2.5-1.2B-Instruct-Q4_K_M-be.gguf \
    -n 256 -t 16 \
    "You are an AIX admin. Analyze this error log entry:"
```

How This Model Was Converted

```shell
# From the original little-endian GGUF
pip install gguf
echo "YES" | python3 -m gguf.scripts.gguf_convert_endian \
    LFM2.5-1.2B-Instruct-Q4_K_M.gguf big
```

The conversion swaps every tensor, metadata field, and quantization block from little-endian to big-endian. The process takes about 15 seconds on a modern laptop.
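For plain (non-quantized) tensors, the swap amounts to reversing the byte order of every element; quantized blocks such as Q4_K additionally need their internal scale fields swapped individually. A simplified sketch of the float32 case (the helper name is ours, not part of the gguf package):

```python
import struct

# Reverse the byte order of every 4-byte element in a float32 buffer:
# read the values as little-endian, write them back as big-endian.
def swap_f32_buffer(data: bytes) -> bytes:
    count = len(data) // 4
    values = struct.unpack(f"<{count}f", data)   # interpret as little-endian
    return struct.pack(f">{count}f", *values)    # re-emit as big-endian

le = struct.pack("<2f", 1.0, -2.5)
be = swap_f32_buffer(le)
print(be.hex())  # 3f800000c0200000 (each 4-byte word byte-reversed)
```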

License

This model inherits the license from the base model: LiquidAI/LFM2.5-1.2B-Instruct.


LibrePower: Unlocking IBM Power Systems through open source.

https://librepower.org | hello@librepower.org
