# phi-4-BonfyreFPQ3
BonfyreFPQ → BF16 safetensors format. Drop-in replacement; no special loader required.
## Usage
Standard BF16 safetensors → load directly in PyTorch, diffusers, or any HuggingFace-compatible framework:

```python
from safetensors.torch import load_file

weights = load_file("model.safetensors")
```
Or load with the transformers library as usual.
## Compression Method
BonfyreFPQ v9/v10 with Bonfyre Weight Algebra:
- Decompose W = L + R (truncated SVD)
- Prune R with hybrid structure-aware pruning
- Curl + divergence energy correction
- FPQ v9 multi-scale encode (LR + E8 + RVQ + QJL + Ghost)
- Decode back to BF16 safetensors
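The first step above can be sketched in plain NumPy. The rank `k` is an illustrative choice, and the pruning, energy-correction, and FPQ multi-scale encoding stages are specific to the BonfyreFPQ tool and not reproduced here:

```python
# Sketch of step 1: split a weight matrix W into a low-rank part L
# (best rank-k approximation via truncated SVD) and a residual R.
import numpy as np

def low_rank_split(W: np.ndarray, k: int):
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    L = (U[:, :k] * s[:k]) @ Vt[:k, :]  # rank-k component
    R = W - L                           # residual, later pruned and encoded
    return L, R

rng = np.random.default_rng(0)
W = rng.standard_normal((64, 32)).astype(np.float32)
L, R = low_rank_split(W, k=8)

# The split is exact by construction: W == L + R up to float rounding.
assert np.allclose(W, L + R, atol=1e-5)
```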
## Quality

Per-weight cosine similarity of ~0.9999 between original and decoded weights, at ~4 bits per weight. See the verified benchmarks.
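A minimal sketch of the per-weight cosine-similarity metric cited above; the lightly perturbed vector here only stands in for a decoded weight tensor, so the printed value is not the benchmark figure:

```python
# Sketch: cosine similarity between an original weight tensor and a
# (here, synthetically perturbed) reconstruction of it.
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    a = a.ravel().astype(np.float64)
    b = b.ravel().astype(np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

w = np.random.default_rng(1).standard_normal(1024)
w_decoded = w + 0.001 * np.random.default_rng(2).standard_normal(1024)

sim = cosine_similarity(w, w_decoded)
print(round(sim, 4))
```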
## Compressing Your Own Models
```shell
git clone https://github.com/Nickgonzales76017/bonfyre-oss.git && cd bonfyre-oss/cmd/BonfyreFPQ && make
./bonfyre-fpq algebra-compress input.safetensors output.safetensors --bits 3
```