Ed Addario (eaddario) · PRO
87 followers · 31 following
AI & ML interests
Finding ways to optimize LLM inference performance in resource-constrained environments (e.g., commodity hardware, desktops, laptops, mobiles, and edge devices).
Recent Activity
Replied to their post · about 3 hours ago:
Experimental global target bits-per-weight quantization of Qwen/Qwen3.5-4B and Qwen/Qwen3.5-9B.

Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters most, producing high-quality models that meet a precise global file-size target.

Key advantages:
- VRAM maximization: generates high-quality models sized exactly to fit hardware constraints (e.g., fitting a model into exactly 24 GB of VRAM).
- Data-driven precision: the quantization mix is determined by actual weight error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs.

Full benchmarks (PPL, KLD, ARC, MMLU, etc.) and methodology are in the model cards:
https://huggingface.co/eaddario/Qwen3.5-4B-GGUF
https://huggingface.co/eaddario/Qwen3.5-9B-GGUF
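The post describes the allocation idea but not the mechanics. As a rough illustration, below is a minimal, hypothetical sketch of one way a budgeted per-tensor allocator could work: start every tensor at its cheapest precision, then greedily upgrade whichever tensor buys the largest error reduction per extra bit until the global bits-per-weight budget is spent. All names (`TensorInfo`, `allocate_bits`, `error_at_bpw`) and numbers are invented for illustration; this is not the author's Target BPW implementation nor part of llama.cpp.

```python
from dataclasses import dataclass

@dataclass
class TensorInfo:
    name: str
    n_weights: int  # number of weights in this tensor
    # Estimated quantization error (e.g., imatrix-weighted MSE) for each
    # candidate precision, keyed by bits-per-weight. Purely illustrative.
    error_at_bpw: dict[float, float]

def allocate_bits(tensors: list[TensorInfo], target_bpw: float) -> dict[str, float]:
    """Pick a per-tensor BPW whose size-weighted average stays within target_bpw."""
    total_weights = sum(t.n_weights for t in tensors)
    budget_bits = target_bpw * total_weights

    # Start every tensor at its cheapest available precision.
    assignment = {t.name: min(t.error_at_bpw) for t in tensors}
    used_bits = sum(assignment[t.name] * t.n_weights for t in tensors)

    while True:
        best = None  # (error saved per extra bit, tensor, new bpw, extra bits)
        for t in tensors:
            cur = assignment[t.name]
            for bpw, err in t.error_at_bpw.items():
                if bpw <= cur:
                    continue  # not an upgrade
                extra = (bpw - cur) * t.n_weights
                if used_bits + extra > budget_bits:
                    continue  # would exceed the global size target
                gain = (t.error_at_bpw[cur] - err) / extra
                if best is None or gain > best[0]:
                    best = (gain, t, bpw, extra)
        if best is None:
            break  # no affordable upgrade remains
        _, t, bpw, extra = best
        assignment[t.name] = bpw
        used_bits += extra
    return assignment

# Made-up example: with a 4.0 BPW budget, the more error-sensitive tensor
# is upgraded first while the global average stays under target.
tensors = [
    TensorInfo("blk.0.attn_q", 4_000_000, {2.5: 0.90, 4.5: 0.20, 6.5: 0.05}),
    TensorInfo("blk.0.ffn_up", 8_000_000, {2.5: 0.30, 4.5: 0.10, 6.5: 0.04}),
]
print(allocate_bits(tensors, target_bpw=4.0))
# -> {'blk.0.attn_q': 6.5, 'blk.0.ffn_up': 2.5} (average ~3.83 BPW)
```

A real implementation would presumably search over llama.cpp's discrete quant types rather than arbitrary BPW values, but the budget-constrained, sensitivity-driven selection is the core idea the post describes.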
New activity on eaddario/imatrix-calibration · about 3 hours ago:
"Great collection, I'm using it for my little project."
Replied to the same post · 2 days ago.
eaddario's activity
Liked 2 models · 9 months ago:
- dphn/Dolphin-Mistral-24B-Venice-Edition · Text Generation · 24B params · Updated Sep 8, 2025 · 84.2k downloads · 483 likes
- marcelbinz/Llama-3.1-Centaur-70B · Text Generation · 71B params · Updated Jul 1, 2025 · 754 downloads · 82 likes
Liked 5 datasets · 11 months ago:
- nvidia/OpenMathInstruct-2 · Viewer · Updated Nov 25, 2024 · 22M rows · 22.8k downloads · 236 likes
- Multilingual-Multimodal-NLP/McEval-Instruct · Viewer · Updated Jun 12, 2024 · 35.9k rows · 92 downloads · 37 likes
- ise-uiuc/Magicoder-Evol-Instruct-110K · Viewer · Updated Dec 28, 2023 · 111k rows · 5.46k downloads · 174 likes
- OpenCoder-LLM/opc-sft-stage2 · Viewer · Updated Nov 24, 2024 · 436k rows · 1.05k downloads · 103 likes
- Vezora/Open-Critic-GPT · Viewer · Updated Jul 28, 2024 · 55.1k rows · 24 downloads · 96 likes
Liked a model · about 1 year ago:
- MadeAgents/Hammer2.1-7b · Updated Jun 12, 2025 · 268 downloads · 32 likes