Commit History

README: document -DGGML_SCHED_MAX_SPLIT_INPUTS=128 flag for multi-GPU CUDA builds
3efdad2
verified

cchuter committed on

README: CUDA support landed (19/19 on RTX 5090), branch renamed to feat/v4-port-cuda
ea169c1
verified

cchuter committed on

README: all IQ-class quants now ship with chat_template baked in (--chat-template-file no longer required)
49308dc
verified

cchuter committed on

IQ2_XS-XL shard 1: embed chat_template (dsml.jinja)
9ab762c
verified

cchuter committed on

IQ2_XXS-XL shard 1: embed chat_template (dsml.jinja)
abd9404
verified

cchuter committed on

IQ1_M-XL shard 1: embed chat_template (dsml.jinja)
868488d
verified

cchuter committed on

IQ1_M shard 1: embed chat_template (dsml.jinja)
4176535
verified

cchuter committed on

IQ1_S-XL shard 1: embed chat_template (dsml.jinja) so --chat-template-file flag is no longer required at runtime
71c338a
verified

cchuter committed on

README: explicit CUDA / non-Metal backends not supported (Metal-only ops)
8224b05
verified

cchuter committed on

README: all quants now pass gate-tools (Q8/Q4/Q2 ship template baked in; IQ-class needs --chat-template-file imatrix/dsml.jinja)
ed189bf
verified

cchuter committed on

Add DSML chat template (use with --chat-template-file for IQ1/IQ2 quants)
3a2c507
verified

cchuter committed on

Bake DSML chat_template into Q2_K-XL shard 1 (fixes tool-call format)
a192557
verified

cchuter committed on

Bake DSML chat_template into Q4_K_M-XL shard 1 (fixes tool-call format)
a6cad71
verified

cchuter committed on

README: tone down tool-calling claims to 'unverified', add Q4_K_M-XL row + recommendation
b993d14
verified

cchuter committed on

Add Q4_K_M-XL shards (4.92 BPW, 23.28 t/s, gate-tools needs further investigation)
ad133b9
verified

cchuter committed on

Update README: add Q2_K-XL/IQ1/IQ2 rows, document tool-calling status, recommend Q2_K-XL for non-tool use
12b0841
verified

cchuter committed on

Add wikitext-103 imatrix calibration (1000 chunks, V4 Flash)
50ea77c
verified

cchuter committed on

Add IQ2_XS-XL shards (failed gate-tools — see README)
4b2bcdc
verified

cchuter committed on

Add IQ2_XXS-XL shards (failed gate-tools — see README)
2867d35
verified

cchuter committed on

Add IQ1_M-XL shards (failed gate-tools — see README)
a197fa6
verified

cchuter committed on

Add IQ1_M shards (failed gate-tools — see README)
6025005
verified

cchuter committed on

Add IQ1_S-XL shards (failed gate-tools — see README)
8900d49
verified

cchuter committed on

Add Q2_K-XL shards (currently recommended for V4 agent use)
5354105
verified

cchuter committed on

Add Q8_0 shards (7 × ~45 GiB)
e769bc6
verified

cchuter committed on

Add model card
eb60abf
verified

cchuter committed on

initial commit
5d31a3e
verified

cchuter committed on