vpyn/Qwen3.5-397B-A17B-CARVE-v1-NVFP4

Unlocked (abliterated) version of Qwen/Qwen3.5-397B-A17B, quantized to NVFP4.

Benchmark Results

Capability Benchmarks (thinking=false)

Benchmark	CARVE NVFP4 (es=0.75)	Reference NVFP4 (nvidia)
MMLU 54 (temp=0.6)	94.4% (51/54)	88.9% (48/54)
GSM8K 20 (temp=0.6)	95% (19/20)	95% (19/20)
HumanEval 164 (temp=0.2)	90.9% (149/164)	not tested

MMLU-Pro Comparison (seed=42, 600 random questions, thinking=true, temp=0.6)

Model	MMLU-Pro (5% sample)	vs. official 87.8%	Time	Notes
CARVE NVFP4 (es=0.75)	86.7% (520/600)	-1.1pp	46 min	0 errors
Reference NVFP4 (nvidia)	86.7% (520/600)	-1.1pp	54 min	0 errors

Per-category comparison (CARVE vs Reference):

Category	CARVE	Reference	Delta
biology	96.4% (27/28)	96.4% (27/28)	0
computer science	94.7% (18/19)	94.7% (18/19)	0
chemistry	93.1% (67/72)	91.7% (66/72)	+1.4
math	91.4% (53/58)	91.4% (53/58)	0
health	89.2% (33/37)	91.9% (34/37)	-2.7
business	89.2% (33/37)	89.2% (33/37)	0
economics	89.2% (33/37)	86.5% (32/37)	+2.7
other	86.0% (37/43)	90.7% (39/43)	-4.7
psychology	84.6% (33/39)	87.2% (34/39)	-2.6
physics	84.6% (55/65)	78.5% (51/65)	+6.1
history	85.0% (17/20)	80.0% (16/20)	+5.0
philosophy	84.4% (27/32)	84.4% (27/32)	0
engineering	78.3% (36/46)	82.6% (38/46)	-4.3
law	76.1% (51/67)	77.6% (52/67)	-1.5

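Both CARVE and Reference score exactly 86.7% (520/600), 1.1 pp below the official 87.8%.
Per-category differences range from -4.7 to +6.1 pp (4 categories favor CARVE, 5 favor Reference, 5 are tied), which is consistent with noise at these sample sizes (19 to 72 questions per category).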
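
As a rough check on the noise claim, the sketch below estimates the 95% margin of error for a measured accuracy using the normal approximation to the binomial. The function name `binomial_margin` is hypothetical, and the calculation assumes the questions are independent; it is a sanity check, not the method used to produce the tables.

```python
import math

def binomial_margin(correct: int, total: int, z: float = 1.96) -> float:
    """Approximate margin of error, in percentage points, for an accuracy
    of correct/total (normal approximation; z=1.96 gives roughly 95%)."""
    p = correct / total
    return 100.0 * z * math.sqrt(p * (1.0 - p) / total)

# Counts copied from the MMLU-Pro tables above.
overall = binomial_margin(520, 600)   # full 600-question sample
physics = binomial_margin(55, 65)     # category with the largest delta

print(f"overall 520/600: +/-{overall:.1f} pp")
print(f"physics 55/65:   +/-{physics:.1f} pp")
```

Under these assumptions the overall score carries a margin of roughly ±2.7 pp and the physics score roughly ±8.8 pp, so a per-category gap of about 6 pp sits inside the uncertainty of a single category.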

CARVE MTP=2 Crossover Test

Context	No MTP (tok/s)	MTP=2 (tok/s)	Delta	MTP wins?
short (~25 tok)	74	106	+43%	YES
10k (9025 tok)	73.7	84.1	+14%	YES
20k (18025 tok)	70.2	69.0	-2%	~TIE
50k (46025 tok)	68.9	43.2	-37%	NO
100k (93025 tok)	66.7	26.5	-60%	NO
151k (prior)	67	19	-72%	NO

Crossover point: ~20k tokens of input context.

Below 20k: MTP=2 wins (+14-43%)
At 20k: effectively tied
Above 20k: MTP=2 gets progressively worse (-37% at 50k, -60% at 100k, -72% at 151k)
The MTP acceptance rate degrades with context length for the abliterated weights
No-MTP throughput stays nearly flat: 74→67 tok/s from short context to 151k (only -9%)
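
The Delta column and the crossover estimate can be reproduced from the table with a short sketch. The helper names `relative_delta` and `crossover` are hypothetical, the short-context row is entered as 25 tokens (the table gives ~25), and the 151k row is left out because its exact token count is not listed. Linear interpolation between measured points is an assumption; the real curve between 10k and 20k was not measured.

```python
# (context tokens, no-MTP tok/s, MTP=2 tok/s), copied from the table above.
rows = [
    (25, 74.0, 106.0),
    (9025, 73.7, 84.1),
    (18025, 70.2, 69.0),
    (46025, 68.9, 43.2),
    (93025, 66.7, 26.5),
]

def relative_delta(base: float, mtp: float) -> float:
    """Percent change in throughput when MTP=2 is enabled."""
    return 100.0 * (mtp - base) / base

def crossover(points):
    """Context length where MTP=2 stops being faster, found by linear
    interpolation between the two neighboring measurements."""
    for (c0, b0, m0), (c1, b1, m1) in zip(points, points[1:]):
        d0, d1 = m0 - b0, m1 - b1  # throughput advantage of MTP=2
        if d0 > 0 >= d1:
            return c0 + (c1 - c0) * d0 / (d0 - d1)
    return None  # no sign change in the measured range

for ctx, base, mtp in rows:
    print(f"{ctx:>6} tok: {relative_delta(base, mtp):+.0f}%")

crossover_tokens = crossover(rows)
print(f"interpolated crossover: ~{crossover_tokens:.0f} tokens")
```

With these numbers the interpolated crossover falls at about 17k tokens, slightly below the 18k measurement where the two configurations are effectively tied, which agrees with the ~20k figure stated above.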
