
Unlocked version of Qwen/Qwen3.5-397B-A17B

Benchmark Results

Capability Benchmarks (thinking=false)

| Benchmark | n (temp) | CARVE NVFP4 (es=0.75) | Reference NVFP4 (nvidia) |
|---|---|---|---|
| MMLU | 54 (temp=0.6) | 94.4% (51/54) | 88.9% (48/54) |
| GSM8K | 20 (temp=0.6) | 95% (19/20) | 95% (19/20) |
| HumanEval | 164 (temp=0.2) | 90.9% (149/164) | not tested |

MMLU-Pro Comparison (seed=42, 600 random questions, thinking=true, temp=0.6)

| Model | MMLU-Pro | vs official (87.8%) | Time | Notes |
|---|---|---|---|---|
| CARVE NVFP4 (es=0.75) | 86.7% (520/600) | -1.1pp | 46 min | 0 errors |
| Reference NVFP4 (nvidia) | 86.7% (520/600) | -1.1pp | 54 min | 0 errors |

Per-category comparison (CARVE vs Reference):

| Category | CARVE | Reference | Delta (pp) |
|---|---|---|---|
| biology | 96.4% (27/28) | 96.4% (27/28) | 0 |
| computer science | 94.7% (18/19) | 94.7% (18/19) | 0 |
| chemistry | 93.1% (67/72) | 91.7% (66/72) | +1.4 |
| math | 91.4% (53/58) | 91.4% (53/58) | 0 |
| health | 89.2% (33/37) | 91.9% (34/37) | -2.7 |
| business | 89.2% (33/37) | 89.2% (33/37) | 0 |
| economics | 89.2% (33/37) | 86.5% (32/37) | +2.7 |
| other | 86.0% (37/43) | 90.7% (39/43) | -4.7 |
| psychology | 84.6% (33/39) | 87.2% (34/39) | -2.6 |
| physics | 84.6% (55/65) | 78.5% (51/65) | +6.1 |
| history | 85.0% (17/20) | 80.0% (16/20) | +5.0 |
| philosophy | 84.4% (27/32) | 84.4% (27/32) | 0 |
| engineering | 78.3% (36/46) | 82.6% (38/46) | -4.3 |
| law | 76.1% (51/67) | 77.6% (52/67) | -1.5 |
  • Both CARVE and Reference score exactly 86.7% (520/600)
  • Per-category variations are noise (±6pp, evenly distributed)
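As a rough sanity check on the noise claim (my own back-of-envelope calculation, not part of the evaluation), treating each run as an independent binomial on n questions gives a standard error for the score difference. The helper below is illustrative; note the runs actually share the same 600 questions, so the independence assumption is an approximation.

```python
import math

def delta_and_se(correct_a, correct_b, n):
    """Return (delta, standard error) in percentage points for two accuracy
    scores on n questions, treating the runs as independent binomials
    (an approximation: the runs actually answered the same questions)."""
    p = (correct_a + correct_b) / (2 * n)     # pooled accuracy estimate
    se = math.sqrt(2 * p * (1 - p) / n)      # s.e. of the score difference
    return (correct_a - correct_b) / n * 100, se * 100

# Largest swing in the table: physics, 55/65 (CARVE) vs 51/65 (Reference)
delta, se = delta_and_se(55, 51, 65)
print(f"physics: {delta:+.1f}pp delta vs ~{se:.1f}pp one-sigma noise")
```

Even the largest per-category delta (physics, about +6.1pp) sits within roughly one standard error (~6.8pp) at these sample sizes, consistent with the variations being noise.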

CARVE MTP=2 Crossover Test

| Context | No MTP (tok/s) | MTP=2 (tok/s) | Delta | MTP wins? |
|---|---|---|---|---|
| short (~25 tok) | 74 | 106 | +43% | YES |
| 10k (9025 tok) | 73.7 | 84.1 | +14% | YES |
| 20k (18025 tok) | 70.2 | 69.0 | -2% | ~TIE |
| 50k (46025 tok) | 68.9 | 43.2 | -37% | NO |
| 100k (93025 tok) | 66.7 | 26.5 | -60% | NO |
| 151k (prior) | 67 | 19 | -72% | NO |

Crossover point: ~20k tokens of input context.

  • Below 20k: MTP=2 wins (+14% to +43%)
  • At 20k: effectively tied
  • Above 20k: MTP=2 gets progressively worse: -37% at 50k, -60% at 100k, -72% at 151k
  • MTP acceptance rate degrades with context length for abliterated weights
  • No-MTP throughput stays remarkably flat: 74 → 67 tok/s across 0-151k context (only -9%)
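The ~20k crossover can be estimated from the table by linearly interpolating the MTP speed advantage between the two bracketing measurements. This is an illustrative sketch of that arithmetic; the `crossover` helper is my own, and the data tuples are just the table rows re-typed.

```python
def crossover(points):
    """Estimate the context length where MTP=2 stops helping.

    points: list of (context_tokens, no_mtp_tps, mtp2_tps), sorted by context.
    Linearly interpolates within the interval where the MTP advantage
    flips sign; returns None if it never flips.
    """
    for (x0, a0, b0), (x1, a1, b1) in zip(points, points[1:]):
        d0, d1 = b0 - a0, b1 - a1          # MTP speed advantage at each point
        if d0 >= 0 and d1 < 0:             # advantage changes sign here
            return x0 + (x1 - x0) * d0 / (d0 - d1)
    return None

# Measurements from the table above (tokens, tok/s without MTP, tok/s with MTP=2)
data = [(9025, 73.7, 84.1), (18025, 70.2, 69.0), (46025, 68.9, 43.2)]
print(crossover(data))  # lands a bit under 20k tokens
```

Linear interpolation between the 10k and 20k rows puts the break-even point in the 17-18k range, consistent with the ~20k figure quoted above.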