Mac M3 Ultra 256GB
With the new update, is the documentation accurate? Question at bottom
https://unsloth.ai/docs/models/qwen3.5#qwen3.5-397b-a17b
"
Qwen3.5-397B-A17B
Qwen3.5-397B-A17B is in the same performance tier as Gemini 3 Pro, Claude Opus 4.5, and GPT-5.2. The full 397B checkpoint is ~807GB on disk, but via Unsloth's 397B GGUFs you can run:
• 3-bit: fits on 192GB RAM systems (e.g., a 192GB Mac)
• 4-bit (MXFP4): fits on 256GB RAM. Unsloth 4-bit dynamic UD-Q4_K_XL is ~214GB on disk - loads directly on a 256GB M3 Ultra
• Runs on a single 24GB GPU + 256GB system RAM via MoE offloading, reaching 25+ tokens/s
• 8-bit needs ~512GB RAM/VRAM
"
Which is the ideal model to load on an M3 Ultra 256GB (60-core)? Simply whichever fits?
Additionally, are there specific models that work best on this Mac Studio, or does it not matter? E.g., should I use q8 vs. q8_X_X for the other updated models?
Thank you again for the uploads and updates!
I recommend trying UD-Q3_K_XL. Regarding the storage, just using an SSD is fine.
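On "simply whichever fits": a quick way to sanity-check a quant is to compare its on-disk size plus some headroom against total RAM. The sketch below uses only the figures quoted above (UD-Q4_K_XL at ~214GB, 256GB of unified memory). The headroom values and the function names are assumptions for illustration, not measured numbers or anything from the Unsloth docs; the real overhead depends on context length, KV cache size, and whatever else is running on the machine.

```python
def free_after_load(model_gb: float, ram_gb: float) -> float:
    """Memory left over once the model weights are resident, in GB."""
    return ram_gb - model_gb


def fits_in_memory(model_gb: float, ram_gb: float, headroom_gb: float) -> bool:
    """True if the weights plus an assumed headroom fit in total RAM.

    headroom_gb is a hypothetical allowance for the OS, KV cache, and
    other apps; pick a value that matches your own workload.
    """
    return model_gb + headroom_gb <= ram_gb


RAM_GB = 256          # M3 Ultra unified memory (from the question)
UD_Q4_K_XL_GB = 214   # on-disk size quoted in the docs

# 256 - 214 = 42 GB left for everything else
print("Free after loading UD-Q4_K_XL:", free_after_load(UD_Q4_K_XL_GB, RAM_GB), "GB")

# Assumed headroom values, purely illustrative
for headroom in (32, 48):
    ok = fits_in_memory(UD_Q4_K_XL_GB, RAM_GB, headroom)
    print(f"headroom={headroom}GB -> fits: {ok}")
```

With these assumptions, the 4-bit file loads but leaves about 42GB for everything else, so a smaller quant such as UD-Q3_K_XL gives more room for long contexts. Plug in the actual file size of whichever quant you download to repeat the check.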