Qwen3.5-27B Opus v2 YaRN 1M - Extended Context Reasoning Model

Base Model: Qwen3.5-27B BF16 with YaRN extension
Architecture: Qwen3.5 MoE
Context: 1M tokens (1,048,576)
Purpose: Long document analysis, book-length reasoning, extended chain-of-thought

YaRN Configuration

This model uses YaRN (Yet another RoPE extensioN) to scale RoPE from the native 262K context window to 1M tokens:

--ctx-size 1048576 --rope-scaling yarn --yarn-orig-ctx 262144
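The relationship between these flags can be sketched in a few lines: YaRN stretches the RoPE position frequencies by the ratio of the target context to the model's original training context. This is an illustrative sketch of that arithmetic, not llama.cpp's actual implementation.

```python
# Hedged sketch: the YaRN scaling factor is the ratio of the
# extended context (--ctx-size) to the pre-training context
# (--yarn-orig-ctx). llama.cpp derives this internally from the flags.
orig_ctx = 262_144      # --yarn-orig-ctx
target_ctx = 1_048_576  # --ctx-size
scale = target_ctx / orig_ctx
print(scale)  # 4.0
```

A 4x stretch is why the original-context flag matters: passing the wrong `--yarn-orig-ctx` would compute the wrong interpolation factor and degrade long-context quality.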

Recommended Settings

Thinking Mode (Default)

--temp 0.6 --top-p 0.95 --top-k 20 --min-p 0 --reasoning on

Non-Thinking Mode

--temp 0.7 --top-p 0.8 --top-k 20 --min-p 0

Quick Start

llama-server -m Qwen3.5-27B-Opus-v2-YaRN-1M-Q4_K_M.gguf \
  --ctx-size 1048576 --rope-scaling yarn --yarn-orig-ctx 262144 \
  --temp 0.6 --top-p 0.95 --top-k 20 --reasoning on
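Once the server is up, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint. The sketch below builds a request body using the recommended thinking-mode sampling settings from this card; the prompt text is a placeholder, and sending it (e.g. with `requests` or `curl` against `http://localhost:8080`) is left to the reader.

```python
import json

# Illustrative request body for llama-server's OpenAI-compatible
# chat endpoint, using the thinking-mode settings recommended above.
payload = {
    "messages": [
        {"role": "user", "content": "Summarize the attached document."}
    ],
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "min_p": 0,
}
body = json.dumps(payload)
```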

Note

For standard 262K context, use the regular Opus v2 repo (smaller, faster).

Model Details

Format: GGUF
Model size: 25B params
Architecture: qwen35

Available quantizations: 4-bit, 5-bit, 16-bit
