Qwen3.5-27B Opus v2 YaRN 1M - Extended Context Reasoning Model

Base Model: Qwen3.5-27B BF16 with YaRN extension
Architecture: Qwen3.5 MoE
Context: 1M tokens (1,048,576)
Purpose: Long document analysis, book-length reasoning, extended chain-of-thought

YaRN Configuration

This model uses YaRN (Yet another RoPE extensioN) to scale RoPE from the native 262K context window to 1M tokens:

--ctx-size 1048576 --rope-scaling yarn --yarn-orig-ctx 262144
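The relationship between these flags can be sketched in a few lines: YaRN stretches the RoPE position frequencies by the ratio of the target context to the model's original training context. This is an illustrative sketch of that arithmetic, not llama.cpp's actual implementation.

```python
# Hedged sketch: the YaRN scaling factor is the ratio of the
# extended context (--ctx-size) to the pre-training context
# (--yarn-orig-ctx). llama.cpp derives this internally from the flags.
orig_ctx = 262_144      # --yarn-orig-ctx
target_ctx = 1_048_576  # --ctx-size
scale = target_ctx / orig_ctx
print(scale)  # 4.0
```

A 4x stretch is why the original-context flag matters: passing the wrong `--yarn-orig-ctx` would compute the wrong interpolation factor and degrade long-context quality.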

Recommended Settings

Thinking Mode (Default)

--temp 0.6 --top-p 0.95 --top-k 20 --min-p 0 --reasoning on

Non-Thinking Mode

--temp 0.7 --top-p 0.8 --top-k 20 --min-p 0

Quick Start

llama-server -m Qwen3.5-27B-Opus-v2-YaRN-1M-Q4_K_M.gguf \
  --ctx-size 1048576 --rope-scaling yarn --yarn-orig-ctx 262144 \
  --temp 0.6 --top-p 0.95 --top-k 20 --reasoning on
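Once the server is up, it exposes an OpenAI-compatible `/v1/chat/completions` endpoint. The sketch below builds a request body using the recommended thinking-mode sampling settings from this card; the prompt text is a placeholder, and sending it (e.g. with `requests` or `curl` against `http://localhost:8080`) is left to the reader.

```python
import json

# Illustrative request body for llama-server's OpenAI-compatible
# chat endpoint, using the thinking-mode settings recommended above.
payload = {
    "messages": [
        {"role": "user", "content": "Summarize the attached document."}
    ],
    "temperature": 0.6,
    "top_p": 0.95,
    "top_k": 20,
    "min_p": 0,
}
body = json.dumps(payload)
```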

Note

For standard 262K context, use the regular Opus v2 repo (smaller, faster).

Model Details

Format: GGUF
Model size: 25B params
Architecture: qwen35

Available quantizations: 4-bit, 5-bit, 16-bit
