Instructions for using samscrack/Qwen3.6-Solidity-27B with libraries, inference providers, notebooks, and local apps.

Libraries

How to use samscrack/Qwen3.6-Solidity-27B with Transformers:

# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-generation", model="samscrack/Qwen3.6-Solidity-27B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
pipe(messages)

# Load model directly
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("samscrack/Qwen3.6-Solidity-27B")
model = AutoModelForCausalLM.from_pretrained("samscrack/Qwen3.6-Solidity-27B")
messages = [
    {"role": "user", "content": "Who are you?"},
]
inputs = tokenizer.apply_chat_template(
	messages,
	add_generation_prompt=True,
	tokenize=True,
	return_dict=True,
	return_tensors="pt",
).to(model.device)

outputs = model.generate(**inputs, max_new_tokens=40)
print(tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:]))

Local Apps

vLLM

How to use samscrack/Qwen3.6-Solidity-27B with vLLM:

Install from pip and serve model

# Install vLLM from pip:
pip install vllm
# Start the vLLM server:
vllm serve "samscrack/Qwen3.6-Solidity-27B"
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:8000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "samscrack/Qwen3.6-Solidity-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'
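
The same request can also be sent from Python with only the standard library. This is a minimal sketch, assuming the server started above is listening on `localhost:8000`; `build_chat_request` and `chat` are hypothetical helper names, not part of vLLM.

```python
import json
import urllib.request

MODEL_ID = "samscrack/Qwen3.6-Solidity-27B"


def build_chat_request(prompt, base_url="http://localhost:8000"):
    """Build a POST request for the OpenAI-compatible chat completions route."""
    payload = {
        "model": MODEL_ID,
        "messages": [{"role": "user", "content": prompt}],
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def chat(prompt, base_url="http://localhost:8000"):
    """Send the prompt and return the reply text (requires a running server)."""
    with urllib.request.urlopen(build_chat_request(prompt, base_url)) as resp:
        body = json.load(resp)
    # OpenAI-compatible servers return the reply under choices[0].message.content.
    return body["choices"][0]["message"]["content"]
```

Passing `base_url="http://localhost:30000"` would point the same helper at a server on port 30000 instead.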

Use Docker

docker model run hf.co/samscrack/Qwen3.6-Solidity-27B

SGLang

How to use samscrack/Qwen3.6-Solidity-27B with SGLang:

Install from pip and serve model

# Install SGLang from pip:
pip install sglang
# Start the SGLang server:
python3 -m sglang.launch_server \
    --model-path "samscrack/Qwen3.6-Solidity-27B" \
    --host 0.0.0.0 \
    --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "samscrack/Qwen3.6-Solidity-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Use Docker images

docker run --gpus all \
    --shm-size 32g \
    -p 30000:30000 \
    -v ~/.cache/huggingface:/root/.cache/huggingface \
    --env "HF_TOKEN=<secret>" \
    --ipc=host \
    lmsysorg/sglang:latest \
    python3 -m sglang.launch_server \
        --model-path "samscrack/Qwen3.6-Solidity-27B" \
        --host 0.0.0.0 \
        --port 30000
# Call the server using curl (OpenAI-compatible API):
curl -X POST "http://localhost:30000/v1/chat/completions" \
	-H "Content-Type: application/json" \
	--data '{
		"model": "samscrack/Qwen3.6-Solidity-27B",
		"messages": [
			{
				"role": "user",
				"content": "What is the capital of France?"
			}
		]
	}'

Docker Model Runner
How to use samscrack/Qwen3.6-Solidity-27B with Docker Model Runner:
```
docker model run hf.co/samscrack/Qwen3.6-Solidity-27B
```

Qwen3.6-Solidity-27B / README.md

samscrack

docs: add Solidity Eval 2026 pass@1 leaderboard at top — 46.5% beats Claude Opus 4.7 by +7.5pp

9dc92d9 verified 2 days ago


	---
	license: apache-2.0
	base_model: Qwen/Qwen3.6-27B
	language:
	- en
	library_name: transformers
	tags:
	- solidity
	- smart-contracts
	- code-generation
	- foundry
	- blockchain
	- ethereum
	- security-audit
	- rejection-fine-tuning
	- qwen
	datasets:
	- ASSERT-KTH/DISL
	- braindao/solidity-base-sft-v2
	- samscrack/solidity-audit-cot
	pipeline_tag: text-generation
	---

	# Qwen 3.6 Solidity (27B)

	A 5-stage Solidity-specialist fine-tune of `Qwen/Qwen3.6-27B`. It is trained to
	produce Foundry-compileable Solidity contracts and matching test suites from
	natural-language specs, and to reason about smart-contract security with long-CoT
	audit traces.

	This is the final merged checkpoint — all five stages (CPT → SFT instruction
	→ SFT audit/CoT → SFT Opus distillation → RFT) folded into a single bf16 model.
	Loadable directly with `AutoModelForCausalLM.from_pretrained(...)` — no adapters
	to apply.

	## Solidity Eval (2026) — pass@1 leaderboard

	Top of the pass@1 leaderboard on [`samscrack/solidity-eval-2026`](https://huggingface.co/datasets/samscrack/solidity-eval-2026) (`lite` split, 200 real Etherscan contracts):

	| Agent / model | pass@1 | Wall-clock |
	|---|---|---|
	| This model — Qwen 3.6 Solidity 27B | 46.5% (93/200) | ~27 min |
	| Claude Code 2.1.128 (Claude Opus 4.7) | 39.0% (78/200, 1 timeout) | ~34 min |

	`pass@1` here is SolBench's `echidna()` rule: a single agentic attempt is scored 1.0 only if Diffusc compiles the candidate and Echidna's differential fuzzing finds no behavioral divergence from the ground-truth body, with B3 canary and stub-residue guards applied. There is no resampling. Conditions are identical across rows: 16-way concurrency, `max_agent_turns=40`, `agent_temperature=0.6`, `fuzz_test_calls=50000`, `fuzz_seed=0xDEADBEEF`, same sandbox image, same host. This model was served locally via vLLM (TP=2, FP8, `qwen3_xml` tool parser) on 2× Blackwell GPUs through the in-process Hermes agent loop; Claude Code ran against the Anthropic API through the CLI agent backend.
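
The scoring rule described above is all-or-nothing per task, and the leaderboard figure is the mean over tasks. The sketch below only illustrates that logic; it is not SolBench code, and the `AttemptResult` fields, `score`, and `pass_at_1` are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class AttemptResult:
    """Outcome of one agentic attempt (field names are illustrative)."""
    compiled: bool           # the candidate compiled
    diverged: bool           # the fuzzer found a behavioral divergence
    guards_ok: bool = True   # canary and stub-residue guards passed


def score(attempt):
    """Return 1.0 only when every condition of the rule holds, else 0.0."""
    ok = attempt.compiled and not attempt.diverged and attempt.guards_ok
    return 1.0 if ok else 0.0


def pass_at_1(attempts):
    """Mean score over tasks, one attempt per task (no resampling)."""
    return sum(score(a) for a in attempts) / len(attempts)
```

With 93 passing attempts out of 200, `pass_at_1` returns 0.465, which matches the 46.5% reported in the table.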

	See the dataset card for the full reproduction recipe and harness-agnostic scoring instructions.


	## Pipeline

	| # | Stage | Method | Adapter | Training data |
	|---|---|---|---|---|
	| 0 | Continued pretrain | LoRA r=64, ~500M Solidity tokens | folded in | `ASSERT-KTH/DISL` (514k deployed contracts, CC-BY 4.0) + ~80 curated blue-chip GitHub repos |
	| 1B | Instruction SFT | LoRA r=64, 178 steps | folded in | `final.jsonl` (~315k rows: braindao/solidity-base-sft-v2 + andstor/smart_contract_code_comments + lohoz/Smart-Contract-MultiTask + slither-audited + Pyano-fun) + 4,240 unverified `foundry_tests.jsonl` rows |
	| 2 | Audit / long-CoT | LoRA r=16, 2 epochs | folded in | `samscrack/solidity-audit-cot` (~6,140 Opus 4.7 long-form audit traces, all `confidence=high`, ≤30k chars to fit 8K ctx) |
	| 3 | Opus distillation SFT | LoRA r=16, 2 epochs, lr=5e-5 | folded in | 4,000 of 4,919 forge-verified Opus pairs (`foundry_tests.verified.jsonl`); 919 held out from training |
	| 4 | Rejection fine-tuning (RFT) | LoRA r=16, 2 epochs, lr=5e-5 | folded in (this checkpoint) | 926 model-generated contract+test pairs that passed `forge build && forge test` self-oracle, with non-triviality gate (≥3 test fns, ≥2 distinct asserts) |

	Stages 0/1B/2 were the original recipe (specification + Opus-CoT distillation).
	Stages 3/4 are the addition: directly distill the highest-quality forge-verified
	Opus pairs (Stage 3), then rejection-sample the model's own forge-passing outputs
	to anchor self-consistent generation (Stage 4).
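
The Stage 4 non-triviality gate (≥3 test functions, ≥2 distinct asserts) could be approximated with a purely textual check like the one below. This is a sketch, not the filter actually used for training: the regexes, the name `passes_nontriviality_gate`, and the reading of "distinct asserts" as distinct source lines containing an assertion call are all assumptions.

```python
import re

# Foundry treats functions whose names start with "test" as test cases.
TEST_FN = re.compile(r"\bfunction\s+(test\w*)\s*\(")
# Matches forge-std helpers (assertEq, assertTrue, ...) and the built-in assert.
ASSERT_CALL = re.compile(r"\bassert\w*\s*\(")


def passes_nontriviality_gate(test_source, min_tests=3, min_asserts=2):
    """Check a Foundry test file against the stated Stage 4 thresholds."""
    test_fns = set(TEST_FN.findall(test_source))
    # Assumption: two asserts are "distinct" if their source lines differ.
    assert_lines = {
        line.strip()
        for line in test_source.splitlines()
        if ASSERT_CALL.search(line)
    }
    return len(test_fns) >= min_tests and len(assert_lines) >= min_asserts
```

A text-level check like this can be fooled by commented-out code; a stricter gate would parse the Solidity source instead.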

	## Eval — Stage 3 → Stage 4 (RFT) comparison

	200 prompts × N=4 candidates from a held-out slice (never trained on at any
	stage). Each model-generated `(contract, test_file)` pair is dropped into a
	fresh Foundry project and scored end-to-end with `forge build && forge test`:

	| Metric (200 prompts × N=4 candidates) | Post-Stage-3 | Post-Stage-4 | Δ |
	|---|---|---|---|
	| extract success | 80.5% | 86.4% | +5.9 pp |
	| compile success | 46.8% | 50.6% | +3.8 pp |
	| test pass | 19.2% | 21.4% | +2.2 pp |
	| prompts ≥1 pass | 45.0% | 54.0% | +9.0 pp |

	Stage 4 RFT lifted prompt-level yield by 9 percentage points (45% → 54%).
	Per-candidate compile rate rose more than 10× across the full pipeline (4.5%
	pre-Stage-3 → 50.6% post-Stage-4): the model now produces Foundry-compileable
	contracts with matching test suites for just over half of individual candidates.
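
The table mixes two aggregations of the same pass/fail outcomes: per-candidate rates pool all 200 × 4 candidates, while "prompts ≥1 pass" counts a prompt once if any of its candidates passes. The sketch below shows both, assuming results are stored as one list of booleans per prompt; the function names are hypothetical and not taken from the evaluation harness.

```python
def candidate_rate(results):
    """Fraction of all candidates that passed, pooled over prompts."""
    flat = [ok for per_prompt in results for ok in per_prompt]
    return sum(flat) / len(flat)


def prompt_yield(results):
    """Fraction of prompts with at least one passing candidate."""
    return sum(any(per_prompt) for per_prompt in results) / len(results)


def delta_pp(before, after):
    """Difference between two rates, in percentage points."""
    return round((after - before) * 100, 1)
```

For example, `delta_pp(0.45, 0.54)` gives the 9.0 pp prompt-level gain and `delta_pp(0.192, 0.214)` the 2.2 pp test-pass gain reported above.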

	## What this model is good at

	- Producing self-consistent Foundry-compileable contract + test pairs from a NL spec.
	Self-oracle test pass rate is 21.4% per candidate, and 54% of prompts have at least one passing candidate out of 4.
	- Long-CoT audit reasoning. Stage 2 was trained on ~6k Opus 4.7 audit traces with
	reasoning steps + structured findings (severity / category / location / impact / fix).
	- Solidity-idiomatic generation. Stage 0 CPT shifts the base distribution toward
	modern Solidity patterns (`mapping`, `msg.sender`, `pragma`, custom errors, etc.).

	## Limitations

	- Synthetic-data lineage. Stage 1B includes braindao/solidity-base-sft-v2
	whose teacher model is undisclosed (likely commodity GPT, not GPT-4-class).
	Quality ceiling is bounded by the teacher.
	- Audit-corpus legality. Stage 2 corpus (`samscrack/solidity-audit-cot`) is
	Opus-generated under Anthropic API terms over braindao seed contracts. Legal
	review recommended before any commercial use of the audit-finding outputs.
	- Held-out eval. This model has never seen `samscrack/solidity-eval-2026`
	(SolBench RACR-4k + differential fuzz) at any stage — that's the gold benchmark.

	## Usage

	```python
	from transformers import AutoModelForCausalLM, AutoTokenizer
	import torch

	model = AutoModelForCausalLM.from_pretrained(
	    "samscrack/Qwen3.6-Solidity-27B",
	    torch_dtype=torch.bfloat16,
	    device_map="auto",
	    trust_remote_code=True,
	)
	tok = AutoTokenizer.from_pretrained("samscrack/Qwen3.6-Solidity-27B")

	# Spec → contract + tests
	spec = (
	    "Implement a Solidity contract that holds a mapping from address to uint256 "
	    "balance. Owner can mint to any address. Anyone can transfer their balance to "
	    "another address. Include a Foundry test suite covering happy paths and the "
	    "owner-only invariant.\n\nProduce both the Solidity contract and a Foundry "
	    "test suite that exercises it."
	)
	msgs = [{"role": "user", "content": spec}]
	inputs = tok.apply_chat_template(
	    msgs, tokenize=False, add_generation_prompt=True, enable_thinking=True,
	)
	toks = tok(inputs, return_tensors="pt").to(model.device)
	out = model.generate(
	    **toks, max_new_tokens=4096, temperature=0.7, top_p=0.9, do_sample=True,
	)
	print(tok.decode(out[0][toks.input_ids.shape[-1]:], skip_special_tokens=True))
	```

	The generated assistant turn has the shape:
	````
	<think>...short design rationale...</think>
	```solidity
	// SPDX-License-Identifier: MIT
	pragma solidity ^0.8.x;
	contract MyContract { ... }
	```

	```solidity
	// test/Contract.t.sol
	import "forge-std/Test.sol";
	import "../src/Contract.sol";
	contract MyContractTest is Test { ... }
	```
	````

	## Format envelope

	The model was trained on the canonical `<think>...</think>\n```solidity\n{contract}\n```\n\n```solidity\n// test/Contract.t.sol\n{tests}\n``` ` envelope. For the most
	reliable reproduction, end the user prompt with: *"Produce both the Solidity
	contract and a Foundry test suite that exercises it."*
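
Because the envelope is regular, a response can be split into its parts with two regular expressions. The following is a minimal sketch under the assumption that the output follows the envelope exactly; `parse_envelope` is a hypothetical helper, and whether the `<think>` tags survive decoding depends on the tokenizer settings, so the rationale is treated as optional.

```python
import re

# Built programmatically so this example can itself sit inside a fenced block.
FENCE = "`" * 3

THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)
SOLIDITY_RE = re.compile(FENCE + r"solidity\n(.*?)\n" + FENCE, re.DOTALL)


def parse_envelope(text):
    """Split a response into (rationale, contract, tests); None if malformed."""
    blocks = SOLIDITY_RE.findall(text)
    if len(blocks) < 2:
        # The envelope requires a contract block followed by a test block.
        return None
    think = THINK_RE.search(text)
    rationale = think.group(1).strip() if think else ""
    return rationale, blocks[0], blocks[1]
```

The first Solidity block is taken as the contract and the second as the test file, mirroring the order in the envelope.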

	## Training infrastructure

	- 2× NVIDIA RTX PRO 6000 Blackwell Workstation (96 GB each)
	- Trainer: TRL 0.22 + Unsloth 2026.4.7 + PyTorch 2.8.0 + cu128
	- Inference (sampling for Stage 4 RFT): vLLM 0.19.1 with FP8 dynamic quant +
	FLASH_ATTN backend + Qwen3 reasoning parser

	## Citation

	```
	@misc{qwen3.6-solidity-27b,
	author = {Sam Crack (samscrack)},
	title = {Qwen 3.6 Solidity (27B): a 5-stage CPT/SFT/RFT recipe for
	Foundry-compileable Solidity codegen},
	year = {2026},
	publisher = {HuggingFace},
	url = {https://huggingface.co/samscrack/Qwen3.6-Solidity-27B}
	}
	```

	## License

	Apache-2.0 (this checkpoint). Underlying training data is from CC-BY/MIT-tier
	sources; teacher reasoning content (Stage 2 + Stage 3) was generated under
	Anthropic API terms of use as of generation date (2026-05-04). Eval set
	`samscrack/solidity-eval-2026` is NOT used at any training stage.