Qwopus3.6-27B-solidity-cpt-stageA

⚠️ Intermediate checkpoint (Stage 0 of 5). This is a continued-pretraining (CPT) LoRA adapter, the first stage of a multi-stage Solidity specialization pipeline. Not intended for direct production use. It learns Solidity-language priors but has not yet been instruction-tuned. Use the downstream stage outputs for actual deployment.

A LoRA adapter on top of Qwopus3.6-27B-stage1-merged (a 27 B Qwen3.6 fine-tune) that biases the model's distribution toward modern Solidity smart contracts.

Pipeline context

Stage                                  | Adapter                  | Status                                                  | Output
Stage 0 – CPT (this card)              | plain LoRA (r=64, α=64)  | ✅ complete                                             | LoRA adapter; biases the LM toward Solidity priors
Stage 1 – Instruction SFT              | plain LoRA (r=64, α=64)  | ✅ complete → Qwopus3.6-27B-solidity-sft-stage1B        | spec → contract + Foundry tests capability
Stage 2 – Audit / reasoning SFT        | plain LoRA (r=16, α=16)  | ✅ complete → Qwopus3.6-27B-solidity-audit-stage2       | vulnerability detection + Long-CoT audit reasoning, trained on solidity-audit-cot (6,140 Opus 4.7 traces fit at 8K ctx)
Stage 3 – RFT (rejection-sampling FT)  | plain LoRA (r=64, α=64)  | ⬜ pending                                              | functional-correctness boost via Foundry test execution
Stage 4 – GSPO (RL)                    | plain LoRA (r=64, α=64)  | ⬜ optional                                             | last-mile pass-rate improvement

ℹ️ None of the stages use DoRA. DoRA was empirically benchmarked at Stage 0 and dropped: use_dora=True measured ~280 s/step on this stack vs ~180 s/step for plain LoRA (plain LoRA cuts step time by ≈36 %) for an expected quality gain of under one point at this scale. The legacy directory name qwen_dora_cpt/ and some internal references are historical artefacts only; the saved weights are plain LoRA. DoRA may be revisited at Stage 4 (GSPO), where step time matters less.

This adapter is the foundation; subsequent stages stack on top after merge.

Model details

  • Base model: Qwopus3.6-27B-stage1-merged (51 GB bf16, derived from Qwen/Qwen3.6-27B)
  • Adapter type: plain LoRA (PEFT), not DoRA (use_dora=False). See the DoRA decision in Limitations for the benchmark that drove this choice.
  • LoRA configuration (a config sketch follows this list):
    • rank: 64
    • alpha: 64
    • target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj, out_proj
    • dropout: 0.0
    • bias: none
  • Trainable parameters: 353,370,112 (~2.34% of effective 15B-equivalent param count)
  • Quantization: base loaded in 4-bit (BnB NF4) for QLoRA-style training; adapter weights are bf16
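
For orientation, the settings above correspond to a standard PEFT LoraConfig roughly as sketched below. This is illustrative only: the actual run used Unsloth's wrapper around the same values, and base_model is a stand-in for the 4-bit-loaded base.

from peft import LoraConfig, get_peft_model

# Stage 0 adapter settings expressed as a vanilla PEFT config
# (the run itself used Unsloth's wrapper around the same values).
lora_cfg = LoraConfig(
    r=64,
    lora_alpha=64,
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "out_proj",
    ],
    lora_dropout=0.0,
    bias="none",
    use_dora=False,          # plain LoRA; see the DoRA note above
    task_type="CAUSAL_LM",
)

# Applied to the (already 4-bit loaded) base model:
# model = get_peft_model(base_model, lora_cfg)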

Training data

A composite Solidity corpus, quality-filtered from public sources:

Source | License | Approx. tokens (post-filter)
ASSERT-KTH/DISL (decomposed split) | CC BY 4.0 | ~76M (top 10 % by composite quality score)
GitHub blue-chip protocols (~30 repos): OpenZeppelin, Uniswap v2/v3/v4, Aave v3, Compound, Morpho, EigenLayer, Pendle, Seaport, Solady, LayerZero, ENS, Optimism, Arbitrum, Polygon L1, etc. | mostly MIT/Apache | included in the 76M

Quality filter (composite 0–115 score)

Heuristics applied to score every raw row before tokenization:

Signal                                    | Points
Pragma 0.8.20+ (modern compiler)          | +30
Pragma 0.8.13–19                          | +20
Pragma 0.8.x (any)                        | +10
Has SPDX license header                   | +10
Has @notice / @dev NatSpec                | +10
Has @param / @return NatSpec              | +10
Comment density 5–30 %                    | +10
Uses custom errors (error X();)           | +5
Uses unchecked blocks                     | +5
No SafeMath import (avoids old patterns)  | +5
Size 500–8000 chars                       | +10
Source: GitHub blue-chip                  | +20

Only rows scoring ≥ 55 were kept (top 10 %, ~23,487 unique source files / ~89.6M tokens). EIP-1167 minimal proxies and contracts whose relative imports could not be resolved were dropped before scoring.
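
A condensed Python sketch of the composite score is shown below. The exact regexes and the comment-density definition used in the real pipeline are not published, so the patterns here are approximations for illustration.

import re

def quality_score(src: str, from_bluechip: bool = False) -> int:
    """Approximate reconstruction of the 0-115 composite quality score."""
    score = 0
    # Pragma tiers are mutually exclusive; newer compilers score higher.
    m = re.search(r"pragma solidity\s*[\^>=]*\s*0\.8\.(\d+)", src)
    if m:
        patch = int(m.group(1))
        score += 30 if patch >= 20 else 20 if patch >= 13 else 10
    score += 10 if "SPDX-License-Identifier" in src else 0
    score += 10 if ("@notice" in src or "@dev" in src) else 0
    score += 10 if ("@param" in src or "@return" in src) else 0
    # Comment density: share of characters on comment-looking lines (approximation).
    comment_chars = sum(len(line) for line in src.splitlines()
                        if line.lstrip().startswith(("//", "/*", "*")))
    if 0.05 <= comment_chars / max(len(src), 1) <= 0.30:
        score += 10
    score += 5 if re.search(r"\berror\s+\w+\s*\(", src) else 0   # custom errors
    score += 5 if "unchecked" in src else 0
    score += 5 if "SafeMath" not in src else 0                   # avoids old patterns
    score += 10 if 500 <= len(src) <= 8000 else 0
    score += 20 if from_bluechip else 0
    return score

# Rows with quality_score(src) >= 55 survive the filter.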

Pre-processing

  • Cross-source SHA256 deduplication (whitespace-collapsed)
  • Tokenized with the Qwen3.6 tokenizer (vocab 152 k)
  • Megatron-style packing into fixed 8192-token sequences with EOS separators (sketched after this list)
  • 10,938 packed sequences total (~89.6M effective tokens)
  • 1,500 sequences (~12M tokens) used for this Stage A run
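
The dedup-and-pack step might look roughly like the sketch below. The tokenizer repo name and how the ragged tail is handled are assumptions; only the whitespace-collapsed SHA256 dedup, the EOS separators, and the fixed 8192-token blocks come from the list above.

import hashlib
from transformers import AutoTokenizer

SEQ_LEN = 8192
tok = AutoTokenizer.from_pretrained("Qwen/Qwen3.6-27B")  # assumed tokenizer source

def dedup(sources):
    """Cross-source exact dedup on whitespace-collapsed SHA256."""
    seen, unique = set(), []
    for src in sources:
        key = hashlib.sha256(" ".join(src.split()).encode()).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(src)
    return unique

def pack(sources):
    """Megatron-style packing: EOS-separated token stream sliced into fixed 8192-token blocks."""
    stream = []
    for src in dedup(sources):
        stream += tok(src, add_special_tokens=False)["input_ids"] + [tok.eos_token_id]
    usable = (len(stream) // SEQ_LEN) * SEQ_LEN          # drop the ragged tail (assumption)
    return [stream[i:i + SEQ_LEN] for i in range(0, usable, SEQ_LEN)]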

Training procedure

  • Framework: Unsloth 2026.4.7 with UnslothTrainer (decoupled embedding LR support, 4-bit QLoRA kernels)
  • Hardware: 2Γ— NVIDIA RTX PRO 6000 Blackwell Workstation Edition (96 GB each)
  • Distributed: DDP via torchrun --nproc-per-node=2 (find_unused_parameters=True)
  • Precision: bf16 forward + adapter, NF4 base
  • Optimizer: 8-bit AdamW (adamw_8bit)
  • Gradient checkpointing: Unsloth custom

Hyperparameters

Hyperparameter                  | Value
Loss                            | plain causal LM (next-token prediction across all tokens)
max_seq_length                  | 8192
Packing                         | dense, EOS-separated, Megatron-style
per_device_train_batch_size     | 2
gradient_accumulation_steps     | 9
Effective batch                 | 36 sequences ≈ 295k tokens per optimizer step
Epochs                          | 1
Total training steps            | ~42 (Stage A)
Learning rate (adapters)        | 5 × 10⁻⁵
LR scheduler                    | cosine
Warmup ratio                    | 0.03
Weight decay                    | 0.01
Seed                            | 3407
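
Wired into Unsloth, these hyperparameters would look roughly like the sketch below, following the pattern of Unsloth's continued-pretraining examples. model, tok, and packed_ds (the 1,500 pre-packed sequences exposed under a "text" field) are assumed to be prepared already, and the script name is a placeholder.

# Launched with: torchrun --nproc-per-node=2 train_cpt_stageA.py   (DDP over both GPUs; script name is a placeholder)
from unsloth import UnslothTrainer, UnslothTrainingArguments

trainer = UnslothTrainer(
    model=model,                        # 4-bit base with the r=64 LoRA attached
    tokenizer=tok,
    train_dataset=packed_ds,            # 1,500 pre-packed 8192-token sequences
    dataset_text_field="text",
    max_seq_length=8192,
    args=UnslothTrainingArguments(
        per_device_train_batch_size=2,
        gradient_accumulation_steps=9,  # 2 seqs x 2 GPUs x 9 = 36-sequence effective batch
        num_train_epochs=1,
        learning_rate=5e-5,
        lr_scheduler_type="cosine",
        warmup_ratio=0.03,
        weight_decay=0.01,
        optim="adamw_8bit",
        bf16=True,
        seed=3407,
        logging_steps=1,                # every step logged, as in the loss curve below
        output_dir="qwen_dora_cpt",     # legacy directory name; weights are plain LoRA
    ),
)
trainer.train()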

Training results

Loss curve (every step logged):

step  | loss   | grad_norm | LR (e-5)
   1  | 0.5805 |   0.80    |  0.00 (warmup)
   2  | 0.6304 |   0.95    |  2.50
   3  | 0.6687 |   1.09    |  5.00 (peak)
   4  | 0.578  |   0.76    |  4.99
   5  | 0.4751 |   0.21    |  4.97
   6  | 0.5706 |   0.27    |  4.93
   7  | 0.4578 |   0.14    |  4.88
   8  | 0.4479 |   0.24    |  4.81
   9  | 0.4039 |   0.13    |  4.73
  10  | 0.3847 |   0.09    |  4.63  ← first checkpoint saved
  ...

Loss dropped from a peak of 0.6687 at step 3 (the end of LR warmup) to a sustained ~0.36–0.41 plateau by step 10, indicating successful adaptation to the Solidity distribution. The full curve is available in the wandb run.

Intended use

This adapter alone is NOT a deployment-ready model. It biases token-level prediction toward Solidity but is not yet instruction-tuned. Direct use of just this CPT adapter will produce continuation-style Solidity text but won't reliably follow specs or produce complete contracts on demand.

The intended downstream usage is:

  1. Merge this adapter onto the base model (a merge sketch follows at the end of this section)
  2. Run Stage 1 SFT (instruction tuning on spec → contract pairs)
  3. Run Stage 2 SFT (audit / reasoning)
  4. Optionally run Stage 3 RFT and Stage 4 GSPO with executor-based reward (Foundry test-pass rate)
  5. Evaluate on SolBench / HumanEval-Solidity / SB-Heist

Beyond that pipeline, this checkpoint is suitable for research, ablation studies, or as a starting point for your own Solidity fine-tune.
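
Step 1 (merging this CPT adapter into the base before Stage 1 SFT) can be done with PEFT's merge_and_unload. A minimal sketch follows; loading in bf16 rather than 4-bit for the merge, and the output path, are assumptions.

from unsloth import FastLanguageModel
from peft import PeftModel

base, tok = FastLanguageModel.from_pretrained(
    "samscrack/Qwopus3.6-27B-stage1-merged",
    max_seq_length=8192,
    load_in_4bit=False,                 # merge in full precision, not NF4
)
merged = PeftModel.from_pretrained(base, "samscrack/Qwopus3.6-27B-solidity-cpt-stageA")
merged = merged.merge_and_unload()      # folds the LoRA deltas into the base weights
merged.save_pretrained("Qwopus3.6-27B-solidity-cpt-merged")
tok.save_pretrained("Qwopus3.6-27B-solidity-cpt-merged")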

Limitations and caveats

  • No benchmark scores yet. Evaluation runs after the full pipeline completes.
  • Solidity 0.8.x bias. The quality filter excluded pre-0.8 code and patch versions above 0.8.26, so the model will be weaker on Solidity 0.4–0.7 idioms.
  • Embeddings + lm_head not adapted. To keep step time tractable on dual-Blackwell, the LoRA target_modules omit embed_tokens and lm_head (each ~780 M params on 27B). This means the model's output token distribution shifts less than a full-CPT would; it gains internal Solidity priors but doesn't relearn token-level emission probabilities.
  • DoRA decision (Stage 0): DoRA (use_dora=True) was benchmarked head-to-head and dropped in favour of plain LoRA. DoRA measured ~280 s/step on this stack vs ~180 s/step for plain LoRA (a ≈36 % step-time saving for plain LoRA) for an expected quality gain of <1 pt at the LoRA rank we use (r=64). Stages 0–3 all use plain LoRA. DoRA may be re-evaluated at Stage 4 (GSPO RL), where wall-clock cost is amortised over fewer optimiser steps.
  • License compatibility. The base inherits Apache-2.0 from Qwen. Training data: DISL is CC BY 4.0; the GitHub blue-chips are mostly MIT/Apache. Some Etherscan-derived source has an unspecified license; restrict use to research/personal contexts unless you have done your own license review.

How to use the adapter

from unsloth import FastLanguageModel  # import unsloth before peft/transformers so its patches apply
from peft import PeftModel

model, tok = FastLanguageModel.from_pretrained(
    "samscrack/Qwopus3.6-27B-stage1-merged",   # or your local merged base
    max_seq_length=8192,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(
    model,
    "samscrack/Qwopus3.6-27B-solidity-cpt-stageA",
)
FastLanguageModel.for_inference(model)

# Note: this is the CPT adapter; it produces Solidity-flavoured text,
# but for instruction-following you want the Stage 1 SFT adapter.
inputs = tok(
    "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.20;\n\n",
    return_tensors="pt"
).to("cuda")
out = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(out[0]))

Citation

@misc{qwopus_solidity_cpt_stageA_2026,
  title  = {{Qwopus3.6-27B-solidity-cpt-stageA}: a continued-pretraining LoRA adapter for Solidity smart-contract LLM specialization},
  author = {samscrack},
  year   = {2026},
  url    = {https://huggingface.co/samscrack/Qwopus3.6-27B-solidity-cpt-stageA},
  note   = {Stage 0 of a 5-stage Solidity specialization pipeline.}
}

Acknowledgments

  • Unsloth for the LoRA / QLoRA training kernels and UnslothTrainer
  • ASSERT-KTH for the DISL dataset
  • Jackrong-llm-finetuning-guide for the existing 27B fine-tune recipe this work builds on
  • The OpenZeppelin, Uniswap, Aave, Compound, Morpho, EigenLayer, Pendle, Solady, Seaport, LayerZero, ENS, Optimism, Arbitrum, and Polygon teams whose audited public source forms the GitHub blue-chip slice

Changelog

  • 2026-05-03: Initial Stage A adapter (1,500 packed sequences, 1 epoch, plain LoRA r=64). Loss curve shows healthy convergence from a warmup peak of 0.67 to a 0.36–0.41 plateau by step 10.