# Qwopus3.6-27B-solidity-cpt-stageA
⚠️ Intermediate checkpoint – Stage 0 of 5. This is a continued-pretraining (CPT) LoRA adapter, the first stage of a multi-stage Solidity specialization pipeline. Not intended for direct production use: it learns Solidity-language priors but has not yet been instruction-tuned. Use the downstream stage outputs for actual deployment.
A LoRA adapter on top of Qwopus3.6-27B-stage1-merged (a 27 B Qwen3.6 fine-tune) that biases the model's distribution toward modern Solidity smart contracts.
## Pipeline context
| Stage | Adapter | Status | Output |
|---|---|---|---|
| Stage 0 – CPT (this card) | plain LoRA (r=64, α=64) | ✅ complete | LoRA adapter; biases the LM toward Solidity priors |
| Stage 1 – Instruction SFT | plain LoRA (r=64, α=64) | ✅ complete – `Qwopus3.6-27B-solidity-sft-stage1B` | spec → contract + Foundry tests capability |
| Stage 2 – Audit / reasoning SFT | plain LoRA (r=16, α=16) | ✅ complete – `Qwopus3.6-27B-solidity-audit-stage2` | vulnerability detection + long-CoT audit reasoning, trained on solidity-audit-cot (6,140 Opus 4.7 traces that fit at 8K ctx) |
| Stage 3 – RFT (rejection-sampling FT) | plain LoRA (r=64, α=64) | ⬜ pending | functional-correctness boost via Foundry test execution |
| Stage 4 – GSPO (RL) | plain LoRA (r=64, α=64) | ⬜ optional | last-mile pass-rate improvement |
ℹ️ None of the stages use DoRA. DoRA was empirically benchmarked at Stage 0 and dropped: `use_dora=True` measured ~280 s/step on this stack vs ~180 s/step for plain LoRA (≈55 % slower) for a sub-1-pt expected quality gain at this scale. The legacy directory name `qwen_dora_cpt/` and some internal references are historical artefacts only; the saved weights are plain LoRA. DoRA may be revisited at Stage 4 (GSPO), where step time matters less.
This adapter is the foundation; subsequent stages stack on top after merge.
## Model details
- Base model: `Qwopus3.6-27B-stage1-merged` (51 GB bf16, derived from `Qwen/Qwen3.6-27B`)
- Adapter type: plain LoRA (PEFT), not DoRA (`use_dora=False`). See the DoRA decision in Limitations for the benchmark that drove this choice.
- LoRA configuration (a config sketch follows this list):
  - rank: 64
  - alpha: 64
  - target modules: `q_proj`, `k_proj`, `v_proj`, `o_proj`, `gate_proj`, `up_proj`, `down_proj`, `out_proj`
  - dropout: 0.0
  - bias: `none`
- Trainable parameters: 353,370,112 (~2.34 % of the effective 15B-equivalent param count)
- Quantization: base loaded in 4-bit (BnB NF4) for QLoRA-style training; adapter weights are bf16
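For reference, a minimal PEFT sketch of the adapter setup described above. This is not the exact training code (the run used Unsloth's loaders); it simply mirrors the values listed on this card, and anything not stated here is an assumption:

```python
import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit NF4 base load (QLoRA-style); adapter weights stay in bf16.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
base = AutoModelForCausalLM.from_pretrained(
    "samscrack/Qwopus3.6-27B-stage1-merged",
    quantization_config=bnb_config,
    device_map="auto",
)
base = prepare_model_for_kbit_training(base)

# Plain LoRA matching the values above; embed_tokens / lm_head are
# deliberately not targeted (see Limitations).
lora_config = LoraConfig(
    r=64,
    lora_alpha=64,
    lora_dropout=0.0,
    bias="none",
    use_dora=False,  # DoRA was benchmarked and dropped (see the pipeline note)
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj", "out_proj",
    ],
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # should report roughly 353M trainable params
```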
## Training data
A composite Solidity corpus, quality-filtered from public sources:
| Source | License | Approx. tokens (post-filter) |
|---|---|---|
| ASSERT-KTH/DISL (decomposed split) | CC BY 4.0 | ~76M (top-10% by composite quality score) |
| GitHub blue-chip protocols (~30 repos) – OpenZeppelin, Uniswap v2/v3/v4, Aave v3, Compound, Morpho, EigenLayer, Pendle, Seaport, Solady, LayerZero, ENS, Optimism, Arbitrum, Polygon L1, etc. | mostly MIT/Apache | included in 76M |
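The DISL slice can be pulled straight from the Hub before filtering; a minimal sketch (the config name mirrors the "decomposed" split mentioned above but is an assumption, so check it against the dataset card):

```python
from datasets import load_dataset

# Config/split names are assumptions; verify against the ASSERT-KTH/DISL dataset card.
disl = load_dataset("ASSERT-KTH/DISL", "decomposed", split="train")
print(disl.num_rows, disl.column_names)
```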
### Quality filter (composite 0–115 score)
Heuristics applied to score every raw row before tokenization:
| Signal | Points |
|---|---|
| Pragma 0.8.20+ (modern compiler) | +30 |
| Pragma 0.8.13–19 | +20 |
| Pragma 0.8.x (any) | +10 |
| Has SPDX license header | +10 |
| Has @notice / @dev NatSpec | +10 |
| Has @param / @return NatSpec | +10 |
| Comment density 5–30 % | +10 |
| Uses custom errors (`error X();`) | +5 |
| Uses `unchecked` blocks | +5 |
| No SafeMath import (avoids old patterns) | +5 |
| Size 500–8000 chars | +10 |
| Source: GitHub blue-chip | +20 |
Only rows scoring ≥ 55 were kept (top 10 %, ~23,487 unique source files / ~89.6M tokens). EIP-1167 minimal proxies and contracts with relative imports we couldn't satisfy were dropped before scoring.
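A rough reimplementation of this scoring heuristic is sketched below. It is reconstructed from the table rather than taken from the actual filter script, so the exact regexes and the line-based comment-density measure are assumptions:

```python
import re

def quality_score(src: str, blue_chip: bool = False) -> int:
    """Approximate the composite 0-115 quality score from the table above."""
    score = 0
    # Pragma tier (only the highest matching tier counts)
    m = re.search(r"pragma solidity\s*[\^>=<]*\s*0\.8\.(\d+)", src)
    if m:
        patch = int(m.group(1))
        score += 30 if patch >= 20 else 20 if patch >= 13 else 10
    if "SPDX-License-Identifier" in src:
        score += 10
    if "@notice" in src or "@dev" in src:
        score += 10
    if "@param" in src or "@return" in src:
        score += 10
    # Comment density between 5% and 30% (line-based approximation)
    lines = src.splitlines() or [""]
    density = sum(l.lstrip().startswith(("//", "*", "/*")) for l in lines) / len(lines)
    if 0.05 <= density <= 0.30:
        score += 10
    if re.search(r"\berror\s+\w+\s*\(", src):   # custom errors
        score += 5
    if "unchecked" in src:                       # unchecked blocks
        score += 5
    if "SafeMath" not in src:                    # no legacy SafeMath import
        score += 5
    if 500 <= len(src) <= 8000:                  # size window in chars
        score += 10
    if blue_chip:                                # GitHub blue-chip source bonus
        score += 20
    return score
```

Keeping rows with a score of ≥ 55 from such a function would match the retention threshold described above.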
### Pre-processing
- Cross-source SHA-256 deduplication (whitespace-collapsed)
- Tokenized with the Qwen3.6 tokenizer (152k vocab)
- Megatron-style packing into fixed 8192-token sequences with EOS separators (a packing sketch follows this list)
- 10,938 packed sequences total (~89.6M effective tokens)
- 1,500 sequences (~12M tokens) used for this Stage A run
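A minimal sketch of the dedup-and-pack step described above, under the assumptions that `eos_id` comes from the Qwen3.6 tokenizer and that a trailing partial buffer is simply dropped (the real pipeline may handle it differently):

```python
import hashlib

def dedup_key(source: str) -> str:
    """Whitespace-collapsed SHA-256 key used for cross-source deduplication."""
    collapsed = " ".join(source.split())
    return hashlib.sha256(collapsed.encode("utf-8")).hexdigest()

def pack_sequences(token_lists, eos_id, seq_len=8192):
    """Densely pack tokenized files into fixed-length, EOS-separated sequences."""
    packed, buf = [], []
    for toks in token_lists:
        buf.extend(toks)
        buf.append(eos_id)           # EOS separator between source files
        while len(buf) >= seq_len:
            packed.append(buf[:seq_len])
            buf = buf[seq_len:]
    return packed                    # trailing partial buffer dropped in this sketch
```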
## Training procedure
- Framework: Unsloth 2026.4.7 with `UnslothTrainer` (decoupled embedding LR support, 4-bit QLoRA kernels)
- Hardware: 2× NVIDIA RTX PRO 6000 Blackwell Workstation Edition (96 GB each)
- Distributed: DDP via `torchrun --nproc-per-node=2` (`find_unused_parameters=True`)
- Precision: bf16 forward + adapter, NF4 base
- Optimizer: 8-bit AdamW (`adamw_8bit`)
- Gradient checkpointing: Unsloth custom
### Hyperparameters
| Hyperparameter | Value |
|---|---|
| Loss | Plain causal LM (next-token prediction across all tokens) |
| max_seq_length | 8192 |
| Packing | dense, EOS-separated, Megatron-style |
| per_device_train_batch_size | 2 |
| gradient_accumulation_steps | 9 |
| effective batch | 36 sequences ≈ 295k tokens per optimizer step (2 GPUs × 2 per device × 9 accumulation) |
| Epochs | 1 |
| Total training steps | ~42 (Stage A) |
| Learning rate (adapters) | 5 × 10⁻⁵ |
| LR scheduler | cosine |
| Warmup ratio | 0.03 |
| Weight decay | 0.01 |
| Seed | 3407 |
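For orientation, a hedged sketch of how these hyperparameters map onto an Unsloth trainer setup. The launch command, script name, `packed_ds`, and the output directory are assumptions rather than the exact training code; `model` refers to a PEFT-wrapped model as in the configuration sketch further up:

```python
# Assumed launch: torchrun --nproc-per-node=2 train_cpt.py   (script name hypothetical)
from unsloth import UnslothTrainer, UnslothTrainingArguments

args = UnslothTrainingArguments(
    output_dir="cpt_stage_a",            # hypothetical output directory
    per_device_train_batch_size=2,
    gradient_accumulation_steps=9,       # 2 GPUs x 2 per device x 9 = 36 sequences/step
    num_train_epochs=1,
    learning_rate=5e-5,                  # adapter LR
    lr_scheduler_type="cosine",
    warmup_ratio=0.03,
    weight_decay=0.01,
    optim="adamw_8bit",
    bf16=True,
    logging_steps=1,                     # every step logged (see the loss curve below)
    save_steps=10,                       # first checkpoint landed at step 10
    seed=3407,
)

trainer = UnslothTrainer(
    model=model,              # PEFT-wrapped 4-bit base + LoRA adapter
    train_dataset=packed_ds,  # hypothetical: the 1,500 pre-packed 8192-token sequences
    args=args,
)
trainer.train()
```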
## Training results
Loss curve (every step logged):
```
step | loss   | grad_norm | LR (e-5)
   1 | 0.5805 | 0.80      | 0.00 (warmup)
   2 | 0.6304 | 0.95      | 2.50
   3 | 0.6687 | 1.09      | 5.00 (peak)
   4 | 0.5780 | 0.76      | 4.99
   5 | 0.4751 | 0.21      | 4.97
   6 | 0.5706 | 0.27      | 4.93
   7 | 0.4578 | 0.14      | 4.88
   8 | 0.4479 | 0.24      | 4.81
   9 | 0.4039 | 0.13      | 4.73
  10 | 0.3847 | 0.09      | 4.63   ← first checkpoint saved
 ...
```
Training loss dropped from a warmup peak of 0.6687 (step 3) to a sustained ~0.36–0.41 plateau by step 10, indicating successful adaptation to the Solidity distribution. The full curve is available in the wandb run.
## Intended use
This adapter alone is NOT a deployment-ready model. It biases token-level prediction toward Solidity but is not yet instruction-tuned. Direct use of just this CPT adapter will produce continuation-style Solidity text but won't reliably follow specs or produce complete contracts on demand.
The intended downstream usage is:
- Merge this adapter onto the base model
- Run Stage 1 SFT (instruction tuning on spec → contract pairs)
- Run Stage 2 SFT (audit / reasoning)
- Optionally run Stage 3 RFT and Stage 4 GSPO with executor-based reward (Foundry test-pass rate)
- Evaluate on SolBench / HumanEval-Solidity / SB-Heist
Use this checkpoint for research, ablation, or as a starting point for your own Solidity fine-tune; a sketch of the merge step follows.
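The first step above (merging this adapter into the base) can be done with plain PEFT. A minimal sketch, assuming the base is reloaded in bf16 so the merge is not degraded by 4-bit quantization (the output directory is hypothetical):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

# Reload the base in bf16; merging into a 4-bit quantized base would lose precision.
base = AutoModelForCausalLM.from_pretrained(
    "samscrack/Qwopus3.6-27B-stage1-merged",
    torch_dtype=torch.bfloat16,
    device_map="cpu",
)
model = PeftModel.from_pretrained(base, "samscrack/Qwopus3.6-27B-solidity-cpt-stageA")

merged = model.merge_and_unload()            # fold the LoRA deltas into the base weights
merged.save_pretrained("qwopus-cpt-merged")  # hypothetical output dir, input to Stage 1 SFT
tok = AutoTokenizer.from_pretrained("samscrack/Qwopus3.6-27B-stage1-merged")
tok.save_pretrained("qwopus-cpt-merged")
```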
## Limitations and caveats
- No benchmark scores yet. Evaluation runs after the full pipeline completes.
- Solidity 0.8.x bias. The filter excluded pre-0.8 sources and patches above 0.8.26, so the model will be weaker on Solidity 0.4–0.7 idioms.
- Embeddings + lm_head not adapted. To keep step time tractable on dual-Blackwell, the LoRA `target_modules` omit `embed_tokens` and `lm_head` (each ~780M params on 27B). The output token distribution therefore shifts less than a full CPT would achieve: the model gains internal Solidity priors but doesn't relearn token-level emission probabilities.
- DoRA decision (Stage 0). DoRA (`use_dora=True`) was benchmarked head-to-head and dropped in favour of plain LoRA: DoRA measured ~280 s/step on this stack vs ~180 s/step for plain LoRA (≈55 % slower) for an expected quality gain of <1 pt at the LoRA rank we use (r=64). Stages 0–3 all use plain LoRA. DoRA may be re-evaluated at Stage 4 (GSPO RL), where wall-clock cost is amortised over fewer optimiser steps.
- License compatibility. The base inherits Apache-2.0 from Qwen. Training data: DISL is CC BY 4.0; the GitHub blue-chips are mostly MIT/Apache. Some Etherscan-derived source has an unspecified license; keep usage to research/personal contexts unless you've done your own license review.
## How to use the adapter
```python
from unsloth import FastLanguageModel  # import unsloth first so its patches apply
from peft import PeftModel

# Load the merged Stage-1 base in 4-bit, then attach this CPT adapter.
model, tok = FastLanguageModel.from_pretrained(
    "samscrack/Qwopus3.6-27B-stage1-merged",  # or your local merged base
    max_seq_length=8192,
    load_in_4bit=True,
)
model = PeftModel.from_pretrained(
    model,
    "samscrack/Qwopus3.6-27B-solidity-cpt-stageA",
)
FastLanguageModel.for_inference(model)

# Note: this is the CPT adapter; it produces Solidity-flavoured text,
# but for instruction-following you want the Stage 1 SFT adapter.
inputs = tok(
    "// SPDX-License-Identifier: MIT\npragma solidity ^0.8.20;\n\n",
    return_tensors="pt",
).to("cuda")
out = model.generate(**inputs, max_new_tokens=512)
print(tok.decode(out[0]))
```
## Citation
```bibtex
@misc{qwopus_solidity_cpt_stageA_2026,
  title  = {{Qwopus3.6-27B-solidity-cpt-stageA}: a continued-pretraining LoRA adapter for Solidity smart-contract LLM specialization},
  author = {samscrack},
  year   = {2026},
  url    = {https://huggingface.co/samscrack/Qwopus3.6-27B-solidity-cpt-stageA},
  note   = {Stage 0 of a 5-stage Solidity specialization pipeline.}
}
```
## Acknowledgments
- Unsloth for the LoRA / QLoRA training kernels and `UnslothTrainer`
- ASSERT-KTH for the DISL dataset
- Jackrong-llm-finetuning-guide for the existing 27B fine-tune recipe this work builds on
- The OpenZeppelin, Uniswap, Aave, Compound, Morpho, EigenLayer, Pendle, Solady, Seaport, LayerZero, ENS, Optimism, Arbitrum, and Polygon teams whose audited public source forms the GitHub blue-chip slice
## Changelog
- 2026-05-03 – Initial Stage A adapter (1,500 packed sequences, 1 epoch, plain LoRA r=64). Loss curve shows healthy convergence from a warmup peak of 0.67 to a 0.36–0.41 plateau by step 10.