Qwen3-8B fine-tuned for generative product recommendation via hierarchical semantic identifiers. The model generates 4-level Semantic IDs (<|sid_start|><|A#|><|B#|><|C#|><|D#|><|sid_end|>) given product descriptions, purchase histories, or co-purchase contexts.
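To make the output format concrete, here is a minimal sketch of parsing a generated Semantic ID string into its four levels. It assumes that `#` in each level token stands for a non-negative integer code (for example `<|A12|>`); the helper name `parse_sid` and the sample string in the usage note are hypothetical and not part of the released model.

```python
import re
from typing import Optional, Tuple

# Assumption: each level token looks like <|A12|>, i.e. the level letter
# followed by an integer code. Adjust the pattern if the real tokens differ.
SID_PATTERN = re.compile(
    r"<\|sid_start\|><\|A(\d+)\|><\|B(\d+)\|><\|C(\d+)\|><\|D(\d+)\|><\|sid_end\|>"
)


def parse_sid(text: str) -> Optional[Tuple[int, ...]]:
    """Return the (A, B, C, D) codes of the first well-formed SID in `text`.

    Returns None when no complete 4-level SID is present, for example when
    generation stopped early or skipped a level.
    """
    match = SID_PATTERN.search(text)
    if match is None:
        return None
    return tuple(int(group) for group in match.groups())
```

Under these assumptions, `parse_sid("<|sid_start|><|A3|><|B17|><|C0|><|D42|><|sid_end|>")` yields the four codes as a tuple, and a malformed or truncated SID yields `None`, which is convenient for counting invalid generations separately from wrong ones.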
This is the larger of the two models in a controlled scale comparison (1.8B vs 8B). The 8B model improves on the 1.8B model in every task category reported below.
Hierarchical SID prediction accuracy (A-level: greedy decoding; Exact: beam search, k=10):
| Task | A-level | Exact (beam k=10) |
|---|---|---|
| title → SID | 66.0% | 3.3% |
| description → SID | 65.8% | 3.3% |
| features → SID | 62.1% | 2.7% |
| seq_last_2 | 8.7% | 6.8% |
| seq_last_3 | 10.7% | 6.0% |
| seq_last_5 | 9.7% | 6.0% |
| copurchase_backward | 5.7% | 1.8% |
| copurchase_forward | 6.0% | 2.0% |
Evaluation: 3,000 samples per task, 11 task types.
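The card does not spell out how the two columns are computed, so the following is only a sketch of one plausible reading: A-level accuracy counts a hit when the first-level code of the top prediction matches the target, and Exact counts a hit when the full 4-level target appears among the top k beam candidates. The function names and the tuple representation of an SID are assumptions made for illustration.

```python
from typing import Sequence, Tuple

SID = Tuple[int, int, int, int]  # (A, B, C, D) codes; representation is assumed


def a_level_accuracy(top1: Sequence[SID], targets: Sequence[SID]) -> float:
    """Fraction of samples whose predicted A-level code equals the target's."""
    hits = sum(pred[0] == target[0] for pred, target in zip(top1, targets))
    return hits / len(targets)


def exact_hit_at_k(
    beams: Sequence[Sequence[SID]], targets: Sequence[SID], k: int = 10
) -> float:
    """Fraction of samples whose full target SID is among the first k candidates."""
    hits = sum(target in cands[:k] for cands, target in zip(beams, targets))
    return hits / len(targets)
```

With this reading, the two columns use different decoding setups and are not nested: a sample can miss at A-level under greedy decoding yet still be recovered exactly somewhere in the beam.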
A-level accuracy by task category, 1.8B vs 8B (Δ in percentage points):
| Task category | 1.8B | 8B | Δ |
|---|---|---|---|
| Text → SID (avg) | 59.9% | 64.6% | +4.7 |
| Sequential (avg) | 7.0% | 9.7% | +2.7 |
| Co-purchase (avg) | 5.5% | 5.8% | +0.3 |
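As a quick consistency check, the 8B column of the comparison table appears to match the mean of the per-task A-level accuracies listed earlier, and the Δ column matches the difference between the two model columns. The sketch below verifies this within rounding tolerance. It assumes each category average is taken over exactly the per-task rows shown in this card; the dictionary names and the `check_category` helper are illustrative only.

```python
from statistics import mean

# Per-task A-level accuracies (%) for the 8B model, copied from the first table.
a_level_8b = {
    "Text → SID": [66.0, 65.8, 62.1],
    "Sequential": [8.7, 10.7, 9.7],
    "Co-purchase": [5.7, 6.0],
}

# Reported category rows: (1.8B %, 8B %, delta in percentage points).
reported = {
    "Text → SID": (59.9, 64.6, 4.7),
    "Sequential": (7.0, 9.7, 2.7),
    "Co-purchase": (5.5, 5.8, 0.3),
}


def check_category(name: str, tol: float = 0.06) -> bool:
    """True if the reported 8B average and delta agree with the raw rows.

    The tolerance absorbs rounding of the published figures to one decimal.
    """
    small, large, delta = reported[name]
    avg_ok = abs(mean(a_level_8b[name]) - large) <= tol
    delta_ok = abs((large - small) - delta) <= tol
    return avg_ok and delta_ok


for category in reported:
    print(category, "consistent" if check_category(category) else "mismatch")
```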
```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Load the fine-tuned checkpoint and its tokenizer from the Hugging Face Hub.
model = AutoModelForCausalLM.from_pretrained("kalistratov/qwen3-8b-semantic-ids")
tokenizer = AutoTokenizer.from_pretrained("kalistratov/qwen3-8b-semantic-ids")
```
Master's thesis, Moscow Institute of Physics and Technology (MIPT), 2026.