# Dyck-DeepSeek-R1-LoRA
Fine-tuned LoRA on DeepSeek-R1-Distill-Qwen-1.5B for Dyck language completion with step-by-step reasoning. The model completes balanced bracket sequences (Dyck words) and optionally explains its reasoning.
## Model description
- Base model: `unsloth/DeepSeek-R1-Distill-Qwen-1.5B`
- Method: LoRA (r=64, alpha=128), merged and saved as 16-bit
- Task: Given a prefix of opening brackets, output the minimal closing brackets to form a valid Dyck sequence; training uses reasoning + final answer format
- Bracket pairs: `()`, `[]`, `{}`, `<>`, `⟨⟩`, `⟦⟧`, `⦃⦄`, `⦅⦆`
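The minimal-completion task can be illustrated with a small stack-based reference solver (an illustrative sketch, not code from this repo):

```python
# Reference solver for the Dyck completion task: given a prefix,
# emit the minimal closing brackets that balance it.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">",
         "⟨": "⟩", "⟦": "⟧", "⦃": "⦄", "⦅": "⦆"}
CLOSERS = {v: k for k, v in PAIRS.items()}

def complete_dyck(prefix: str) -> str:
    """Return the minimal closing suffix that balances `prefix`."""
    stack = []
    for ch in prefix:
        if ch in PAIRS:
            stack.append(ch)
        elif ch in CLOSERS:
            if not stack or stack[-1] != CLOSERS[ch]:
                raise ValueError(f"unbalanced prefix at {ch!r}")
            stack.pop()
    # Close the remaining unmatched openers in reverse (LIFO) order.
    return "".join(PAIRS[ch] for ch in reversed(stack))

print(complete_dyck("<[{("))  # -> )}]>
```

The model is trained to reproduce this stack discipline in natural-language reasoning steps before emitting the final sequence.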
## Training
- Dataset: ~40k Dyck completion examples (user prompt + assistant reasoning + completion)
- Epochs: 4
- Effective batch size: 64 (per_device 16 × gradient_accumulation 4)
- Max sequence length: 2048
- Optimizer: AdamW, cosine LR, 1e-5, warmup 8%, weight decay 0.005
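As a rough illustration of how such prompt/reasoning/answer triples could be produced, here is a toy generator (a hypothetical sketch — the actual dataset pipeline is not published here, and only four of the eight bracket types are shown for brevity):

```python
import random

# Hypothetical example generator; restricted to four bracket types for brevity.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}

def random_dyck_prefix(max_depth: int = 6, rng=random) -> str:
    """Sample a run of nested opening brackets."""
    depth = rng.randint(1, max_depth)
    return "".join(rng.choice(list(PAIRS)) for _ in range(depth))

def make_example(prefix: str) -> dict:
    """Build one training example: user prompt, stack-style reasoning, final answer."""
    closers = [PAIRS[c] for c in reversed(prefix)]
    reasoning = " -> ".join(f"close {c} with {PAIRS[c]}" for c in reversed(prefix))
    return {
        "user": f"Complete the Dyck sequence: {prefix}",
        "assistant": f"{reasoning}\nFINAL ANSWER: {prefix}{''.join(closers)}",
    }

print(make_example("<[{")["assistant"].splitlines()[-1])  # FINAL ANSWER: <[{}]>
```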
## Why "Inference Providers" is empty
This repo is a LoRA adapter, not a full model. Hugging Face's Inference Providers usually deploy full causal LMs, so they often do not list adapter-only repos. You can still use the model in two ways:
- Run inference yourself (code below).
- Deploy a Space using the Gradio app in the project's `space/` folder so the model is usable on HF via the Space.
## Usage
### Transformers + PEFT (adapter)
Load the base model and apply this adapter:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

model_id = "akashdutta1030/dyck-deepseek-r1-lora"
base_id = "unsloth/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, model_id)
model.eval()

prompt = """Complete the following Dyck language sequence by adding the minimal necessary closing brackets.
Sequence: <[{(
Rules:
- Add only the closing brackets needed to match all unmatched opening brackets
- Do not add any extra bracket pairs beyond what is required
Provide only the complete valid sequence."""

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.1,
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)  # e.g. ")}]>"
```
### Use via a Hugging Face Space
To make the model usable on Hugging Face without relying on Inference Providers:
- Create a new Space (Gradio, GPU optional).
- Copy `app.py` and `requirements.txt` from the Dyck_Task repo's `space/` folder into the Space repo.
- Push; the Space will load this adapter and run Dyck completion in the browser.
### Unsloth (4-bit inference)
For 4-bit inference, use Unsloth: load the base model in 4-bit and apply the LoRA adapter from this repo.
## Evaluation
The model is trained to output reasoning (e.g. stack-based steps) followed by `FINAL ANSWER: <complete Dyck sequence>`. Exact match on the completed sequence is used as the correctness metric.
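That metric can be sketched as follows (hypothetical helper names; the repo's actual evaluation script may differ):

```python
import re
from typing import Optional

def extract_final_answer(text: str) -> Optional[str]:
    """Return the token after the last 'FINAL ANSWER:' marker, if any."""
    matches = re.findall(r"FINAL ANSWER:\s*(\S+)", text)
    return matches[-1] if matches else None

def exact_match(model_output: str, reference: str) -> bool:
    """Correct iff the extracted sequence exactly equals the reference."""
    pred = extract_final_answer(model_output)
    return pred is not None and pred == reference

output = "Push <, [, {, ( onto the stack, then pop in reverse.\nFINAL ANSWER: <[{()}]>"
print(exact_match(output, "<[{()}]>"))  # True
```

Taking the last marker handles outputs where the reasoning itself mentions the phrase before the final line.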
## License
Apache 2.0
## Citation
If you use this model, please cite the base model (DeepSeek-R1-Distill-Qwen) and this fine-tuning repo.