# Dyck-DeepSeek-R1-LoRA
Fine-tuned LoRA on DeepSeek-R1-Distill-Qwen-1.5B for Dyck language completion with step-by-step reasoning. The model completes balanced bracket sequences (Dyck words) and optionally explains its reasoning.
## Model description
- Base model: `unsloth/DeepSeek-R1-Distill-Qwen-1.5B`
- Method: LoRA (r=64, alpha=128), merged and saved as 16-bit
- Task: Given a prefix of opening brackets, output the minimal closing brackets to form a valid Dyck sequence; training uses reasoning + final answer format
- Bracket pairs: `()`, `[]`, `{}`, `<>`, `⟨⟩`, `⟦⟧`, `⦃⦄`, `⦅⦆`
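The minimal-completion task can be illustrated with a small stack-based reference solver (an illustrative sketch, not code from this repo):

```python
# Reference solver for the Dyck completion task: given a prefix,
# emit the minimal closing brackets that balance it.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">",
         "⟨": "⟩", "⟦": "⟧", "⦃": "⦄", "⦅": "⦆"}
CLOSERS = {v: k for k, v in PAIRS.items()}

def complete_dyck(prefix: str) -> str:
    """Return the minimal closing suffix that balances `prefix`."""
    stack = []
    for ch in prefix:
        if ch in PAIRS:
            stack.append(ch)
        elif ch in CLOSERS:
            if not stack or stack[-1] != CLOSERS[ch]:
                raise ValueError(f"unbalanced prefix at {ch!r}")
            stack.pop()
    # Close the remaining unmatched openers in reverse (LIFO) order.
    return "".join(PAIRS[ch] for ch in reversed(stack))

print(complete_dyck("<[{("))  # -> )}]>
```

The model is trained to reproduce this stack discipline in natural-language reasoning steps before emitting the final sequence.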
## Training
- Dataset: ~40k Dyck completion examples (user prompt + assistant reasoning + completion)
- Epochs: 4
- Effective batch size: 64 (per_device 16 × gradient_accumulation 4)
- Max sequence length: 2048
- Optimizer: AdamW, cosine LR, 1e-5, warmup 8%, weight decay 0.005
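As a rough illustration of how such prompt/reasoning/answer triples could be produced, here is a toy generator (a hypothetical sketch — the actual dataset pipeline is not published here, and only four of the eight bracket types are shown for brevity):

```python
import random

# Hypothetical example generator; restricted to four bracket types for brevity.
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">"}

def random_dyck_prefix(max_depth: int = 6, rng=random) -> str:
    """Sample a run of nested opening brackets."""
    depth = rng.randint(1, max_depth)
    return "".join(rng.choice(list(PAIRS)) for _ in range(depth))

def make_example(prefix: str) -> dict:
    """Build one training example: user prompt, stack-style reasoning, final answer."""
    closers = [PAIRS[c] for c in reversed(prefix)]
    reasoning = " -> ".join(f"close {c} with {PAIRS[c]}" for c in reversed(prefix))
    return {
        "user": f"Complete the Dyck sequence: {prefix}",
        "assistant": f"{reasoning}\nFINAL ANSWER: {prefix}{''.join(closers)}",
    }

print(make_example("<[{")["assistant"].splitlines()[-1])  # FINAL ANSWER: <[{}]>
```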
## Why "Inference Providers" is empty
This repo is a LoRA adapter, not a full model. Hugging Face's Inference Providers usually deploy full causal LMs, so they often do not list adapter-only repos. You can still use the model in two ways:
- Run inference yourself (code below).
- Deploy a Space using the Gradio app in the project's `space/` folder so the model is usable on HF via the Space.
## Usage
### Transformers + PEFT (adapter)
Load the base model and apply this adapter:
```python
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

model_id = "akashdutta1030/dyck-deepseek-r1-lora"
base_id = "unsloth/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, model_id)
model.eval()

prompt = """Complete the following Dyck language sequence by adding the minimal necessary closing brackets.
Sequence: <[{(
Rules:
- Add only the closing brackets needed to match all unmatched opening brackets
- Do not add any extra bracket pairs beyond what is required
Provide only the complete valid sequence."""

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.1,
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)  # e.g. ")}]>"
```
### Use via a Hugging Face Space
To make the model usable on Hugging Face without relying on Inference Providers:
- Create a new Space (Gradio, GPU optional).
- Copy `app.py` and `requirements.txt` from the Dyck_Task repo's `space/` folder into the Space repo.
- Push; the Space will load this adapter and run Dyck completion in the browser.
### Unsloth (4-bit inference)
For 4-bit inference, use Unsloth: load the base model in 4-bit and apply the LoRA adapter from this repo.
## Evaluation
The model is trained to output reasoning (e.g. stack-based steps) followed by `FINAL ANSWER: <complete Dyck sequence>`. Exact match on the completed sequence is used as the correctness metric.
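That metric can be sketched as follows (hypothetical helper names; the repo's actual evaluation script may differ):

```python
import re
from typing import Optional

def extract_final_answer(text: str) -> Optional[str]:
    """Return the token after the last 'FINAL ANSWER:' marker, if any."""
    matches = re.findall(r"FINAL ANSWER:\s*(\S+)", text)
    return matches[-1] if matches else None

def exact_match(model_output: str, reference: str) -> bool:
    """Correct iff the extracted sequence exactly equals the reference."""
    pred = extract_final_answer(model_output)
    return pred is not None and pred == reference

output = "Push <, [, {, ( onto the stack, then pop in reverse.\nFINAL ANSWER: <[{()}]>"
print(exact_match(output, "<[{()}]>"))  # True
```

Taking the last marker handles outputs where the reasoning itself mentions the phrase before the final line.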
## License
Apache 2.0
## Citation
If you use this model, please cite the base model (DeepSeek-R1-Distill-Qwen) and this fine-tuning repo.