Dyck-DeepSeek-R1-LoRA

A LoRA adapter fine-tuned on DeepSeek-R1-Distill-Qwen-1.5B for Dyck-language completion with step-by-step reasoning. The model completes balanced bracket sequences (Dyck words) and can optionally explain its reasoning.

Model description

  • Base model: unsloth/DeepSeek-R1-Distill-Qwen-1.5B
  • Method: LoRA (r=64, alpha=128), merged and saved as 16-bit
  • Task: Given a prefix of opening brackets, output the minimal closing brackets to form a valid Dyck sequence; training uses reasoning + final answer format
  • Bracket pairs: (), [], {}, <>, ⟨⟩, ⟦⟧, ⦃⦄, ⦅⦆
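The task itself reduces to a stack exercise: push each opener, then close the unmatched openers in reverse order. A minimal reference implementation (illustrative only, not part of this repo) for the bracket pairs above:

```python
# Minimal reference completer for the Dyck task (illustrative, not shipped with this repo).
PAIRS = {"(": ")", "[": "]", "{": "}", "<": ">",
         "⟨": "⟩", "⟦": "⟧", "⦃": "⦄", "⦅": "⦆"}
CLOSERS = set(PAIRS.values())

def complete_dyck(prefix: str) -> str:
    """Return the minimal closing suffix that balances `prefix`."""
    stack = []
    for ch in prefix:
        if ch in PAIRS:
            stack.append(ch)
        elif ch in CLOSERS:
            if not stack or PAIRS[stack.pop()] != ch:
                raise ValueError(f"unbalanced prefix at {ch!r}")
    # Close the remaining openers in reverse (LIFO) order.
    return "".join(PAIRS[ch] for ch in reversed(stack))

print(complete_dyck("<[{("))  # → ")}]>"
```

This is the ground-truth computation the model is trained to reproduce, reasoning step by step instead of running a stack directly.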

Training

  • Dataset: ~40k Dyck completion examples (user prompt + assistant reasoning + completion)
  • Epochs: 4
  • Effective batch size: 64 (per_device 16 × gradient_accumulation 4)
  • Max sequence length: 2048
  • Optimizer: AdamW with a cosine learning-rate schedule (peak LR 1e-5, 8% warmup, weight decay 0.005)
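Each training example pairs a user prompt with an assistant turn whose reasoning ends in the final answer. A sketch of how one such example might be assembled; the exact prompt and reasoning template used during fine-tuning is an assumption, not taken from the dataset:

```python
# Illustrative sketch of a single training example. The exact templates used
# during fine-tuning are assumptions; only the FINAL ANSWER format is documented.
def make_example(prefix: str, closing: str) -> dict:
    prompt = (
        "Complete the following Dyck language sequence by adding the minimal "
        f"necessary closing brackets.\n\nSequence: {prefix}"
    )
    reasoning = f"Push each opener onto a stack, then pop in reverse order: {closing}."
    answer = f"{reasoning}\nFINAL ANSWER: {prefix}{closing}"
    return {"messages": [
        {"role": "user", "content": prompt},
        {"role": "assistant", "content": answer},
    ]}

ex = make_example("<[{(", ")}]>")
print(ex["messages"][1]["content"].splitlines()[-1])  # → FINAL ANSWER: <[{()}]>
```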

Why "Inference Providers" is empty

This repo is a LoRA adapter, not a full standalone model. Hugging Face's Inference Providers typically deploy full causal language models, so adapter-only repos are often not listed. You can still use the model in two ways:

  1. Run inference yourself (code below).
  2. Deploy a Space using the Gradio app in the project's space/ folder so the model is usable on HF via the Space.

Usage

Transformers + PEFT (adapter)

Load the base model and apply this adapter:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

model_id = "akashdutta1030/dyck-deepseek-r1-lora"
base_id = "unsloth/DeepSeek-R1-Distill-Qwen-1.5B"

tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

model = AutoModelForCausalLM.from_pretrained(
    base_id,
    torch_dtype=torch.float16,
    device_map="auto",
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(model, model_id)
model.eval()

prompt = """Complete the following Dyck language sequence by adding the minimal necessary closing brackets.

Sequence: <[{( 

Rules:
- Add only the closing brackets needed to match all unmatched opening brackets
- Do not add any extra bracket pairs beyond what is required

Provide only the complete valid sequence."""

messages = [{"role": "user", "content": prompt}]
text = tokenizer.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=2048).to(model.device)

outputs = model.generate(
    **inputs,
    max_new_tokens=256,
    do_sample=True,
    temperature=0.1,
    pad_token_id=tokenizer.pad_token_id or tokenizer.eos_token_id,
)
response = tokenizer.decode(outputs[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(response)  # e.g. "<[{()}]>" (closing suffix ")}]>")

Use via a Hugging Face Space

To make the model usable on Hugging Face without relying on Inference Providers:

  1. Create a new Space (Gradio, GPU optional).
  2. Copy app.py and requirements.txt from the Dyck_Task repo's space/ folder into the Space repo.
  3. Push; the Space will load this adapter and run Dyck completion in the browser.

Unsloth (4-bit inference)

For 4-bit inference, use Unsloth: load the base model in 4-bit and apply the LoRA adapter from this repo.

Evaluation

The model is trained to output reasoning (e.g. stack-based steps) followed by FINAL ANSWER: <complete Dyck sequence>. Exact match on the completed sequence is used as the correctness metric.
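Concretely, scoring can be done by extracting the text after the FINAL ANSWER: marker and comparing it to the reference sequence. A sketch of such a metric; the parsing details are an assumption, not the project's actual eval script:

```python
# Sketch of the exact-match metric: pull the text after "FINAL ANSWER:" and
# compare it to the reference. Parsing details are assumed, not from the repo.
def extract_final_answer(output: str) -> str:
    marker = "FINAL ANSWER:"
    idx = output.rfind(marker)
    if idx == -1:
        return ""
    tail = output[idx + len(marker):].strip()
    # Keep only the first line after the marker (the completed sequence).
    return tail.splitlines()[0].strip() if tail else ""

def exact_match(output: str, reference: str) -> bool:
    return extract_final_answer(output) == reference

print(exact_match("push <, [, {, ( onto the stack...\nFINAL ANSWER: <[{()}]>",
                  "<[{()}]>"))  # → True
```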

License

Apache 2.0

Citation

If you use this model, please cite the base model (DeepSeek-R1-Distill-Qwen) and this fine-tuning repo.
