GPT-J 6B Thinking SFT LoRA (No-Special-Tokens)

LoRA adapter only. Merged full-weight model: ping98k/gpt-j-6b-thinking-no-special-sft

What is this?

LoRA adapter (r=32) for GPT-J 6B, trained with SFT to produce chain-of-thought reasoning in a <think>...</think> block before the final answer.

Key design choice: <think>, </think>, <|im_start|>, and <|im_end|> are kept as plain text (ordinary BPE sub-word sequences), not registered as special tokens with dedicated embeddings. EOS is GPT-J's native <|endoftext|> (id 50256).
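One practical consequence of keeping the markers as plain text: a marker like <|im_end|> tokenizes into several BPE pieces, so it cannot be caught by single-id eos_token_id matching; stopping has to happen at the string level (which is what stop_strings does in the inference snippet below). The helper here is a hypothetical illustration of the same idea applied as post-hoc trimming of decoded text:

```python
# Hypothetical helper (not part of the model card's API): trims decoded text
# at the earliest occurrence of any plain-text stop marker, mirroring what
# string-level stopping has to do when markers are not single special tokens.
def truncate_at_stop(text: str, stops=("<|im_end|>", "<|endoftext|>")) -> str:
    """Cut `text` at the earliest occurrence of any stop marker."""
    cut = len(text)
    for stop in stops:
        idx = text.find(stop)
        if idx != -1:
            cut = min(cut, idx)
    return text[:cut]

raw = "<think>2 + 2 = 4</think>The answer is 4.<|im_end|>junk after stop"
print(truncate_at_stop(raw))  # -> "<think>2 + 2 = 4</think>The answer is 4."
```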

Inference with LoRA adapter

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE_MODEL = "EleutherAI/gpt-j-6b"
LORA_REPO  = "ping98k/gpt-j-6b-thinking-no-special-sft-lora"

tokenizer = AutoTokenizer.from_pretrained(LORA_REPO)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)
base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_cfg, device_map="auto"
)
model = PeftModel.from_pretrained(base, LORA_REPO)
model.eval()

messages = [{"role": "user", "content": "What is 2 + 2? Think step by step."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)

out = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    stop_strings=["<|im_end|>"],  # stop on plain-text end marker
    tokenizer=tokenizer,           # required for stop_strings
)
new_ids = out[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_ids, skip_special_tokens=False))

Note: stop_strings=["<|im_end|>"] is also baked into the model's generation_config.json. You still need to pass tokenizer=tokenizer to generate() for it to activate.
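Since the decoded output contains the <think>...</think> block inline as plain text, downstream code usually wants to separate the reasoning from the final answer. A minimal post-processing sketch, assuming the block format described above (the function name and regex are illustrative, not part of the repo):

```python
import re

# Hypothetical post-processing helper: split a decoded completion into the
# chain-of-thought and the final answer. Assumes the <think>...</think>
# format described in the model card.
def split_thinking(completion: str):
    match = re.search(r"<think>(.*?)</think>", completion, flags=re.DOTALL)
    if match is None:
        return None, completion.strip()  # no reasoning block found
    reasoning = match.group(1).strip()
    # Everything after </think> is the answer; drop a trailing stop marker.
    answer = completion[match.end():].replace("<|im_end|>", "").strip()
    return reasoning, answer

reasoning, answer = split_thinking("<think>2 + 2 = 4.</think>\n4<|im_end|>")
print(reasoning)  # -> "2 + 2 = 4."
print(answer)     # -> "4"
```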

Training details

See the merged model card: ping98k/gpt-j-6b-thinking-no-special-sft

Setting          Value
LoRA rank        32
LoRA alpha       32
Target modules   q_proj, k_proj, v_proj, out_proj, fc_in, fc_out
Epochs           5
Final loss       ~0.19
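The table's hyperparameters map directly onto the arguments of peft's LoraConfig. The sketch below keeps them in a plain dict (peft itself is not imported) and notes one detail worth knowing: standard LoRA scales the adapter update by alpha / r, so with alpha equal to the rank the learned delta is applied at full strength.

```python
# The card's LoRA hyperparameters as a plain dict, shaped like the keyword
# arguments peft's LoraConfig accepts. Values are taken from the table above.
lora_kwargs = {
    "r": 32,           # LoRA rank
    "lora_alpha": 32,  # scaling numerator
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "out_proj", "fc_in", "fc_out",
    ],
}

# Standard LoRA applies the low-rank update scaled by alpha / r; here that
# ratio is 1.0, i.e. the adapter delta is added at full strength.
scale = lora_kwargs["lora_alpha"] / lora_kwargs["r"]
print(scale)  # -> 1.0
```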