# GPT-J 6B – Thinking SFT LoRA (No-Special-Tokens)
LoRA adapter only. Merged full-weight model: `ping98k/gpt-j-6b-thinking-no-special-sft`
## What is this?
LoRA adapter (r=32) for GPT-J 6B, trained with SFT to produce chain-of-thought
reasoning in a `<think>...</think>` block before the final answer.

Key design choice: `<think>`, `</think>`, `<|im_start|>`, and `<|im_end|>` are
treated as plain text (ordinary BPE sub-word tokens), NOT added as special
tokens with dedicated embeddings. EOS is GPT-J's native `<|endoftext|>` (id 50256).
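Because the markers are ordinary text, a chat-formatted prompt is just string concatenation. A minimal sketch, assuming a ChatML-style layout (the authoritative template ships with the tokenizer and may differ in detail; `build_prompt` is a hypothetical helper, not part of the repo):

```python
def build_prompt(user_message: str) -> str:
    """Assemble a ChatML-style prompt from plain strings.

    <|im_start|>/<|im_end|> are ordinary text here, not special
    tokens, so no tokenizer-level handling is needed.
    """
    return (
        f"<|im_start|>user\n{user_message}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )

prompt = build_prompt("What is 2 + 2? Think step by step.")
print(prompt)
```

In practice, prefer `tokenizer.apply_chat_template(...)` as shown below, which uses the template actually stored with the tokenizer.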
## Inference with LoRA adapter
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import PeftModel
import torch

BASE_MODEL = "EleutherAI/gpt-j-6b"
LORA_REPO = "ping98k/gpt-j-6b-thinking-no-special-sft-lora"

tokenizer = AutoTokenizer.from_pretrained(LORA_REPO)
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token

bnb_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_quant_type="nf4",
)

base = AutoModelForCausalLM.from_pretrained(
    BASE_MODEL, quantization_config=bnb_cfg, device_map="auto"
)
model = PeftModel.from_pretrained(base, LORA_REPO)
model.eval()

messages = [{"role": "user", "content": "What is 2 + 2? Think step by step."}]
prompt = tokenizer.apply_chat_template(
    messages, tokenize=False, add_generation_prompt=True
)

inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
out = model.generate(
    **inputs,
    max_new_tokens=512,
    temperature=0.7,
    top_p=0.9,
    do_sample=True,
    eos_token_id=tokenizer.eos_token_id,
    pad_token_id=tokenizer.pad_token_id,
    stop_strings=["<|im_end|>"],  # stop on the plain-text end marker
    tokenizer=tokenizer,          # required for stop_strings
)

# Decode only the newly generated tokens
new_ids = out[0][inputs["input_ids"].shape[-1]:]
print(tokenizer.decode(new_ids, skip_special_tokens=False))
```
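Since `<think>`/`</think>` come back as literal text in the decoded output, the reasoning and the final answer can be separated with plain string handling. A minimal sketch (`split_thinking` is a hypothetical helper, not part of the repo):

```python
def split_thinking(text: str) -> tuple[str, str]:
    """Split generated text into (reasoning, answer).

    Finds the literal <think>...</think> span; everything after
    </think>, with the trailing <|im_end|> marker stripped, is
    treated as the final answer.
    """
    start, end = "<think>", "</think>"
    i = text.find(start)
    j = text.find(end)
    if i == -1 or j == -1:
        return "", text.strip()  # no reasoning block found
    reasoning = text[i + len(start):j].strip()
    answer = text[j + len(end):].replace("<|im_end|>", "").strip()
    return reasoning, answer

reasoning, answer = split_thinking("<think>2 + 2 = 4</think>\n4<|im_end|>")
# reasoning == "2 + 2 = 4", answer == "4"
```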
> **Note:** `stop_strings=["<|im_end|>"]` is also baked into the model's `generation_config.json`. You still need to pass `tokenizer=tokenizer` to `generate()` for it to activate.
## Training details

See the merged model card for full details: `ping98k/gpt-j-6b-thinking-no-special-sft`
| Setting | Value |
|---|---|
| LoRA rank | 32 |
| LoRA alpha | 32 |
| Target modules | q_proj, k_proj, v_proj, out_proj, fc_in, fc_out |
| Epochs | 5 |
| Final loss | ~0.19 |
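The table above maps onto a `peft` configuration roughly as follows. This is a sketch reconstructed from the listed settings, not the actual training script; anything not in the table (e.g. dropout, bias handling) is left at library defaults:

```python
from peft import LoraConfig

# Values taken from the training-details table above
lora_cfg = LoraConfig(
    r=32,             # LoRA rank
    lora_alpha=32,    # LoRA alpha
    target_modules=[
        "q_proj", "k_proj", "v_proj",   # attention projections
        "out_proj", "fc_in", "fc_out",  # attention output + MLP
    ],
    task_type="CAUSAL_LM",
)
```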