# Steerling 8B SFT LoRA Adapter (Tulu-3 English)

LoRA adapter trained on `darklord1611/tulu-3-sft-mixture-english-clean` using concept-aware SFT losses (L_token + L_rec + L_indep).
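As a rough illustration of how three loss terms like these are typically combined into a single training objective, here is a minimal sketch. The weighting coefficients `lam_rec` and `lam_indep` are assumptions for illustration, not values taken from the actual training code.

```python
# Hypothetical sketch: weighted sum of the three SFT loss terms.
# lam_rec and lam_indep are illustrative defaults, not the real values.
def combined_loss(l_token, l_rec, l_indep, lam_rec=1.0, lam_indep=1.0):
    """Total objective = token loss + weighted reconstruction + independence terms."""
    return l_token + lam_rec * l_rec + lam_indep * l_indep

print(combined_loss(2.0, 0.5, 0.25))  # 2.75 with unit weights
```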
## Files

- `lora_adapter/` – PEFT LoRA weights for the transformer backbone
- `head_weights.pt` – trained concept predictor and unknown-head weights
## Usage

```python
import torch
from peft import PeftModel
from steerling.inference.causal_diffusion import SteerlingGenerator

# Load the base model
generator = SteerlingGenerator.from_pretrained("guidelabs/steerling-8b", device="cuda")
model = generator.model

# Load the LoRA adapter onto the transformer backbone
model.transformer = PeftModel.from_pretrained(model.transformer, "lora_adapter_path")

# Load the trained head weights: walk each dotted key to its parent
# module, then copy the saved tensor into the matching parameter in place
head_state = torch.load("head_weights.pt", map_location="cuda")
for key, value in head_state.items():
    parts = key.split(".")
    obj = model
    for p in parts[:-1]:
        obj = getattr(obj, p)
    getattr(obj, parts[-1]).data.copy_(value)
```
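The dotted-key loading pattern above can be demonstrated without torch or the model itself. This is a self-contained sketch on plain Python objects; the class and attribute names are illustrative, not taken from Steerling.

```python
# Minimal sketch of the dotted-key state-loading pattern, using plain
# Python objects. Leaf.data stands in for a parameter tensor's .data,
# and the slice assignment mimics an in-place Tensor.copy_().
class Leaf:
    def __init__(self, data):
        self.data = data

class DummyModel:
    def __init__(self):
        self.head = type("Head", (), {})()   # illustrative submodule
        self.head.weight = Leaf([0.0, 0.0])  # illustrative parameter

def load_state(model, state):
    """Walk each dotted key to its parent object, then overwrite the leaf."""
    for key, value in state.items():
        parts = key.split(".")
        obj = model
        for p in parts[:-1]:
            obj = getattr(obj, p)
        getattr(obj, parts[-1]).data[:] = value  # in-place update

model = DummyModel()
load_state(model, {"head.weight": [1.0, 2.0]})
print(model.head.weight.data)  # [1.0, 2.0]
```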
## Training Details

- Base model: `guidelabs/steerling-8b`
- Dataset: `darklord1611/tulu-3-sft-mixture-english-clean`
- Method: LoRA (r=16, alpha=32) on `c_attn` and `c_proj`
- Losses: MDLM token loss + residual reconstruction + independence penalty
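The hyperparameters above correspond roughly to a PEFT `LoraConfig` like the following. This is a hedged sketch: the dropout value and task type are assumptions, as the card only states the rank, alpha, and target modules.

```python
from peft import LoraConfig

# Sketch of the LoRA configuration implied by the card's hyperparameters.
lora_config = LoraConfig(
    r=16,                                  # low-rank dimension (from the card)
    lora_alpha=32,                         # scaling factor (from the card)
    target_modules=["c_attn", "c_proj"],   # attention projections (from the card)
    lora_dropout=0.05,                     # assumed; not stated in the card
    task_type="CAUSAL_LM",                 # assumed; not stated in the card
)
```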
Model tree for `darklord1611/steerling-8b-sft-tulu3-ckpt-6900` (base model: `guidelabs/steerling-8b`)