Steerling 8B – SFT LoRA Adapter (Tulu-3 English)

A LoRA adapter trained on darklord1611/tulu-3-sft-mixture-english-clean using the concept-aware SFT losses (L_token + L_rec + L_indep).
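The three loss terms are typically combined as a weighted sum. The exact weighting used for this adapter is not documented here, so the sketch below is illustrative only: the function name and default weights are assumptions, and it shows just the shape of the objective.

```python
def concept_sft_loss(l_token, l_rec, l_indep, w_rec=1.0, w_indep=1.0):
    """Hypothetical combination of the three concept-aware SFT terms:
    token-level loss (l_token), concept reconstruction (l_rec), and
    concept independence (l_indep). Weights are illustrative, not the
    values used to train this adapter."""
    return l_token + w_rec * l_rec + w_indep * l_indep

# Example with equal unit weights on the auxiliary terms:
total = concept_sft_loss(2.0, 0.5, 0.25)  # -> 2.75
```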

Files

  • lora_adapter/ – PEFT LoRA weights for the transformer backbone
  • head_weights.pt – trained concept predictor + unknown head weights

Usage

from steerling.inference.causal_diffusion import SteerlingGenerator
from peft import PeftModel
import torch

# Load the base model
generator = SteerlingGenerator.from_pretrained("guidelabs/steerling-8b", device="cuda")
model = generator.model

# Wrap the transformer backbone with the LoRA adapter
# (replace "lora_adapter_path" with the local path to lora_adapter/)
model.transformer = PeftModel.from_pretrained(model.transformer, "lora_adapter_path")

# Load the concept predictor + unknown head weights.
# Keys are dotted attribute paths into the model; walk each path and
# copy the saved tensor into the matching parameter in place.
head_state = torch.load("head_weights.pt", map_location="cuda", weights_only=True)
for key, value in head_state.items():
    *parents, leaf = key.split(".")
    obj = model
    for name in parents:
        obj = getattr(obj, name)
    getattr(obj, leaf).data.copy_(value)
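The head-loading loop above walks dotted attribute paths (as found in a state dict) down to a leaf attribute. As a minimal, torch-free illustration of that pattern (the `Node` class and key names here are hypothetical):

```python
class Node:
    """Minimal stand-in for a nested module hierarchy."""
    pass

def set_by_dotted_path(root, dotted_key, value):
    # Split "a.b.c" into parents ["a", "b"] and leaf "c",
    # walk the parents, then assign the value at the leaf.
    *parents, leaf = dotted_key.split(".")
    obj = root
    for name in parents:
        obj = getattr(obj, name)
    setattr(obj, leaf, value)

model = Node()
model.head = Node()
set_by_dotted_path(model, "head.weight", [1.0, 2.0])
# model.head.weight is now [1.0, 2.0]
```

Note one difference: the Usage snippet copies into the existing tensor with `.data.copy_(...)` rather than rebinding the attribute, which preserves the parameter object's identity (and any references an optimizer may hold to it).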


Model: darklord1611/steerling-8b-sft-tulu3-ckpt-6900 (adapter for guidelabs/steerling-8b)