
Semantic-Perinucleus-v1

A TinyLoRA adapter for meta-llama/Llama-3.1-8B-Instruct, trained with GRPO to increase the semantic alignment of short answers with a target “alt” belief (cosine-similarity reward computed with sentence-transformers/all-MiniLM-L6-v2).

Training

  • Method: Group Relative Policy Optimization (TRL GRPOTrainer)
  • Adapter: TinyLoRA — 13 trainable parameters (u=13, weight_tying=1.0, r=2, targets q_proj, v_proj)
  • Reward: Cosine similarity between each sampled completion and the target Answer_Alt string
  • Base model: Llama 3.1 8B Instruct (frozen; only adapter weights trained)

Usage

import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base = "meta-llama/Llama-3.1-8B-Instruct"
adapter = "zjianyi/Semantic-Perinucleus-v1"

tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base, torch_dtype=torch.bfloat16, device_map="auto"
)
model = PeftModel.from_pretrained(model, adapter)

You need a Hugging Face token with access to the gated Llama 3.1 model.
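Once loaded, the adapted model is queried like any transformers causal LM. The sketch below shows the generation pattern; `sshleifer/tiny-gpt2` is a stand-in checkpoint used only so the snippet runs without gated access — for real use, substitute the `PeftModel` assembled in the snippet above.

```python
# Minimal generation sketch. The tiny checkpoint is a placeholder;
# swap in the Llama base + adapter loaded earlier in this card.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "sshleifer/tiny-gpt2"  # stand-in for the adapted Llama model
tok = AutoTokenizer.from_pretrained(model_id)
mdl = AutoModelForCausalLM.from_pretrained(model_id)

prompt = "In one sentence, what is photosynthesis?"
ids = tok(prompt, return_tensors="pt")
out = mdl.generate(**ids, max_new_tokens=32, do_sample=False)
# Decode only the newly generated tokens, not the prompt.
text = tok.decode(out[0][ids["input_ids"].shape[-1]:], skip_special_tokens=True)
print(text)
```

Since the adapter's effect is a subtle semantic nudge, compare outputs with and without `PeftModel.from_pretrained` applied when evaluating it.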

Limitations

  • Extremely small adapter; effects on downstream answers are subtle. Evaluate on your task before relying on it.
  • Intended for research on semantic / preference nudges, not factual guarantees.

Citation

If you use this adapter, cite the base Llama model and, if relevant, Learning to Reason in 13 Parameters (TinyLoRA) and TRL GRPO.
