Kimi-K2.5 OpenSpiel LoRA (r16)

A LoRA adapter fine-tuned on lambdago/Kimi-K2.5 for playing and reasoning about games in the OpenSpiel framework.

Model Details

  • Base Model: lambdago/Kimi-K2.5
  • Fine-tuning Method: LoRA (Low-Rank Adaptation)
  • LoRA Rank: 16
  • LoRA Alpha: 32
  • Task: Causal Language Modeling
  • Languages: Multilingual

Intended Use

This adapter is designed for reasoning and decision-making in OpenSpiel game environments, including board games, card games, and other sequential decision-making tasks supported by the OpenSpiel framework.
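As an illustration of such a task, the sketch below formats an OpenSpiel-style game state and its legal actions into a text prompt. The template and helper function are hypothetical — the adapter's actual training prompt format is not documented here.

def build_game_prompt(game_name, state_str, legal_actions):
    # Hypothetical helper: render a game observation as a plain-text prompt.
    # The layout is illustrative only, not the adapter's training format.
    actions = "\n".join(f"- {a}" for a in legal_actions)
    return (
        f"Game: {game_name}\n"
        f"Current state:\n{state_str}\n"
        f"Legal actions:\n{actions}\n"
        "Choose the best action and explain your reasoning."
    )

prompt = build_game_prompt(
    "tic_tac_toe",
    "x.o\n.x.\n...",
    ["o(1,0)", "o(1,2)", "o(2,0)", "o(2,1)", "o(2,2)"],
)
print(prompt)

The resulting string can be passed directly to the tokenizer in the usage example below.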

How to Use

from peft import PeftModel
from transformers import AutoModelForCausalLM, AutoTokenizer

base_model_id = "lambdago/Kimi-K2.5"
adapter_id = "bambuuai/Kimi-K2.5-openspiel-lora-r16"

tokenizer = AutoTokenizer.from_pretrained(base_model_id)

# Load the base model in its native precision and let transformers
# place it across the available devices.
model = AutoModelForCausalLM.from_pretrained(
    base_model_id,
    torch_dtype="auto",
    device_map="auto",
)

# Attach the LoRA adapter on top of the base weights.
model = PeftModel.from_pretrained(model, adapter_id)
model.eval()

inputs = tokenizer("Your prompt here", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=256)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Training Configuration

  • LoRA Rank (r): 16
  • LoRA Alpha: 32
  • LoRA Dropout: 0.05
  • Task Type: CAUSAL_LM
  • Target Modules: q_a_proj, q_b_proj, kv_a_proj_with_mqa, kv_b_proj, o_proj, gate_proj, up_proj, down_proj
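The configuration above maps directly onto a peft LoraConfig. A minimal sketch, assuming the peft library is installed (the comments on the module groups reflect the DeepSeek-style multi-head latent attention naming and are an interpretation, not documented here):

from peft import LoraConfig

# LoRA configuration mirroring the values listed above.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
    target_modules=[
        "q_a_proj", "q_b_proj",               # query projections
        "kv_a_proj_with_mqa", "kv_b_proj",    # key/value projections
        "o_proj",                             # attention output
        "gate_proj", "up_proj", "down_proj",  # MLP
    ],
)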

License

MIT
