# XAT928/gemma-3-1b-it-jp-lora-20250909
LoRA adapter for google/gemma-3-1b-it (Japanese SFT).
## Summary

- Base model: google/gemma-3-1b-it
- Adapter type: LoRA (PEFT; saved via `save_pretrained`)
- Train steps: 200 (stopped early via `max_steps`)
- BF16: True
- Dataset (after preprocessing): izumi-lab/llm-japanese-dataset-vanilla:train (281,334 usable examples after cleanup)
- Local file ichikara-instruction-003-003-1.json had 7,599 corrupted lines (0 usable) and was skipped
- Tokenizer note: `pad_token = eos_token`; `eos_token_id` updated to 1 (see the sketch after this list)
- Save path: /content/out_stage1/adapter
- Exported: 2025-09-09 07:06:33
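A minimal sketch of reproducing the tokenizer setup at load time, assuming the standard `transformers` API (the printed IDs are there only to verify the claim above):

```python
from transformers import AutoTokenizer

tok = AutoTokenizer.from_pretrained("google/gemma-3-1b-it", use_fast=True)
tok.pad_token = tok.eos_token  # pad with EOS, as during training
# The card states eos_token_id was updated to 1 (Gemma's <eos>); check:
print(tok.eos_token_id, tok.pad_token_id)
```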
## Usage

Load the adapter with PEFT:
```python
from transformers import AutoTokenizer, AutoModelForCausalLM
from peft import PeftModel
import torch

# Load the base model and tokenizer, then attach the LoRA adapter.
base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it", torch_dtype=torch.bfloat16)
tok = AutoTokenizer.from_pretrained("google/gemma-3-1b-it", use_fast=True)
model = PeftModel.from_pretrained(base, "XAT928/gemma-3-1b-it-jp-lora-20250909")
model.eval()

# "Please answer the following question politely and concisely.
#  Q: What is the elevation of Mt. Fuji?"
prompt = "次の問いに丁寧で簡潔に答えてください。\n\nQ: 富士山の標高は?"
inputs = tok(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```
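Since google/gemma-3-1b-it is an instruction-tuned chat model, wrapping the prompt in the model's chat template may produce better-formatted outputs. A hedged variant, reusing `tok` and `model` from above (the card does not state whether the adapter was trained with this template):

```python
# Optional: use the chat template instead of a raw prompt.
messages = [{"role": "user", "content": "富士山の標高は?"}]  # "What is the elevation of Mt. Fuji?"
input_ids = tok.apply_chat_template(messages, add_generation_prompt=True, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(input_ids, max_new_tokens=128)
print(tok.decode(out[0], skip_special_tokens=True))
```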
## Training (facts)
- LR: 1.5e-4
- per_device_train_batch_size: 1
- grad_accumulation_steps: 32
- max_seq_len: 2048
- max_steps: 200
- optimizer: paged_adamw_8bit
- gradient_checkpointing: True
- seed: 42
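A minimal sketch of a matching trainer configuration, assuming Hugging Face `TrainingArguments` plus PEFT's `LoraConfig`; the LoRA rank, alpha, dropout, and target modules are not stated in this card and are placeholders:

```python
import torch
from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM, TrainingArguments

model = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it", torch_dtype=torch.bfloat16)

lora = LoraConfig(
    r=16, lora_alpha=32, lora_dropout=0.05,                   # assumed, not from the card
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],  # assumed, not from the card
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora)

args = TrainingArguments(
    output_dir="/content/out_stage1",
    learning_rate=1.5e-4,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=32,
    max_steps=200,
    optim="paged_adamw_8bit",  # requires bitsandbytes
    gradient_checkpointing=True,
    bf16=True,
    seed=42,
)
# max_seq_len=2048 is applied during dataset tokenization/packing
# (e.g. TRL's SFTConfig max_seq_length), not via TrainingArguments.
```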
## Limitations

- This repo contains the LoRA adapter only; use it together with google/gemma-3-1b-it (a merge sketch follows).
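If a standalone checkpoint is preferred, PEFT can merge the adapter weights into the base model; a minimal sketch (the output directory is hypothetical):

```python
import torch
from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-it", torch_dtype=torch.bfloat16)
merged = PeftModel.from_pretrained(base, "XAT928/gemma-3-1b-it-jp-lora-20250909").merge_and_unload()
merged.save_pretrained("./gemma-3-1b-it-jp-merged")  # hypothetical output directory
```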