Model Card: Gemma-3-1B Turkish CPT LoRA (1st Epoch – Stage 2, 50K–100K Subset)
Overview
This model is a LoRA-adapted Turkish Continued Pretraining (CPT) variant of Gemma-3-1B.
Unlike the full-parameter CPT models trained in other stages of this project, this model performs parameter-efficient adaptation using Low-Rank Adaptation (LoRA). The base model weights remain frozen and only the LoRA adapter parameters are trained.
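For reference, LoRA learns a low-rank update for each adapted weight matrix: the effective weight becomes W + (lora_alpha / r) · B A, where B and A are the trainable rank-r factor matrices, so only a small fraction of the parameters receives gradients.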
The model was trained on the second shard of the Turkish web corpus (samples 50,000–100,000).
Base model: canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1
Training method: LoRA-based continued pretraining
Dataset shard: samples 50K–100K
Objective: domain adaptation to Turkish web text
Training Setup
Base Model: canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1
Dataset: canbingol/vngrs-web-corpus-200k
Subset Used: Samples 50,000-100,000
Training Objective: Continued Pretraining
Data Regime: Plain text
Epochs: 1
Token Count: ~21.6M tokens
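As a sketch of how this shard can be reproduced with the datasets library (the "train" split name is an assumption, not stated on this card):

from datasets import load_dataset

# Load the full 200K-sample Turkish web corpus used for CPT
corpus = load_dataset("canbingol/vngrs-web-corpus-200k", split="train")  # split name assumed

# Select the second 50K shard (samples 50,000-100,000) used in this stage
shard = corpus.select(range(50_000, 100_000))
print(shard)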
LoRA Configuration
This model was trained using Low-Rank Adaptation (LoRA) with the following configuration:
r = 16
lora_alpha = 32
lora_dropout = 0.05
LoRA adapters were applied to the following transformer modules:
q_proj
k_proj
v_proj
o_proj
gate_proj
up_proj
down_proj
This configuration results in approximately 14.9M trainable parameters, roughly 1.5% of the base model's ~1B parameters.
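A minimal sketch of this adapter setup using peft.LoraConfig; the hyperparameters and target modules come from the list above, while task_type is an assumption (standard causal-LM adapter):

from peft import LoraConfig, get_peft_model
from transformers import AutoModelForCausalLM

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
    task_type="CAUSAL_LM",  # assumption: not stated on this card
)

model = AutoModelForCausalLM.from_pretrained("google/gemma-3-1b-pt")
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # reports the trainable vs. total parameter split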
Training Notes
Only LoRA adapter weights were updated during training.
The base model parameters remain unchanged.
This model is the second stage of the first-epoch LoRA-based Turkish CPT experiments for Gemma-3.
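If a standalone checkpoint is preferred over loading base plus adapter at inference time, the adapter can be merged into the base weights. A sketch using peft (the output path is illustrative):

from peft import PeftModel
from transformers import AutoModelForCausalLM

base = AutoModelForCausalLM.from_pretrained("canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1")
merged = PeftModel.from_pretrained(base, "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage2")
merged = merged.merge_and_unload()  # folds the LoRA deltas into the base weights
merged.save_pretrained("gemma3-1b-tr-cpt-stage2-merged")  # illustrative output path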
Usage Example
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# The stage-1 CPT model serves as the base; the stage-2 LoRA adapter is applied on top
base_model = "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1"
lora_model = "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage2"

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base weights in bfloat16, then attach the LoRA adapter
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16
)
model = PeftModel.from_pretrained(model, lora_model)
model = model.to(device)

# Turkish prompt ("from now on")
prompt = "bundan böyle"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

# Sampled generation; adjust temperature/top_p to taste
outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.9
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
Model tree for canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage2
Base model: google/gemma-3-1b-pt