Model Card: Gemma3-1B Turkish CPT LoRA (1st Epoch – Stage 2, 50K–100K Subset)

Overview

This model is a LoRA-adapted Turkish Continued Pretraining (CPT) variant of Gemma-3-1B.

Unlike the full-parameter CPT models trained in other stages of this project, this model performs parameter-efficient adaptation using Low-Rank Adaptation (LoRA). The base model weights remain frozen and only the LoRA adapter parameters are trained.
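The core idea can be sketched in a few lines of plain PyTorch (a minimal illustration only, not the actual training code; the layer sizes are arbitrary): the pretrained weight matrix W stays frozen, and only a low-rank update B·A, scaled by alpha/r, is trained.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """A frozen linear layer plus a trainable low-rank update (illustrative only)."""
    def __init__(self, base: nn.Linear, r: int = 16, alpha: int = 32):
        super().__init__()
        self.base = base
        self.base.weight.requires_grad_(False)  # base weights stay frozen
        if self.base.bias is not None:
            self.base.bias.requires_grad_(False)
        # Low-rank factors A (d_in -> r) and B (r -> d_out); only these train.
        self.lora_A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scaling = alpha / r

    def forward(self, x):
        # y = W x + (alpha / r) * B A x
        return self.base(x) + self.scaling * (x @ self.lora_A.T @ self.lora_B.T)

layer = LoRALinear(nn.Linear(64, 64), r=16, alpha=32)
y = layer(torch.randn(2, 64))
trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
print(tuple(y.shape), trainable)  # (2, 64) 2048
```

Because lora_B starts at zero, the adapted layer initially computes exactly the frozen base layer's output; training then moves only the 2·r·d adapter parameters.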

The model was trained on the second shard of the Turkish web corpus (samples 50,000–100,000).

Base model: canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1
Training method: LoRA-based continued pretraining
Dataset shard: samples 50,000–100,000
Objective: domain adaptation to Turkish web text


Training Setup

Base Model: canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1
Dataset: canbingol/vngrs-web-corpus-200k
Subset Used: Samples 50,000–100,000
Training Objective: Continued Pretraining
Data Regime: Plain text
Epochs: 1
Token Count: ~21.6M tokens
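For rough intuition, ~21.6M tokens over 50,000 documents works out to roughly 430 tokens per document (simple back-of-the-envelope arithmetic from the figures above, not a measured corpus statistic):

```python
# Back-of-the-envelope: average tokens per document in this shard.
total_tokens = 21_600_000   # ~21.6M tokens (from the card)
num_samples = 50_000        # samples 50,000-100,000
avg_tokens = total_tokens / num_samples
print(avg_tokens)  # 432.0
```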


LoRA Configuration

This model was trained using Low-Rank Adaptation (LoRA) with the following configuration.

r = 16
lora_alpha = 32
lora_dropout = 0.05

LoRA adapters were applied to the following transformer modules.

q_proj
k_proj
v_proj
o_proj
gate_proj
up_proj
down_proj

This configuration results in approximately 14.9M trainable parameters, which corresponds to roughly 0.48% of the full model parameters.
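The adapter size follows directly from the LoRA formula: each adapted weight matrix of shape d_out × d_in contributes r·(d_in + d_out) trainable parameters. A quick sketch (the 1024×1024 projection below is a hypothetical placeholder, not Gemma-3-1B's actual configuration):

```python
def lora_param_count(r: int, d_in: int, d_out: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix."""
    return r * (d_in + d_out)

# Hypothetical example: a square 1024x1024 projection with r=16.
print(lora_param_count(16, 1024, 1024))  # 32768

# Summing this over every targeted projection in every layer gives the
# total adapter size (~14.9M parameters here, per the card).
```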


Training Notes

Only LoRA adapter weights were updated during training.
The base model parameters remain unchanged.

This model represents the second stage of the LoRA-based Turkish CPT experiments for Gemma-3, continuing from the stage-1 adapter.


Usage Example

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

# Repository IDs: the stage-1 model serves as the base; the stage-2 LoRA
# adapter is loaded on top of it.
base_model = "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage1"
lora_model = "canbingol/gemma3_1B_base-tr-cpt-lora-1st_epoch_stage2"

device = "cuda" if torch.cuda.is_available() else "cpu"

tokenizer = AutoTokenizer.from_pretrained(base_model)

# Load the base model in bfloat16 to reduce memory use.
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    torch_dtype=torch.bfloat16
)

# Attach the LoRA adapter weights to the frozen base model.
model = PeftModel.from_pretrained(model, lora_model)
model = model.to(device)

prompt = "bundan böyle"  # Turkish: "from now on"
inputs = tokenizer(prompt, return_tensors="pt").to(device)

outputs = model.generate(
    **inputs,
    max_new_tokens=50,
    do_sample=True,
    temperature=0.8,
    top_p=0.9
)

generated_text = tokenizer.decode(outputs[0], skip_special_tokens=True)
print(generated_text)
```