Built with Axolotl

See axolotl config

axolotl version: 0.16.0.dev0

```yaml
# 1. Base Model & Tokenizer
base_model: google/gemma-2-2b-it  # Or your preferred 2B model
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
hub_model_id: AiAF/combined-70-30-rp-sft-qlora
hub_strategy: checkpoint
# 2. LoRA / QLoRA Configuration
load_in_4bit: true
adapter: qlora
lora_r: 64
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
# 3. Dataset Configuration
streaming: false
#streaming_multipack_buffer_size: 5000
#sample_packing: true
datasets:
  - path: .
    data_files: combined_70_30_shuffled.jsonl
    type: chat_template
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
    chat_template: jinja
    chat_template_jinja: |
      {{ bos_token }}
      {% for m in messages %}
        {% set role = 'model' if m['role']=='assistant' else 'user' %}
        {{ '<start_of_turn>' + role + '\n' + m['content'] | trim + '<end_of_turn>\n' }}
      {% endfor %}
      {% if add_generation_prompt %}
      {{ '<start_of_turn>model\n' }}
      {% endif %}
    roles_to_train: ["assistant"]
    train_on_eos: "turn"
# Small eval set (use a slice of your data)
test_datasets:
  - path: .
    name: json
    type: chat_template
    data_files: eval_1000.jsonl
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
    chat_template: jinja
    chat_template_jinja: |
      {{ bos_token }}
      {% for m in messages %}
        {% set role = 'model' if m['role']=='assistant' else 'user' %}
        {{ '<start_of_turn>' + role + '\n' + m['content'] | trim + '<end_of_turn>\n' }}
      {% endfor %}
      {% if add_generation_prompt %}
      {{ '<start_of_turn>model\n' }}
      {% endif %}
    roles_to_train: ["assistant"]
# 4. Training Parameters
sequence_len: 2048
sample_packing: true
eval_sample_packing: true
max_steps: 1500  # ~2-3 epochs on 12K samples
dataset_prepared_path: last_run_prepared
# 5. Saving and Evaluation Strategy
evaluation_strategy: steps
save_strategy: steps
eval_steps: 100
save_steps: 100
save_total_limit: 20
# 6. Output & Logging
output_dir: /workspace/data/axolotl-outputs/sft/combined-70-30-rp-sft-qlora
wandb_project: "rp-sft"
wandb_name: "combined-70-30-gemma-2b"
wandb_log_model: "false"
# 7. Batching & Optimizer
gradient_accumulation_steps: 4
micro_batch_size: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002
weight_decay: 0.0
# 8. Hardware & Performance
bf16: true
tf32: true
flash_attention: true
gradient_checkpointing: true
logging_steps: 1
# 9. Special Tokens
special_tokens:
  bos_token: "<bos>"
  eos_token: "<eos>"
  pad_token: "<pad>"
eot_tokens: ["<end_of_turn>"]
```
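The `chat_template_jinja` in the config maps `assistant` turns onto Gemma's `model` role and trims each message. One way to sanity-check such a template outside Axolotl is to render it directly with Jinja2. In this sketch the template string is the config's, collapsed to one line to avoid stray block whitespace, and `<bos>` stands in for the tokenizer's actual BOS token:

```python
from jinja2 import Template

# The config's chat template, joined into a single line so that the
# YAML block's indentation/newlines don't leak into the rendered text.
template_src = (
    "{{ bos_token }}"
    "{% for m in messages %}"
    "{% set role = 'model' if m['role'] == 'assistant' else 'user' %}"
    "{{ '<start_of_turn>' + role + '\\n' + m['content'] | trim + '<end_of_turn>\\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<start_of_turn>model\\n' }}{% endif %}"
)

messages = [
    {"role": "user", "content": "Hi there  "},   # trailing spaces get trimmed
    {"role": "assistant", "content": "Hello!"},
]

rendered = Template(template_src).render(
    bos_token="<bos>", messages=messages, add_generation_prompt=True
)
print(rendered)
```

The `| trim` filter binds to `m['content']` only (Jinja filters bind tighter than `+`), so each turn renders as `<start_of_turn>{role}\n{content}<end_of_turn>\n`, with a trailing `<start_of_turn>model\n` when a generation prompt is requested.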

# combined-70-30-rp-sft-qlora

This model is a QLoRA fine-tune of google/gemma-2-2b-it on the `combined_70_30_shuffled.jsonl` dataset described in the config above. It achieves the following results on the evaluation set:

  • Loss: 0.0001
  • Perplexity: 1.0001
  • Memory, max active (GiB): 12.39
  • Memory, max allocated (GiB): 12.39
  • Memory, device reserved (GiB): 20.44
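The reported perplexity is simply `exp(loss)`, so the two evaluation numbers above can be cross-checked in one line:

```python
import math

eval_loss = 0.0001                 # final validation loss from this card
perplexity = math.exp(eval_loss)   # ppl = e^loss for a causal LM
print(round(perplexity, 4))        # 1.0001
```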

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: adamw_bnb_8bit with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 1500

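The reported `total_train_batch_size` of 8 is the product of the per-device micro-batch size and the gradient-accumulation steps (assuming a single-GPU run, which matches the card's single set of memory figures):

```python
micro_batch_size = 2               # from the config
gradient_accumulation_steps = 4    # from the config
num_devices = 1                    # assumption: single GPU

# Effective batch size per optimizer step
total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)      # 8
```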
### Training results

| Training Loss | Epoch   | Step | Validation Loss | Perplexity | Max Active (GiB) | Max Allocated (GiB) | Reserved (GiB) |
|:-------------:|:-------:|:----:|:---------------:|:----------:|:----------------:|:-------------------:|:--------------:|
| No log        | 0       | 0    | 1.8564          | 6.4006     | 12.22            | 12.22               | 18.19          |
| 0.0957        | 4.7711  | 100  | 0.0876          | 1.0916     | 12.39            | 12.39               | 22.39          |
| 0.0084        | 9.5301  | 200  | 0.0062          | 1.0063     | 12.39            | 12.39               | 20.44          |
| 0.0116        | 14.2892 | 300  | 0.0033          | 1.0033     | 12.39            | 12.39               | 20.44          |
| 0.0011        | 19.0482 | 400  | 0.0023          | 1.0023     | 12.39            | 12.39               | 20.44          |
| 0.0016        | 23.8193 | 500  | 0.0038          | 1.0038     | 12.39            | 12.39               | 20.44          |
| 0.0008        | 28.5783 | 600  | 0.0015          | 1.0015     | 12.39            | 12.39               | 20.44          |
| 0.0004        | 33.3373 | 700  | 0.0005          | 1.0006     | 12.39            | 12.39               | 20.44          |
| 0.0003        | 38.0964 | 800  | 0.0003          | 1.0004     | 12.39            | 12.39               | 20.44          |
| 0.0004        | 42.8675 | 900  | 0.0002          | 1.0002     | 12.39            | 12.39               | 20.44          |
| 0.0002        | 47.6265 | 1000 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 52.3855 | 1100 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 57.1446 | 1200 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0002        | 61.9157 | 1300 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 66.6747 | 1400 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 71.4337 | 1500 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |

### Framework versions

  • PEFT 0.18.1
  • Transformers 5.3.0
  • Pytorch 2.9.1+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.2
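With these versions installed, a minimal sketch for loading the adapter at inference time might look as follows (this is a hypothetical usage example, not from the card; it downloads the base weights and the adapter from the Hub, and assumes the `dtype` keyword of Transformers 5.x):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

def load_adapter(adapter_id: str = "AiAF/combined-70-30-rp-sft-qlora"):
    """Load the gemma-2-2b-it base model and attach this card's QLoRA adapter."""
    base = AutoModelForCausalLM.from_pretrained(
        "google/gemma-2-2b-it", dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
    return model, tokenizer
```

Because the adapter was trained with a custom `chat_template_jinja`, prompts at inference time should follow the same `<start_of_turn>…<end_of_turn>` turn format shown in the config above.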