Built with Axolotl

See axolotl config

axolotl version: 0.16.0.dev0

```yaml
# 1. Base Model & Tokenizer
base_model: google/gemma-2-2b-it  # Or your preferred 2B model
model_type: AutoModelForCausalLM
tokenizer_type: AutoTokenizer
hub_model_id: AiAF/combined-70-30-rp-sft-qlora
hub_strategy: checkpoint
# 2. LoRA / QLoRA Configuration
load_in_4bit: true
adapter: qlora
lora_r: 64
lora_alpha: 128
lora_dropout: 0.05
lora_target_linear: true
# 3. Dataset Configuration
streaming: false
#streaming_multipack_buffer_size: 5000
#sample_packing: true
datasets:
  - path: .
    data_files: combined_70_30_shuffled.jsonl
    type: chat_template
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
    chat_template: jinja
    chat_template_jinja: |
      {{ bos_token }}
      {% for m in messages %}
        {% set role = 'model' if m['role']=='assistant' else 'user' %}
        {{ '<start_of_turn>' + role + '\n' + m['content'] | trim + '<end_of_turn>\n' }}
      {% endfor %}
      {% if add_generation_prompt %}
      {{ '<start_of_turn>model\n' }}
      {% endif %}
    roles_to_train: ["assistant"]
    train_on_eos: "turn"
# Small eval set (use a slice of your data)
test_datasets:
  - path: .
    name: json
    type: chat_template
    data_files: eval_1000.jsonl
    split: train
    field_messages: conversations
    message_property_mappings:
      role: from
      content: value
    chat_template: jinja
    chat_template_jinja: |
      {{ bos_token }}
      {% for m in messages %}
        {% set role = 'model' if m['role']=='assistant' else 'user' %}
        {{ '<start_of_turn>' + role + '\n' + m['content'] | trim + '<end_of_turn>\n' }}
      {% endfor %}
      {% if add_generation_prompt %}
      {{ '<start_of_turn>model\n' }}
      {% endif %}
    roles_to_train: ["assistant"]
# 4. Training Parameters
sequence_len: 2048
sample_packing: true
eval_sample_packing: true
max_steps: 1500  # ~2-3 epochs on 12K samples
dataset_prepared_path: last_run_prepared
# 5. Saving and Evaluation Strategy
evaluation_strategy: steps
save_strategy: steps
eval_steps: 100
save_steps: 100
save_total_limit: 20
# 6. Output & Logging
output_dir: /workspace/data/axolotl-outputs/sft/combined-70-30-rp-sft-qlora
wandb_project: "rp-sft"
wandb_name: "combined-70-30-gemma-2b"
wandb_log_model: "false"
# 7. Batching & Optimizer
gradient_accumulation_steps: 4
micro_batch_size: 2
optimizer: adamw_bnb_8bit
lr_scheduler: cosine
learning_rate: 0.0002
weight_decay: 0.0
# 8. Hardware & Performance
bf16: true
tf32: true
flash_attention: true
gradient_checkpointing: true
logging_steps: 1
# 9. Special Tokens
special_tokens:
  bos_token: "<bos>"
  eos_token: "<eos>"
  pad_token: "<pad>"
eot_tokens: ["<end_of_turn>"]
```
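The `chat_template_jinja` in the config maps `assistant` turns onto Gemma's `model` role and trims each message. One way to sanity-check such a template outside Axolotl is to render it directly with Jinja2. In this sketch the template string is the config's, collapsed to one line to avoid stray block whitespace, and `<bos>` stands in for the tokenizer's actual BOS token:

```python
from jinja2 import Template

# The config's chat template, joined into a single line so that the
# YAML block's indentation/newlines don't leak into the rendered text.
template_src = (
    "{{ bos_token }}"
    "{% for m in messages %}"
    "{% set role = 'model' if m['role'] == 'assistant' else 'user' %}"
    "{{ '<start_of_turn>' + role + '\\n' + m['content'] | trim + '<end_of_turn>\\n' }}"
    "{% endfor %}"
    "{% if add_generation_prompt %}{{ '<start_of_turn>model\\n' }}{% endif %}"
)

messages = [
    {"role": "user", "content": "Hi there  "},   # trailing spaces get trimmed
    {"role": "assistant", "content": "Hello!"},
]

rendered = Template(template_src).render(
    bos_token="<bos>", messages=messages, add_generation_prompt=True
)
print(rendered)
```

The `| trim` filter binds to `m['content']` only (Jinja filters bind tighter than `+`), so each turn renders as `<start_of_turn>{role}\n{content}<end_of_turn>\n`, with a trailing `<start_of_turn>model\n` when a generation prompt is requested.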

# combined-70-30-rp-sft-qlora

This model is a QLoRA fine-tune of google/gemma-2-2b-it on the `combined_70_30_shuffled.jsonl` dataset described in the config above. It achieves the following results on the evaluation set:

  • Loss: 0.0001
  • Perplexity: 1.0001
  • Memory, max active (GiB): 12.39
  • Memory, max allocated (GiB): 12.39
  • Memory, device reserved (GiB): 20.44
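The reported perplexity is simply `exp(loss)`, so the two evaluation numbers above can be cross-checked in one line:

```python
import math

eval_loss = 0.0001                 # final validation loss from this card
perplexity = math.exp(eval_loss)   # ppl = e^loss for a causal LM
print(round(perplexity, 4))        # 1.0001
```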

## Model description

More information needed

## Intended uses & limitations

More information needed

## Training and evaluation data

More information needed

## Training procedure

### Training hyperparameters

The following hyperparameters were used during training:

  • learning_rate: 0.0002
  • train_batch_size: 2
  • eval_batch_size: 2
  • seed: 42
  • gradient_accumulation_steps: 4
  • total_train_batch_size: 8
  • optimizer: adamw_bnb_8bit with betas=(0.9, 0.999) and epsilon=1e-08; no additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 1500

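The reported `total_train_batch_size` of 8 is the product of the per-device micro-batch size and the gradient-accumulation steps (assuming a single-GPU run, which matches the card's single set of memory figures):

```python
micro_batch_size = 2               # from the config
gradient_accumulation_steps = 4    # from the config
num_devices = 1                    # assumption: single GPU

# Effective batch size per optimizer step
total_train_batch_size = micro_batch_size * gradient_accumulation_steps * num_devices
print(total_train_batch_size)      # 8
```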
### Training results

| Training Loss | Epoch   | Step | Validation Loss | Perplexity | Max Active (GiB) | Max Allocated (GiB) | Reserved (GiB) |
|:-------------:|:-------:|:----:|:---------------:|:----------:|:----------------:|:-------------------:|:--------------:|
| No log        | 0       | 0    | 1.8564          | 6.4006     | 12.22            | 12.22               | 18.19          |
| 0.0957        | 4.7711  | 100  | 0.0876          | 1.0916     | 12.39            | 12.39               | 22.39          |
| 0.0084        | 9.5301  | 200  | 0.0062          | 1.0063     | 12.39            | 12.39               | 20.44          |
| 0.0116        | 14.2892 | 300  | 0.0033          | 1.0033     | 12.39            | 12.39               | 20.44          |
| 0.0011        | 19.0482 | 400  | 0.0023          | 1.0023     | 12.39            | 12.39               | 20.44          |
| 0.0016        | 23.8193 | 500  | 0.0038          | 1.0038     | 12.39            | 12.39               | 20.44          |
| 0.0008        | 28.5783 | 600  | 0.0015          | 1.0015     | 12.39            | 12.39               | 20.44          |
| 0.0004        | 33.3373 | 700  | 0.0005          | 1.0006     | 12.39            | 12.39               | 20.44          |
| 0.0003        | 38.0964 | 800  | 0.0003          | 1.0004     | 12.39            | 12.39               | 20.44          |
| 0.0004        | 42.8675 | 900  | 0.0002          | 1.0002     | 12.39            | 12.39               | 20.44          |
| 0.0002        | 47.6265 | 1000 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 52.3855 | 1100 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 57.1446 | 1200 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0002        | 61.9157 | 1300 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 66.6747 | 1400 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |
| 0.0001        | 71.4337 | 1500 | 0.0001          | 1.0001     | 12.39            | 12.39               | 20.44          |

### Framework versions

  • PEFT 0.18.1
  • Transformers 5.3.0
  • Pytorch 2.9.1+cu128
  • Datasets 4.5.0
  • Tokenizers 0.22.2
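With these versions installed, a minimal sketch for loading the adapter at inference time might look as follows (this is a hypothetical usage example, not from the card; it downloads the base weights and the adapter from the Hub, and assumes the `dtype` keyword of Transformers 5.x):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

def load_adapter(adapter_id: str = "AiAF/combined-70-30-rp-sft-qlora"):
    """Load the gemma-2-2b-it base model and attach this card's QLoRA adapter."""
    base = AutoModelForCausalLM.from_pretrained(
        "google/gemma-2-2b-it", dtype=torch.bfloat16, device_map="auto"
    )
    model = PeftModel.from_pretrained(base, adapter_id)
    tokenizer = AutoTokenizer.from_pretrained("google/gemma-2-2b-it")
    return model, tokenizer
```

Because the adapter was trained with a custom `chat_template_jinja`, prompts at inference time should follow the same `<start_of_turn>…<end_of_turn>` turn format shown in the config above.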