
OktoScript v1.1 Changelog

Release Date: November 2025
Status: 100% Backward Compatible with v1.0


🎉 New Features

1. LoRA Fine-Tuning Support

Added the FT_LORA block for efficient fine-tuning with Low-Rank Adaptation (LoRA) adapters.

Benefits:

  • ✅ Reduced memory footprint (up to 90% less VRAM)
  • ✅ Faster training times
  • ✅ Smaller model files (only adapter weights)
  • ✅ Easy to combine multiple LoRA adapters

Example:

# okto_version: "1.1"
FT_LORA {
    base_model: "oktoseek/base-llm-7b"
    lora_rank: 8
    lora_alpha: 32
    epochs: 5
    batch_size: 4
    learning_rate: 0.00003
    device: "cuda"
    target_modules: ["q_proj", "v_proj"]
}

See: examples/lora-finetuning.okt
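
To give a feel for why a low rank keeps adapter files small, here is a minimal Python sketch of the standard LoRA parameter arithmetic. It is not OktoScript's implementation. The 4096 x 4096 projection size is an assumption chosen for illustration (the changelog does not state the model's dimensions), and the helper names are hypothetical.

```python
# Illustrative arithmetic only; this is not how OktoScript itself is implemented.

def lora_added_params(d_out: int, d_in: int, rank: int) -> int:
    """Parameters in one LoRA adapter: B (d_out x rank) plus A (rank x d_in)."""
    return rank * (d_out + d_in)


def lora_scaling(alpha: float, rank: int) -> float:
    """Scale applied to the low-rank update in the standard LoRA formulation."""
    return alpha / rank


HIDDEN = 4096        # assumed projection size; not stated in the changelog
RANK, ALPHA = 8, 32  # lora_rank and lora_alpha from the FT_LORA example

full = HIDDEN * HIDDEN
added = lora_added_params(HIDDEN, HIDDEN, RANK)

print(f"full matrix params: {full}")
print(f"adapter params:     {added} ({added / full:.2%} of the full matrix)")
print(f"update scaling:     {lora_scaling(ALPHA, RANK)}")
```

Under these assumptions, each adapted projection trains 65,536 parameters instead of roughly 16.8 million, which is where the smaller files come from. Actual VRAM savings depend on the model, optimizer, and batch size.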


2. Dataset Mixing and Sampling

The DATASET block now supports mixing multiple datasets with weighted or random sampling.

New Fields:

  • mix_datasets: Array of {path, weight} objects
  • dataset_percent: Percentage of the data to use (1-100)
  • sampling: Sampling strategy, "weighted" or "random"
  • shuffle: Whether to shuffle the datasets before mixing (true or false)

Example:

DATASET {
    mix_datasets: [
        { path: "dataset/base.jsonl", weight: 70 },
        { path: "dataset/extra.jsonl", weight: 30 }
    ]
    dataset_percent: 80
    sampling: "weighted"
    shuffle: true
}

Benefits:

  • ✅ Combine multiple datasets intelligently
  • ✅ Control dataset proportions
  • ✅ Limit dataset size for faster iteration
  • ✅ Weighted or random sampling strategies

See: examples/dataset-mixing.okt
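
One plausible reading of weighted mixing, sketched in Python. The changelog does not specify the exact semantics, so several things here are assumptions: that weights are relative sampling probabilities, that dataset_percent scales the total number of records drawn, and that sampling is done with replacement. The function name mix_datasets_sketch is hypothetical.

```python
import random


def mix_datasets_sketch(sources, dataset_percent=100, seed=0):
    """Draw a weighted mix from several datasets.

    sources: list of (records, weight) pairs.
    Assumed semantics (not confirmed by the changelog): weights are relative
    sampling probabilities, dataset_percent scales the total record count,
    and records are drawn with replacement.
    """
    rng = random.Random(seed)
    pools = [records for records, _ in sources]
    weights = [weight for _, weight in sources]
    total = sum(len(pool) for pool in pools)
    target = total * dataset_percent // 100
    # Pick a source for every output slot, then a record from that source.
    picks = rng.choices(range(len(pools)), weights=weights, k=target)
    return [rng.choice(pools[i]) for i in picks]


base = [("base", i) for i in range(700)]
extra = [("extra", i) for i in range(300)]

mixed = mix_datasets_sketch([(base, 70), (extra, 30)], dataset_percent=80)
base_share = sum(1 for name, _ in mixed if name == "base") / len(mixed)
print(len(mixed), round(base_share, 2))
```

With 1,000 records in total and dataset_percent set to 80, the sketch draws 800 records, of which roughly 70% come from the first source.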


3. Advanced System Monitoring

Added MONITOR block for comprehensive system and training telemetry.

Features:

  • System metrics (GPU, CPU, RAM, temperature)
  • Training speed metrics (tokens/s, samples/s)
  • Real-time dashboard (optional)
  • Configurable refresh intervals
  • Export to JSON

Example:

MONITOR {
    level: "full"
    log_metrics: ["loss", "accuracy", "perplexity"]
    log_system: ["gpu_memory_used", "cpu_usage", "temperature"]
    log_speed: ["tokens_per_second", "samples_per_second"]
    refresh_interval: 2s
    export_to: "runs/logs/system.json"
    dashboard: true
}

Benefits:

  • ✅ Monitor system resources during training
  • ✅ Detect bottlenecks and optimize
  • ✅ Track training speed
  • ✅ Real-time visualization
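
As a rough illustration of the speed metrics named in log_speed, the Python sketch below derives tokens per second and samples per second from counts over a measurement window and serializes them as JSON. The layout of the real system.json file is not documented here, so the dictionary shape and the speed_metrics helper are assumptions made for illustration.

```python
import json


def speed_metrics(tokens: int, samples: int, elapsed_seconds: float) -> dict:
    """Throughput over one measurement window (hypothetical helper)."""
    if elapsed_seconds <= 0:
        raise ValueError("elapsed_seconds must be positive")
    return {
        "tokens_per_second": tokens / elapsed_seconds,
        "samples_per_second": samples / elapsed_seconds,
    }


# One window matching refresh_interval: 2s, with made-up counts.
snapshot = speed_metrics(tokens=8192, samples=4, elapsed_seconds=2.0)
print(json.dumps(snapshot, indent=2))
```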

4. Version Declaration

Added optional version declaration at the top of .okt files.

Syntax:

# okto_version: "1.1"
PROJECT "MyModel"
...

Rules:

  • Optional (defaults to v1.0 if missing)
  • Must appear at the top of the file (only other comment lines may precede it)
  • Format: # okto_version: "1.1" or # okto_version: "1.0"
  • Enables v1.1 features when set to "1.1"
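
The rules above can be sketched as a small Python function. This is an illustration of the stated rules, not the actual OktoScript parser: the regular expression, the detect_version name, and the decision to skip blank lines are all assumptions.

```python
import re

# Matches a line such as:  # okto_version: "1.1"
VERSION_RE = re.compile(r'^#\s*okto_version:\s*"(\d+\.\d+)"\s*$')


def detect_version(source: str, default: str = "1.0") -> str:
    """Return the declared version, or the default when none is declared."""
    for line in source.splitlines():
        stripped = line.strip()
        if not stripped:
            continue  # assumption: blank lines before the declaration are fine
        match = VERSION_RE.match(stripped)
        if match:
            return match.group(1)
        if not stripped.startswith("#"):
            break  # first non-comment line reached, so no declaration exists
    return default


print(detect_version('# okto_version: "1.1"\nPROJECT "MyModel"'))
print(detect_version('PROJECT "MyModel"'))
```

A declaration placed after the first non-comment line is ignored by this sketch, so such a file falls back to the v1.0 default.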

๐Ÿ“ New Optional Folders

v1.1 introduces optional folders for new features:

/runs/
  └── my-model/
      ├── logs/
      │   └── system.json      # MONITOR output
      └── lora/                 # LoRA adapters
          └── adapter.safetensors

Note: These folders are created automatically when v1.1 features are used. The existing v1.0 structure remains unchanged.


🔄 Backward Compatibility

100% Compatible with v1.0:

  • ✅ All v1.0 files work without modification
  • ✅ v1.0 syntax remains valid
  • ✅ No breaking changes
  • ✅ Default version is v1.0 (if version not specified)

Migration:

  • No migration required
  • Simply add # okto_version: "1.1" to use new features
  • Existing v1.0 files continue to work

📚 Documentation Updates


๐Ÿ› Bug Fixes

None (this is a feature release)


🔮 Future Roadmap

Planned for future versions:

  • Multi-GPU training support
  • Distributed training
  • Advanced quantization options
  • More dataset formats
  • Custom loss functions

For questions or feedback: GitHub Issues