ibitato/c64-ministral-3-14b-thinking-c64-reasoning-lora

Overview

This repository contains the LoRA adapter produced by fine-tuning a reasoning-capable Ministral 3 14B model on technical Commodore 64 material.

Objectives:

  • keep the reasoning behavior of the base model,
  • inject C64-specific technical knowledge,
  • support practical troubleshooting and low-level explanations (BASIC, KERNAL, memory map, VIC-II, SID, 6502/6510).

Project source code and pipeline:

Related repositories:

Base Model

  • Base model: mistralai/Ministral-3-14B-Reasoning-2512
  • Architecture: Mistral3ForConditionalGeneration (language component fine-tuned with LoRA)
  • Max context length: 262,144 tokens (from text_config.max_position_embeddings)

Training Data (project-local corpus)

  • DAPT: train=408, validation=27, test=45
  • SFT: train=1620, validation=204, test=190
  • Sources: curated Commodore 64 manuals and technical documents from this project.

Training Recipe

  • Pipeline: DAPT + SFT (LoRA)
  • Precision: bf16
  • Max sequence length: 2048
  • Batch size per device: 1
  • Gradient accumulation: 16
  • Learning rate: 2e-05
  • Epochs: 3.0
  • Warmup steps: 100
  • Logging/save/eval steps: 50/500/250
  • Optimizer: adamw_torch, eval strategy: steps
  • Gradient checkpointing: False
  • LoRA: r=16, alpha=32, dropout=0.05, scope=language_qkvo
  • SFT options: assistant_only_loss=True, packing=False
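The LoRA settings above imply a small trainable-parameter budget relative to full fine-tuning. As a rough, hedged illustration (the projection width below is a placeholder for a single attention projection, not the actual Ministral 3 14B dimension):

```python
# Hedged sketch: parameter overhead of a rank-16 LoRA adapter on one
# attention projection. The width is an assumed placeholder value.
hidden = 4096          # illustrative projection width
r, alpha = 16, 32

full_params = hidden * hidden          # dense weight W
lora_params = r * (hidden + hidden)    # low-rank factors A (r x d) and B (d x r)
scaling = alpha / r                    # scale applied to the B @ A update

print(f"full: {full_params:,}, lora: {lora_params:,}")   # → full: 16,777,216, lora: 131,072
print(f"ratio: {lora_params / full_params:.4%}, scaling: {scaling}")  # → ratio: 0.7812%, scaling: 2.0
```

With alpha=32 and r=16, the adapter update is scaled by 2.0, and each adapted projection trains well under 1% of the parameters of its dense counterpart.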

Run summary:

  • DAPT checkpoint: checkpoint-78
  • SFT checkpoint: checkpoint-306
  • DAPT global steps: 78 / 78
  • SFT global steps: 306 / 306
  • Last logged SFT step: 300
  • Last logged SFT loss: 0.0327
  • Last logged SFT token accuracy: 0.9903
  • Card generated at (UTC): 2026-03-02T16:47:33.221202+00:00
  • Source git revision: 13fafe7
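The reported global step counts are consistent with the dataset sizes and batch settings above. A quick back-of-the-envelope check, assuming steps per epoch = ceil(train_size / effective_batch):

```python
import math

# Effective batch = per-device batch (1) x gradient accumulation (16).
effective_batch = 1 * 16
epochs = 3

def total_steps(train_size: int) -> int:
    # Steps per epoch round up to cover the final partial batch.
    return math.ceil(train_size / effective_batch) * epochs

print(total_steps(408))   # DAPT: 408 train examples → 78
print(total_steps(1620))  # SFT: 1620 train examples → 306
```

Both match the logged DAPT (78/78) and SFT (306/306) global steps.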

Reasoning Validation Snapshot

  • Validation status: FAIL
  • Source artifacts: results/reasoning_validation/14b/20260302_152057
  • Note: contract/format retention passed; the failure is due to the strict exact-token determinism check (hash mismatch across repeated same-seed runs).

    Metric                        Value
    single_think_tag_rate         1.0000
    single_balanced_tag_rate      1.0000
    single_final_after_think_rate 1.0000
    multi_turn_retention_rate     1.0000
    format_contract_pass_rate     1.0000
    exact_hash_match_rate         0.3403
    semantic_similarity_avg       0.9956
    crash_or_timeout_rate         0.0000
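The exact_hash_match_rate compares repeated same-seed generations token-for-token. A minimal sketch of that kind of check, hashing the decoded output of each run (the project's actual validation harness may differ; the two generations below are hypothetical):

```python
import hashlib

def output_hash(text: str) -> str:
    # Hash the full decoded generation so two runs can be compared exactly.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Two hypothetical same-seed generations differing by a single word:
run_a = "<think>The SID has three voices.</think>The SID is the C64 sound chip."
run_b = "<think>The SID has three voices.</think>The SID is the C64 audio chip."

print(output_hash(run_a) == output_hash(run_b))  # → False (exact-token mismatch)
```

This is why a run can fail the determinism gate while semantic_similarity_avg stays near 1.0: one differing token flips the hash even when the answers are effectively equivalent.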

Usage (Transformers + PEFT)

import torch
from peft import PeftModel
from transformers import AutoTokenizer
from transformers.models.mistral3 import Mistral3ForConditionalGeneration

base_id = "mistralai/Ministral-3-14B-Reasoning-2512"
adapter_id = "ibitato/c64-ministral-3-14b-thinking-c64-reasoning-lora"

# Load the tokenizer and the bf16 base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = Mistral3ForConditionalGeneration.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Generate with the adapted model.
prompt = "Explain the C64 SID chip in one concise paragraph."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Limitations

  • This adapter is specialized for C64 technical assistance, not for broad benchmark optimization.
  • No additional safety fine-tuning was applied beyond the base model behavior.
  • Evaluation in this repo focuses on training diagnostics; downstream benchmarking should be done per target use case.