ibitato/c64-ministral-3-8b-thinking-c64-reasoning-lora

Overview

This repository contains the LoRA adapter produced by fine-tuning a reasoning-capable Ministral 3 8B model on technical Commodore 64 material.

Objectives:

  • keep the reasoning behavior of the base model,
  • inject C64-specific technical knowledge,
  • support practical troubleshooting and low-level explanations (BASIC, KERNAL, memory map, VIC-II, SID, 6502/6510).

Base Model

  • Base model: mistralai/Ministral-3-8B-Reasoning-2512
  • Architecture: Mistral3ForConditionalGeneration (language component fine-tuned with LoRA)
  • Max context length: 262,144 tokens (from text_config.max_position_embeddings)

Training Data (project-local corpus)

  • DAPT: train=408, validation=27, test=45
  • SFT: train=1620, validation=204, test=190
  • Sources: curated Commodore 64 manuals and technical documents from this project.
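The split sizes above can be sanity-checked with a deterministic shuffle-and-carve split. The helper below is a hypothetical sketch (function name, seed, and split order are assumptions, not the project's actual pipeline):

```python
import random

def split_corpus(docs, n_val, n_test, seed=42):
    # Hypothetical helper: shuffle deterministically, carve off the test
    # and validation slices, and keep the remainder as the training set.
    docs = list(docs)
    random.Random(seed).shuffle(docs)
    test = docs[:n_test]
    val = docs[n_test:n_test + n_val]
    train = docs[n_test + n_val:]
    return train, val, test

# DAPT corpus: 408 train + 27 validation + 45 test = 480 documents total.
train, val, test = split_corpus(range(480), n_val=27, n_test=45)
print(len(train), len(val), len(test))  # 408 27 45
```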

Training Recipe

  • Pipeline: DAPT + SFT (LoRA)
  • Precision: bf16
  • Max sequence length: 2048
  • Batch size per device: 2
  • Gradient accumulation: 16
  • Learning rate: 2e-05
  • Epochs: 3.0
  • Warmup steps: 100
  • Logging/save/eval steps: 50/500/250
  • Optimizer: adamw_torch, eval strategy: steps
  • Gradient checkpointing: False
  • LoRA: r=16, alpha=32, dropout=0.05, scope=language_qkvo
  • SFT options: assistant_only_loss=True, packing=False
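The LoRA hyperparameters above translate into a PEFT configuration roughly like the following sketch. Reading scope=language_qkvo as the attention q/k/v/o projections of the language component is an assumption, as are the exact target module names (they follow the standard Mistral attention layout):

```python
from peft import LoraConfig

# Sketch of the adapter configuration implied by the recipe above
# (r=16, alpha=32, dropout=0.05, attention projections only).
# Target module names are assumed, not taken from the project config.
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
```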

Run summary:

  • DAPT checkpoint: checkpoint-39
  • SFT checkpoint: checkpoint-153
  • DAPT global steps: 39 / 39
  • SFT global steps: 153 / 153
  • Last logged SFT step: 150
  • Last logged SFT loss: 0.0734
  • Last logged SFT token accuracy: 0.9801
  • Card generated at (UTC): 2026-03-02T16:47:20.093292+00:00
  • Source git revision: 13fafe7
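The global step counts above are consistent with the recipe: with a per-device batch of 2 and gradient accumulation of 16, the effective batch is 32 (assuming a single device), and the final partial batch rounds up each epoch:

```python
import math

def global_steps(train_size, per_device_batch=2, grad_accum=16, epochs=3):
    # Steps per epoch = ceil(train_size / effective_batch); the last
    # partial batch still counts as one optimizer step.
    effective_batch = per_device_batch * grad_accum  # 32
    return math.ceil(train_size / effective_batch) * epochs

print(global_steps(408))   # DAPT: 39
print(global_steps(1620))  # SFT: 153
```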

Reasoning Validation Snapshot

  • Validation status: PASS
  • Source artifacts: results/reasoning_validation/8b/20260302_151302
  Metric                         Value
  single_think_tag_rate          1.0000
  single_balanced_tag_rate       1.0000
  single_final_after_think_rate  1.0000
  multi_turn_retention_rate      1.0000
  format_contract_pass_rate      1.0000
  exact_hash_match_rate          1.0000
  semantic_similarity_avg        1.0000
  crash_or_timeout_rate          0.0000
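As an illustration of what the format metrics measure, the check below sketches single_think_tag_rate: the fraction of responses with exactly one balanced <think>...</think> block followed by a final answer. This is only a sketch; the actual harness lives in the results/reasoning_validation artifacts.

```python
import re

def has_single_balanced_think(text):
    # Exactly one opening and one closing tag, with non-whitespace
    # content (the final answer) after the closing tag.
    opens = len(re.findall(r"<think>", text))
    closes = len(re.findall(r"</think>", text))
    if opens != 1 or closes != 1:
        return False
    return re.search(r"<think>.*?</think>\s*\S", text, flags=re.S) is not None

responses = [
    "<think>SID is the sound chip.</think> The SID generates three voices.",
    "<think>unbalanced answer without a close tag",
]
rate = sum(has_single_balanced_think(r) for r in responses) / len(responses)
print(rate)  # 0.5
```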

Usage (Transformers + PEFT)

import torch
from peft import PeftModel
from transformers import AutoTokenizer
from transformers.models.mistral3 import Mistral3ForConditionalGeneration

base_id = "mistralai/Ministral-3-8B-Reasoning-2512"
adapter_id = "ibitato/c64-ministral-3-8b-thinking-c64-reasoning-lora"

tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = Mistral3ForConditionalGeneration.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",  # place the model on GPU when available, else CPU
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

prompt = "Explain the C64 SID chip in one concise paragraph."
inputs = tokenizer(prompt, return_tensors="pt").to(model.device)
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
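Per the validation snapshot above, the model wraps its reasoning in a single <think>...</think> block before the final answer. A small helper to keep only the answer from the decoded text (a sketch, assuming that output format):

```python
import re

def strip_think(text):
    # Drop the first balanced <think>...</think> block plus any
    # whitespace that follows it, keeping only the final answer.
    return re.sub(r"<think>.*?</think>\s*", "", text, count=1, flags=re.S)

sample = "<think>SID = MOS 6581 sound chip.</think>The SID is the C64's sound chip."
print(strip_think(sample))  # The SID is the C64's sound chip.
```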

Limitations

  • This adapter is specialized for C64 technical assistance, not for broad benchmark optimization.
  • No additional safety fine-tuning was applied beyond the base model behavior.
  • Evaluation in this repo focuses on training diagnostics; downstream benchmarking should be done per target use case.