ibitato/c64-ministral-3-14b-thinking-c64-reasoning-lora

Overview

This repository contains the LoRA adapter produced by fine-tuning a reasoning-capable Ministral 3 14B model on technical Commodore 64 material.

Objectives:

  • keep the reasoning behavior of the base model,
  • inject C64-specific technical knowledge,
  • support practical troubleshooting and low-level explanations (BASIC, KERNAL, memory map, VIC-II, SID, 6502/6510).

Project source code and pipeline:

Related repositories:

Base Model

  • Base model: mistralai/Ministral-3-14B-Reasoning-2512
  • Architecture: Mistral3ForConditionalGeneration (language component fine-tuned with LoRA)
  • Max context length: 262,144 tokens (from text_config.max_position_embeddings)

Training Data (project-local corpus)

  • DAPT: train=408, validation=27, test=45
  • SFT: train=1620, validation=204, test=190
  • Sources: curated Commodore 64 manuals and technical documents from this project.

Training Recipe

  • Pipeline: DAPT + SFT (LoRA)
  • Precision: bf16
  • Max sequence length: 2048
  • Batch size per device: 1
  • Gradient accumulation: 16
  • Learning rate: 2e-05
  • Epochs: 3.0
  • Warmup steps: 100
  • Logging/save/eval steps: 50/500/250
  • Optimizer: adamw_torch, eval strategy: steps
  • Gradient checkpointing: False
  • LoRA: r=16, alpha=32, dropout=0.05, scope=language_qkvo
  • SFT options: assistant_only_loss=True, packing=False
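The LoRA settings above imply a small trainable-parameter budget relative to full fine-tuning. As a rough, hedged illustration (the projection width below is a placeholder for a single attention projection, not the actual Ministral 3 14B dimension):

```python
# Hedged sketch: parameter overhead of a rank-16 LoRA adapter on one
# attention projection. The width is an assumed placeholder value.
hidden = 4096          # illustrative projection width
r, alpha = 16, 32

full_params = hidden * hidden          # dense weight W
lora_params = r * (hidden + hidden)    # low-rank factors A (r x d) and B (d x r)
scaling = alpha / r                    # scale applied to the B @ A update

print(f"full: {full_params:,}, lora: {lora_params:,}")   # → full: 16,777,216, lora: 131,072
print(f"ratio: {lora_params / full_params:.4%}, scaling: {scaling}")  # → ratio: 0.7812%, scaling: 2.0
```

With alpha=32 and r=16, the adapter update is scaled by 2.0, and each adapted projection trains well under 1% of the parameters of its dense counterpart.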

Run summary:

  • DAPT checkpoint: checkpoint-78
  • SFT checkpoint: checkpoint-306
  • DAPT global steps: 78 / 78
  • SFT global steps: 306 / 306
  • Last logged SFT step: 300
  • Last logged SFT loss: 0.0327
  • Last logged SFT token accuracy: 0.9903
  • Card generated at (UTC): 2026-03-02T16:47:33.221202+00:00
  • Source git revision: 13fafe7
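The reported global step counts are consistent with the dataset sizes and batch settings above. A quick back-of-the-envelope check, assuming steps per epoch = ceil(train_size / effective_batch):

```python
import math

# Effective batch = per-device batch (1) x gradient accumulation (16).
effective_batch = 1 * 16
epochs = 3

def total_steps(train_size: int) -> int:
    # Steps per epoch round up to cover the final partial batch.
    return math.ceil(train_size / effective_batch) * epochs

print(total_steps(408))   # DAPT: 408 train examples → 78
print(total_steps(1620))  # SFT: 1620 train examples → 306
```

Both match the logged DAPT (78/78) and SFT (306/306) global steps.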

Reasoning Validation Snapshot

  • Validation status: FAIL
  • Source artifacts: results/reasoning_validation/14b/20260302_152057
  • Note: contract/format retention passed; the failure is due to the strict exact-token determinism check (hash mismatch across repeated same-seed runs).

    Metric                        Value
    single_think_tag_rate         1.0000
    single_balanced_tag_rate      1.0000
    single_final_after_think_rate 1.0000
    multi_turn_retention_rate     1.0000
    format_contract_pass_rate     1.0000
    exact_hash_match_rate         0.3403
    semantic_similarity_avg       0.9956
    crash_or_timeout_rate         0.0000
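The exact_hash_match_rate compares repeated same-seed generations token-for-token. A minimal sketch of that kind of check, hashing the decoded output of each run (the project's actual validation harness may differ; the two generations below are hypothetical):

```python
import hashlib

def output_hash(text: str) -> str:
    # Hash the full decoded generation so two runs can be compared exactly.
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Two hypothetical same-seed generations differing by a single word:
run_a = "<think>The SID has three voices.</think>The SID is the C64 sound chip."
run_b = "<think>The SID has three voices.</think>The SID is the C64 audio chip."

print(output_hash(run_a) == output_hash(run_b))  # → False (exact-token mismatch)
```

This is why a run can fail the determinism gate while semantic_similarity_avg stays near 1.0: one differing token flips the hash even when the answers are effectively equivalent.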

Usage (Transformers + PEFT)

import torch
from peft import PeftModel
from transformers import AutoTokenizer
from transformers.models.mistral3 import Mistral3ForConditionalGeneration

base_id = "mistralai/Ministral-3-14B-Reasoning-2512"
adapter_id = "ibitato/c64-ministral-3-14b-thinking-c64-reasoning-lora"

# Load the tokenizer and the bf16 base model, then attach the LoRA adapter.
tokenizer = AutoTokenizer.from_pretrained(base_id, trust_remote_code=True)
base_model = Mistral3ForConditionalGeneration.from_pretrained(
    base_id,
    torch_dtype=torch.bfloat16,
    trust_remote_code=True,
)
model = PeftModel.from_pretrained(base_model, adapter_id)
model.eval()

# Generate with the adapted model.
prompt = "Explain the C64 SID chip in one concise paragraph."
inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))

Limitations

  • This adapter is specialized for C64 technical assistance, not for broad benchmark optimization.
  • No additional safety fine-tuning was applied beyond the base model behavior.
  • Evaluation in this repo focuses on training diagnostics; downstream benchmarking should be done per target use case.