IEC 62304 Compliance Support Model

A fine-tuned LoRA adapter for Phi-4-mini-instruct specialised in generating IEC 62304-compliant medical device software documentation.

Overview

This model assists with authoring regulatory documentation for medical device software under IEC 62304:2006+AMD1:2015. It handles three core tasks:

  • Prose generation - drafting sections of Software Development Plans (SDP), Software Requirements Specifications (SRS), Software Architecture Descriptions (SAD), FMEAs, threat models, and traceability matrices
  • Requirement expansion - expanding brief requirement statements into structured fields (description, rationale, priority, acceptance criterion, source)
  • Compliance review - evaluating documentation excerpts for correctness and issuing PASS/FAIL verdicts with explanations

The model is trained to be class-aware, applying different verification and documentation expectations for IEC 62304 Safety Classes A, B, and C.
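For illustration, a requirement-expansion exchange might look like the following. The input statement and all field values here are invented for this sketch; only the five field names come from the task description above (the dataset's exact schema is not published).

```python
# Hypothetical requirement-expansion example. The brief input and the
# expanded values are invented; the five field names mirror the task
# description in this card.
brief = "REQ-042: The pump shall stop infusion on occlusion."

expanded = {
    "description": (
        "On detection of a line occlusion, the infusion pump software "
        "shall command the pump motor to stop within 2 seconds and "
        "raise an audible alarm."
    ),
    "rationale": (
        "Continued infusion against an occlusion can cause a pressure "
        "bolus and patient harm on release."
    ),
    "priority": "High",
    "acceptance_criterion": (
        "With a simulated occlusion at maximum flow rate, motor stop "
        "is logged within 2 s and the alarm sounds."
    ),
    "source": "Risk control measure RC-17 (ISO 14971 risk analysis).",
}
```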

Model Details

| Parameter | Value |
|---|---|
| Base model | microsoft/Phi-4-mini-instruct (3.8B parameters) |
| Adapter | LoRA (r=32, alpha=32) |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj |
| Quantisation | 4-bit (QLoRA) during training; Q4_K_M GGUF for inference |
| Context window | 512 tokens (training); model supports up to 16K |
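The adapter settings above correspond to a standard PEFT LoRA configuration. A sketch of what lora_v3/adapter_config.json would contain follows; the key names are the usual PEFT conventions, but the dropout and bias values are assumptions, not read from the shipped file.

```python
import json

# Sketch of the LoRA adapter configuration implied by the table above.
# Dropout and bias values are assumptions, not taken from the repo.
adapter_config = {
    "peft_type": "LORA",
    "r": 32,
    "lora_alpha": 32,
    "target_modules": [
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    "lora_dropout": 0.0,   # assumption
    "bias": "none",        # assumption
    "base_model_name_or_path": "microsoft/Phi-4-mini-instruct",
}

# With r == lora_alpha, the LoRA scaling factor alpha/r is 1.0.
scaling = adapter_config["lora_alpha"] / adapter_config["r"]
print(json.dumps(adapter_config, indent=2))
```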

Training

Dataset

10,302 examples (9,268 train / 1,034 validation), stratified 90/10 split by task type and safety class.
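A stratified 90/10 split by task type and safety class can be sketched as below. This is an illustrative re-implementation, not the actual data-preparation script; the field names are assumptions.

```python
import random
from collections import defaultdict

def stratified_split(examples, val_frac=0.10, seed=0):
    """Split examples 90/10 while preserving the joint distribution
    of task type and safety class within each stratum (a sketch of
    the stratification described above; field names are assumed)."""
    rng = random.Random(seed)
    strata = defaultdict(list)
    for ex in examples:
        strata[(ex["task_type"], ex["safety_class"])].append(ex)

    train, val = [], []
    for group in strata.values():
        group = list(group)
        rng.shuffle(group)
        n_val = round(len(group) * val_frac)
        val.extend(group[:n_val])
        train.extend(group[n_val:])
    return train, val
```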

| Task Type | Count | Proportion |
|---|---|---|
| Prose generation | 4,800 | 46.6% |
| Compliance review (reflection) | 3,589 | 34.8% |
| Requirement expansion | 1,800 | 17.5% |
| Standards knowledge | 113 | 1.1% |

Safety class distribution: Class A 46.7%, Class B 30.6%, Class C 21.6%, General 1.1%

The dataset covers 120 fictional medical device projects spanning embedded firmware, mobile apps, desktop software, web applications, cloud SaaS, hybrid edge+cloud systems, IVD software, and AI/ML SaMD. Examples are parameterised from templates to ensure diversity while maintaining regulatory accuracy.

Quality controls applied during generation:

  • Zero instances of "Per IEC 62304" as a paragraph opening (anti-repetition)
  • Zero references to "IEC 62304-1" or other non-existent sub-parts
  • Zero instances of filler language ("as appropriate", "as needed") in model outputs
  • Class A examples explicitly state that unit and integration testing are not required
  • FAIL verdicts in compliance reviews correctly identify hallucinated standards, wrong-class content, and generic filler
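The phrase-level checks above can be expressed as a simple lint pass. The patterns below are an illustrative re-implementation; the actual generation pipeline's rules are not published.

```python
import re

# Illustrative lint pass mirroring the quality controls listed above.
# The regexes are assumptions, not the pipeline's actual patterns.
BANNED_PATTERNS = [
    r"^Per IEC 62304",      # repetitive paragraph opener
    r"IEC 62304-\d",        # non-existent sub-parts such as "IEC 62304-1"
    r"\bas appropriate\b",  # generic filler
    r"\bas needed\b",       # generic filler
]

def quality_violations(text: str) -> list[str]:
    """Return the patterns that a candidate training example trips."""
    return [
        pat for pat in BANNED_PATTERNS
        if re.search(pat, text, re.IGNORECASE | re.MULTILINE)
    ]
```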

Configuration

| Parameter | Value |
|---|---|
| Epochs | 7 |
| Batch size | 32 |
| Gradient accumulation | 2 (effective batch 64) |
| Learning rate | 2e-4 (cosine schedule) |
| Warmup steps | 30 |
| Optimiser | AdamW 8-bit |
| Weight decay | 0.01 |
| Sequence packing | Enabled (padding-free) |
| Precision | BF16 |
| Hardware | NVIDIA RTX A6000 (48 GB) |
| Training time | ~10 hours |
| Response-only training | Enabled (loss computed on assistant tokens only) |
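The same hyperparameters as a plain dict, with the effective batch size worked out (key names are illustrative, not the exact trainer field names):

```python
# Hyperparameters from the table above; key names are illustrative.
train_config = {
    "epochs": 7,
    "per_device_batch_size": 32,
    "gradient_accumulation_steps": 2,
    "learning_rate": 2e-4,
    "lr_schedule": "cosine",
    "warmup_steps": 30,
    "optimizer": "adamw_8bit",
    "weight_decay": 0.01,
    "bf16": True,
}

# Effective batch size = per-device batch * gradient accumulation steps.
effective_batch = (
    train_config["per_device_batch_size"]
    * train_config["gradient_accumulation_steps"]
)
print(effective_batch)  # 64, matching the table
```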

Loss Curve

| Epoch | Train Loss | Eval Loss |
|---|---|---|
| 1 | 0.448 | 0.407 |
| 2 | 0.128 | 0.148 |
| 3 | 0.075 | 0.092 |
| 4 | 0.051 | 0.069 |
| 5 | 0.042 | 0.061 |
| 6 | 0.033 | 0.058 |
| 7 | 0.027 | 0.058 |

Eval loss plateaued between epochs 6 and 7 (0.058 at both), suggesting seven epochs was an appropriate budget for this dataset size.
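The plateau is easy to confirm numerically (losses copied from the table):

```python
# Eval losses from the table above, and a simple plateau check:
# relative improvement between the last two epochs.
eval_loss = [0.407, 0.148, 0.092, 0.069, 0.061, 0.058, 0.058]

last_improvement = (eval_loss[-2] - eval_loss[-1]) / eval_loss[-2]
print(round(last_improvement, 3))  # 0.0 - no improvement after epoch 6
```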

Evaluation

Eight inference tests covering the model's target capabilities:

| # | Test | Result |
|---|---|---|
| 1 | Class A SDP - should not mandate unit/integration testing | FAIL |
| 2 | Class C SRS - safety requirements | PASS |
| 3 | Requirement expansion - structured field output | FAIL |
| 4 | Compliance review - detect Class A violation | FAIL |
| 5 | Class B SAD - architecture description | PASS |
| 6 | Class A testing - should not mandate unit/integration testing | FAIL |
| 7 | Standards knowledge - detect hallucinated "IEC 62304-2" | PASS |
| 8 | Standards knowledge - confirm correct standard citations | PASS |

4/8 tests passed.

Observations

Strengths:

  • Standards knowledge is solid. The model correctly identifies hallucinated standard references (e.g. "IEC 62304-2:2015") and validates correct citations (IEC 62304:2006+AMD1:2015, ISO 14971:2019, ISO 13485:2016).
  • Prose generation for Class B and C produces well-structured, relevant content with appropriate clause references.
  • Safety requirement generation for Class C is accurate and specific to the device context.

Known limitations:

  • The model does not reliably suppress unit and integration testing recommendations for Class A devices. IEC 62304 only requires system-level verification for Class A, but the base model's general software engineering knowledge tends to override the fine-tuned behaviour. This was expected to be the hardest failure mode: the model must learn to omit standard good practices when the regulatory framework does not require them.
  • Requirement expansion sometimes outputs prose paragraphs rather than the expected structured fields (Description, Rationale, Priority, Acceptance Criterion, Source).
  • The compliance review task does not reliably detect Class A violations where unit testing is incorrectly mandated.

These limitations reflect the challenge of overriding strong priors in the base model with domain-specific regulatory rules. For production use, Class A verification strategy sections should be reviewed by a regulatory specialist.
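As a reviewer aid for the Class A failure mode, generated text can be screened for language that mandates unit or integration testing. This is a minimal sketch, not part of the released model or tooling, and the regex is an assumption that will miss many phrasings:

```python
import re

# Flag Class A verification text that appears to mandate unit or
# integration testing (which IEC 62304 does not require for Class A).
# Intended as a reviewer aid, not an automated compliance gate.
MANDATE = re.compile(
    r"\b(shall|must|required to)\b[^.]*\b(unit|integration)\s+test",
    re.IGNORECASE,
)

def class_a_flags(text: str) -> bool:
    """Return True if the text seems to mandate unit/integration testing."""
    return bool(MANDATE.search(text))
```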

Files

lora_v3/                          # LoRA adapter (apply on top of base model)
  adapter_config.json
  adapter_model.safetensors
  tokenizer files

phi-4-mini-instruct.Q4_K_M.gguf  # Full merged model, quantised Q4_K_M for local inference

training_data_v3/                 # Training artifacts
  train.jsonl                     # 9,268 training examples
  val.jsonl                       # 1,034 validation examples
  training_summary.json           # Hyperparameters and loss curve
  trainer_state.json              # Full training log (every 10 steps)
  inference_report_v3.json        # Detailed evaluation results with model responses

Weights

The full unquantised base model is available at microsoft/Phi-4-mini-instruct on Hugging Face. To obtain the full fine-tuned model at original precision, load the base model and apply the LoRA adapter from lora_v3/. For a self-contained quantised model ready for local inference, use the GGUF file directly.

Usage

With Unsloth / Transformers

from unsloth import FastLanguageModel

model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="cpiuk/htech_compliance",
    adapter_name="lora_v3",
    max_seq_length=2048,
    load_in_4bit=True,
)
FastLanguageModel.for_inference(model)

messages = [
    {"role": "system", "content": "You are an IEC 62304 regulatory documentation expert."},
    {"role": "user", "content": "Write a Software Development Plan section on verification strategy for a Class B patient monitoring system."},
]

inputs = tokenizer.apply_chat_template(messages, tokenize=True, add_generation_prompt=True, return_tensors="pt").to("cuda")
outputs = model.generate(input_ids=inputs, max_new_tokens=512, temperature=0.3, top_p=0.9)
print(tokenizer.decode(outputs[0][inputs.shape[1]:], skip_special_tokens=True))

With GGUF (llama.cpp / Ollama)

Download phi-4-mini-instruct.Q4_K_M.gguf and use with any llama.cpp-compatible runtime:

ollama create iec62304 -f Modelfile
ollama run iec62304 "Write safety requirements for a Class C infusion pump controller."
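The Modelfile itself is not shown on this page; a minimal sketch, assuming the GGUF sits in the working directory (the sampling parameters and system prompt mirror the Transformers example below and are otherwise assumptions):

```
FROM ./phi-4-mini-instruct.Q4_K_M.gguf
PARAMETER temperature 0.3
PARAMETER top_p 0.9
SYSTEM "You are an IEC 62304 regulatory documentation expert."
```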

Citation

If referencing this work, please cite:

IEC 62304 Compliance Support Model
CPI (UK) - https://uk-cpi.com
Fine-tuned LoRA adapter for Phi-4-mini-instruct
Dataset: 10,302 IEC 62304 regulatory documentation examples

Disclaimer

This model is a development aid and does not replace qualified regulatory expertise. All generated documentation must be reviewed by appropriately qualified personnel before submission to regulatory bodies. The model may produce content that is incorrect, incomplete, or non-compliant. Users are responsible for verifying all outputs against the applicable version of IEC 62304 and relevant national regulations.
