# SFT English Medical Model - Qwen3-4B

## Overview
- Base Model: Qwen3-4B
- Training: DeepSpeed-Chat SFT with LoRA
- Dataset: UltraMedical (English; 9K train / 1K eval examples)
- Date: 2026-01-29
## Training Config
- LoRA dim: 64
- Learning rate: 2e-5
- Batch size: 2
- Gradient accumulation: 4
- ZeRO stage: 2
- Dtype: bf16
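For reference, the settings above can be collected into a plain Python dict (a summary of this card, not a runnable DeepSpeed config). The effective per-GPU batch size follows from the micro-batch size and gradient accumulation; the world size is not stated in this card, so only the per-GPU figure is shown:

```python
# Summary of the SFT hyperparameters listed above (names are illustrative).
config = {
    "lora_dim": 64,
    "learning_rate": 2e-5,
    "per_device_batch_size": 2,
    "gradient_accumulation_steps": 4,
    "zero_stage": 2,
    "dtype": "bf16",
}

# Effective batch size per GPU: micro-batch x gradient accumulation steps.
effective_batch_per_gpu = (
    config["per_device_batch_size"] * config["gradient_accumulation_steps"]
)
print(effective_batch_per_gpu)  # -> 8
```

Multiply by the number of GPUs to get the global effective batch size.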
## Results
- Final PPL: 2.498
- Final Loss: 0.915
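The two numbers are consistent with each other: perplexity is the exponential of the mean token cross-entropy loss, and exp(0.915) ≈ 2.497, which matches the reported 2.498 up to rounding of the loss:

```python
import math

final_loss = 0.915  # reported final eval loss (rounded to 3 decimals)
ppl = math.exp(final_loss)  # perplexity = exp(cross-entropy loss)
print(f"{ppl:.3f}")  # ~2.497, matching the reported PPL up to rounding
```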
## Directory
- `model/` - SFT model weights
- `data/` - Training data
- `scripts/` - Training scripts
- `code/` - Modified DeepSpeed-Chat code
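A minimal inference sketch, assuming `model/` holds a PEFT LoRA adapter and that the base checkpoint is `Qwen/Qwen3-4B` on the Hub (both are assumptions; this card does not state the exact layout):

```python
def load_sft_model(base_model="Qwen/Qwen3-4B", adapter_dir="model/"):
    """Load the base model and attach the SFT LoRA adapter.

    `base_model` and `adapter_dir` are assumptions; adjust to your paths.
    Imports are kept inside the function so the sketch can be read and
    imported without transformers/peft installed.
    """
    import torch
    from transformers import AutoModelForCausalLM, AutoTokenizer
    from peft import PeftModel

    tokenizer = AutoTokenizer.from_pretrained(base_model)
    model = AutoModelForCausalLM.from_pretrained(
        base_model,
        torch_dtype=torch.bfloat16,  # matches the bf16 training dtype
    )
    model = PeftModel.from_pretrained(model, adapter_dir)  # apply LoRA weights
    model.eval()
    return tokenizer, model
```

If the weights in `model/` are already merged full weights rather than an adapter, load them directly with `AutoModelForCausalLM.from_pretrained("model/")` instead.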