cat > README.md << 'README'
---
language: en
license: apache-2.0
tags:
- image-classification
- medical
- dermatology
- skin-disease
- ensemble
datasets:
- merolavtechnology/dermnet-skin40-cleaned-dataset
metrics:
- accuracy
- f1
---

# DermNet-Skin23: ConvNeXt-V1-XL @ 384

ConvNeXt-V1-XL fine-tuned on a 23-class consolidation of DermNet + Skin40, paired with [iamcode6/dermnet-skin23-eva02](https://huggingface.co/iamcode6/dermnet-skin23-eva02) for cross-architecture ensembling.

## Results

Single best (EMA): 80.47% acc / 0.7843 macro F1.

5-model cross-architecture ensemble (2× EVA-02-L + 3× ConvNeXt-V1-XL) with 4-aug TTA: **82.86% acc / 0.8113 macro F1**.

## Dataset

Source: merolavtechnology/dermnet-skin40-cleaned-dataset on Kaggle. The 40 fine-grained Skin40 categories were consolidated into 23 broader DermNet buckets. Final: 17,557 train / 3,856 test.

## Training

- Hardware: AMD Instinct MI300X (192 GB HBM3), ROCm 7.0
- Backbone: convnext_xlarge.fb_in22k_ft_in1k_384 (~350M params)
- 25 epochs, batch 64, AdamW, cosine LR with 10% warmup, peak LR=1.1e-4
- Mixup α=0.1 + Cutmix α=0.5 at prob=0.5; off in last 20% of epochs
- WeightedRandomSampler with effective-number weights
- EMA decay=0.999, SWA over last 20%, bf16 autocast

## Notes

ConvNeXt V2-Huge was tried first but is bf16-unstable on long runs (GRN issue), so V1-XL is the reliable choice. EMA decay of 0.9999 was too slow for a 25-epoch fine-tune from a fresh head; 0.999 fixes it.
README