cat > README.md << 'README'
  ---
  language: en
  license: apache-2.0
  tags:
  - image-classification
  - medical
  - dermatology
  - skin-disease
  - ensemble
  datasets:
  - merolavtechnology/dermnet-skin40-cleaned-dataset
  metrics:
  - accuracy
  - f1
  ---

  # DermNet-Skin23 — ConvNeXt-V1-XL @ 384

  ConvNeXt-V1-XL fine-tuned on a 23-class consolidation of DermNet + Skin40, paired with
  [iamcode6/dermnet-skin23-eva02](https://huggingface.co/iamcode6/dermnet-skin23-eva02) for cross-architecture ensembling.

  ## Results

  Best single model (EMA weights): 80.47% accuracy / 0.7843 macro F1.

  5-model cross-architecture ensemble (2× EVA-02-L + 3× ConvNeXt-V1-XL) with 4-augmentation test-time augmentation (TTA): **82.86% accuracy / 0.8113 macro F1**.
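
A minimal sketch of how an ensemble with TTA can be scored for one image, assuming the members' softmax probabilities are averaged with equal weight over every (model, view) pair. The card does not state the exact fusion rule, so the averaging scheme, the function names, and the toy logits are all illustrative assumptions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def ensemble_predict(logits_per_model):
    """Average class probabilities over every (model, TTA view) pair.

    logits_per_model[m][v] is the logit vector that model m produced
    for TTA view v of a single image. Equal weighting is an assumption.
    """
    probs = [softmax(view) for model in logits_per_model for view in model]
    n_classes = len(probs[0])
    mean = [sum(p[c] for p in probs) / len(probs) for c in range(n_classes)]
    best = max(range(n_classes), key=mean.__getitem__)
    return best, mean


# Toy input: 2 models x 2 views x 3 classes, with made-up logits.
toy_logits = [
    [[2.0, 0.5, 0.1], [1.5, 0.7, 0.2]],
    [[0.2, 1.0, 0.3], [1.8, 0.4, 0.1]],
]
label, mean_probs = ensemble_predict(toy_logits)
```

For the setup reported here, the outer list would hold 5 models and each inner list 4 views, with 23 logits per view.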

  ## Dataset

  Source: merolavtechnology/dermnet-skin40-cleaned-dataset on Kaggle. The 40 fine-grained Skin40 categories were consolidated into 23 broader DermNet buckets.
  Final split: 17,557 training / 3,856 test images.

  ## Training

  - Hardware: AMD Instinct MI300X (192 GB HBM3), ROCm 7.0
  - Backbone: convnext_xlarge.fb_in22k_ft_in1k_384 (~350M params)
  - 25 epochs, batch 64, AdamW, cosine LR with 10% warmup, peak LR=1.1e-4
  - Mixup α=0.1 + CutMix α=0.5 at prob=0.5; both disabled for the last 20% of epochs
  - WeightedRandomSampler with effective-number weights
  - EMA decay=0.999, SWA over last 20%, bf16 autocast
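
The schedule-related items above can be sketched as three small helpers. This is a sketch under stated assumptions, not the training script: the warmup is taken to be linear and per-step, the cosine is taken to decay to zero, and `beta=0.999` for the effective-number weights is a placeholder because the card does not give the value. All function names are hypothetical.

```python
import math


def lr_at_step(step, total_steps, peak_lr=1.1e-4, warmup_frac=0.10):
    """Linear warmup to peak_lr, then cosine decay to zero (assumed shape)."""
    warmup_steps = max(1, int(total_steps * warmup_frac))
    if step < warmup_steps:
        return peak_lr * (step + 1) / warmup_steps
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))


def mix_enabled(epoch, total_epochs=25, off_frac=0.20):
    """True while Mixup/CutMix is still applied (epoch is 0-indexed)."""
    off_epochs = int(round(total_epochs * off_frac))
    return epoch < total_epochs - off_epochs


def effective_number_weights(class_counts, beta=0.999):
    """Class weights proportional to (1 - beta) / (1 - beta**n).

    beta is an assumed value; the weights are rescaled to average 1.
    """
    raw = [(1.0 - beta) / (1.0 - beta ** n) for n in class_counts]
    scale = len(raw) / sum(raw)
    return [w * scale for w in raw]


# Hypothetical class counts, only to show that rarer classes weigh more.
weights = effective_number_weights([1000, 100])
```

With 25 epochs, `mix_enabled` returns `True` for epochs 0 to 19 and `False` for 20 to 24. For sampling, each training example would take the weight of its class before being passed to `torch.utils.data.WeightedRandomSampler`.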

  ## Notes

  ConvNeXt V2-Huge was tried first, but it was unstable under bf16 on long runs (attributed to its GRN, Global Response Normalization, layers), so V1-XL was the more reliable choice. An EMA decay of 0.9999 was too slow for a 25-epoch
  fine-tune that starts from a freshly initialized head; 0.999 fixes it.
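
A rough back-of-the-envelope check of the EMA observation, assuming one EMA update per optimizer step, no gradient accumulation, no decay warmup, and one pass over all 17,557 training images per epoch at batch size 64. These are assumptions about the run, and the helper name is hypothetical.

```python
import math

train_images, batch_size, epochs = 17_557, 64, 25
steps_per_epoch = math.ceil(train_images / batch_size)  # assumes drop_last=False
total_steps = steps_per_epoch * epochs


def init_weight_share(decay, steps):
    """Share of the EMA still contributed by its initial value after `steps` updates."""
    return decay ** steps


for decay in (0.9999, 0.999):
    horizon = 1.0 / (1.0 - decay)  # rough averaging window, in steps
    share = init_weight_share(decay, total_steps)
    print(f"decay={decay}: horizon ~{round(horizon)} steps, init share {share:.4f}")
```

Under these assumptions the run has 6,875 steps in total. With decay 0.9999 the averaging window (about 10,000 steps) is longer than the whole run, and roughly half of the final EMA weights still come from initialization, including the untrained head. With decay 0.999 the window is about 1,000 steps (under 4 epochs) and the initial contribution drops to about 0.1%.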
  README