AST Fine-Tuned on ESC-50 (Fold 1)

This checkpoint is an Audio Spectrogram Transformer (AST) audio classifier fine-tuned on ESC-50, starting from the base model MIT/ast-finetuned-audioset-10-10-0.4593. The base model's feature extractor is reused, and the classification head is replaced with one sized for the 50 ESC-50 classes.

Training Setup

  • Dataset: ESC-50
  • Split protocol: fold 1 for testing, fold 2 for validation, folds 3 to 5 for training
  • Base model: MIT/ast-finetuned-audioset-10-10-0.4593
  • Epochs: 30
  • Batch size: 32
  • Learning rate: 5e-5
  • Weight decay: 1e-4
  • Warmup ratio: 0.1
  • Seed: 42
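
The split and schedule above can be turned into concrete numbers with a small sketch. It relies on the standard ESC-50 layout (2000 clips in 5 predefined folds of 400 each). The step counts additionally assume a single device, no gradient accumulation, and that the last partial batch of each epoch is kept; the card does not state these, so treat the step figures as estimates. The helper names `split_folds` and `schedule` are illustrative only.

```python
import math

FOLDS = [1, 2, 3, 4, 5]
CLIPS_PER_FOLD = 400  # ESC-50: 2000 clips split into 5 predefined folds


def split_folds(test_fold=1, val_fold=2):
    """Assign each fold to train, validation, or test."""
    train = [f for f in FOLDS if f not in (test_fold, val_fold)]
    return {"train": train, "validation": [val_fold], "test": [test_fold]}


def schedule(num_train_clips, batch_size=32, epochs=30, warmup_ratio=0.1):
    """Estimate optimizer steps (assumes no gradient accumulation)."""
    steps_per_epoch = math.ceil(num_train_clips / batch_size)
    total_steps = steps_per_epoch * epochs
    warmup_steps = round(total_steps * warmup_ratio)
    return steps_per_epoch, total_steps, warmup_steps


splits = split_folds()
n_train = len(splits["train"]) * CLIPS_PER_FOLD
steps_per_epoch, total_steps, warmup_steps = schedule(n_train)

print(splits)  # {'train': [3, 4, 5], 'validation': [2], 'test': [1]}
print(n_train, steps_per_epoch, total_steps, warmup_steps)  # 1200 38 1140 114
```

Under these assumptions, the learning rate would ramp up over roughly the first 114 of 1140 steps, which is about the first three epochs.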

Results

  • Best validation accuracy: 0.9675 (epoch 7)
  • Test accuracy (fold 1): 0.9350
  • Test loss (fold 1): 0.2708
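
To put these accuracies in context, the following sketch converts them into clip counts and a rough uncertainty estimate. It assumes each accuracy was computed with one prediction per clip over all 400 clips of the corresponding ESC-50 fold, and it uses the normal approximation to the binomial for the standard error, which is only a rough guide at this sample size.

```python
import math

N_CLIPS = 400  # clips per ESC-50 fold


def correct_count(accuracy, n=N_CLIPS):
    """Number of correctly classified clips implied by an accuracy."""
    return round(accuracy * n)


def standard_error(accuracy, n=N_CLIPS):
    """Binomial standard error of an accuracy estimate (normal approximation)."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n)


val_acc = 0.9675   # fold 2, best epoch
test_acc = 0.9350  # fold 1

print(correct_count(val_acc))               # 387
print(correct_count(test_acc))              # 374
print(round(standard_error(test_acc), 4))   # 0.0123
```

A standard error of about 1.2 percentage points on the test fold means that differences of a point or two between runs or checkpoints are within noise for a single fold.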

Usage

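from transformers import AutoFeatureExtractor, ASTForAudioClassification

repo_id = "Adam-ousse/ast-esc50-finetuned-fold1"

# Load the feature extractor (waveform to log-mel spectrogram) and the fine-tuned model
feature_extractor = AutoFeatureExtractor.from_pretrained(repo_id)
model = ASTForAudioClassification.from_pretrained(repo_id)

# Switch to inference mode (disables dropout)
model.eval()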
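
Once the model is loaded, a forward pass might look like the sketch below. It is not taken from the original card: `top_k` and `classify` are hypothetical helper names, the waveform is assumed to be a mono 1-D float array already resampled to the rate the feature extractor expects (16 kHz for AST, whereas ESC-50 clips are distributed at 44.1 kHz), and the checkpoint's `config.id2label` is assumed to hold the ESC-50 class names. The logits in the demo at the bottom are made up purely to show the output shape.

```python
import math


def top_k(logits, id2label, k=3):
    """Apply a softmax to raw logits and return the k most probable labels."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # subtract max for stability
    total = sum(exps)
    probs = [e / total for e in exps]
    ranked = sorted(range(len(probs)), key=probs.__getitem__, reverse=True)[:k]
    return [(id2label[i], probs[i]) for i in ranked]


def classify(waveform, model, feature_extractor, k=3):
    """Classify one mono waveform sampled at feature_extractor.sampling_rate."""
    import torch  # imported lazily so top_k stays dependency-free

    inputs = feature_extractor(
        waveform,
        sampling_rate=feature_extractor.sampling_rate,
        return_tensors="pt",
    )
    with torch.no_grad():  # no gradients needed at inference time
        logits = model(**inputs).logits[0]
    return top_k(logits.tolist(), model.config.id2label, k)


# Demo with made-up logits and three ESC-50 class names
demo = top_k([0.2, 2.5, 1.1], {0: "dog", 1: "rain", 2: "clock_tick"}, k=2)
print([label for label, _ in demo])  # ['rain', 'clock_tick']
```

With the objects from the snippet above, `classify(waveform, model, feature_extractor)` would return a list of `(label, probability)` pairs.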

Notes

  • This checkpoint corresponds to one fold setup (test fold 1).
  • For publication-grade reporting, train and report all 5 folds and provide mean accuracy.
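
A full 5-fold report could be organized along the lines of the sketch below. Only the fold 1 configuration comes from this card; rotating the validation fold to "the fold after the test fold" is an assumption chosen so that fold 1 reproduces the setup above, and the helper names `fold_plan` and `summarize` are hypothetical. No accuracies for folds 2 to 5 are invented here, so the summary deliberately fails until all five are supplied.

```python
from statistics import mean, stdev

NUM_FOLDS = 5


def fold_plan(test_fold, num_folds=NUM_FOLDS):
    """Pick validation and training folds for a given test fold.

    Assumption: the validation fold is the one after the test fold,
    wrapping around, which matches the card's setup for test fold 1.
    """
    val_fold = test_fold % num_folds + 1
    train = [f for f in range(1, num_folds + 1) if f not in (test_fold, val_fold)]
    return {"test": test_fold, "validation": val_fold, "train": train}


def summarize(test_acc_by_fold, num_folds=NUM_FOLDS):
    """Return (mean, sample standard deviation) once every fold is reported."""
    missing = [f for f in range(1, num_folds + 1) if f not in test_acc_by_fold]
    if missing:
        raise ValueError(f"missing test accuracy for folds: {missing}")
    values = [test_acc_by_fold[f] for f in range(1, num_folds + 1)]
    return mean(values), stdev(values)


for k in range(1, NUM_FOLDS + 1):
    print(fold_plan(k))

known = {1: 0.9350}  # the only fold reported in this card
try:
    summarize(known)
except ValueError as err:
    print(err)  # missing test accuracy for folds: [2, 3, 4, 5]
```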

Model Details

  • Model size: 86.2M params
  • Tensor type: F32
  • Format: Safetensors