AST Fine-Tuned on ESC-50 (Fold 1)

This checkpoint is an Audio Spectrogram Transformer (AST) classifier fine-tuned on ESC-50. It starts from the base model MIT/ast-finetuned-audioset-10-10-0.4593: the pretrained feature extractor is reused, and the classification head is replaced to match the 50 ESC-50 classes.

Training Setup

Dataset: ESC-50
Split protocol: fold 1 for testing, fold 2 for validation, folds 3-5 for training
Base model: MIT/ast-finetuned-audioset-10-10-0.4593
Epochs: 30
Batch size: 32
Learning rate: 5e-5
Weight decay: 1e-4
Warmup ratio: 0.1
Seed: 42
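
The split protocol and warmup ratio above can be sketched in a few lines. This is only an illustration, not the training script used for this checkpoint: `split_by_fold` and `schedule_steps` are hypothetical helper names, the fold list is a synthetic stand-in for the `fold` column of the ESC-50 metadata, and the step counts assume one optimizer step per batch (no gradient accumulation, single device, last partial batch kept).

```python
import math

def split_by_fold(folds, test_fold=1, val_fold=2):
    """Partition clip indices into train/val/test by each clip's fold number."""
    split = {"train": [], "val": [], "test": []}
    for idx, fold in enumerate(folds):
        if fold == test_fold:
            split["test"].append(idx)
        elif fold == val_fold:
            split["val"].append(idx)
        else:
            split["train"].append(idx)
    return split

def schedule_steps(n_train, batch_size=32, epochs=30, warmup_ratio=0.1):
    """Return (total optimizer steps, warmup steps) under the stated assumptions."""
    steps_per_epoch = math.ceil(n_train / batch_size)
    total = steps_per_epoch * epochs
    return total, round(total * warmup_ratio)

# Synthetic stand-in: ESC-50 has 2000 clips in 5 folds of 400 clips each.
folds = [1 + (i % 5) for i in range(2000)]

split = split_by_fold(folds)
total_steps, warmup = schedule_steps(len(split["train"]))
print({name: len(ids) for name, ids in split.items()}, total_steps, warmup)
```

Under these assumptions the training set holds 1200 clips, which gives 38 steps per epoch, 1140 steps over 30 epochs, and roughly 114 warmup steps.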

Results

Best validation accuracy: 0.9675 (epoch 7)
Test accuracy (fold 1): 0.9350
Test loss (fold 1): 0.2708
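
To put the accuracies in terms of clips, the short sketch below converts them to counts. It assumes each fold was evaluated in full (400 clips, the standard ESC-50 fold size) with no clips dropped; `correct_count` is a hypothetical helper, not part of the original evaluation code.

```python
FOLD_SIZE = 400  # clips per ESC-50 fold (2000 clips / 5 folds)

def correct_count(accuracy, n_clips=FOLD_SIZE):
    """Number of correctly classified clips implied by an accuracy value."""
    return round(accuracy * n_clips)

val_correct = correct_count(0.9675)   # validation, fold 2
test_correct = correct_count(0.9350)  # test, fold 1

print(val_correct, FOLD_SIZE - val_correct)
print(test_correct, FOLD_SIZE - test_correct)
```

Under that assumption, the validation score corresponds to 387 of 400 clips correct and the test score to 374 of 400.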

Usage

from transformers import AutoFeatureExtractor, ASTForAudioClassification

# Load the feature extractor and the fine-tuned weights from the Hub.
repo_id = "Adam-ousse/ast-esc50-finetuned-fold1"
feature_extractor = AutoFeatureExtractor.from_pretrained(repo_id)
model = ASTForAudioClassification.from_pretrained(repo_id)
model.eval()  # inference mode: disables dropout
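
Once the model is loaded, a prediction for a single clip might look like the sketch below. Several points are assumptions rather than facts from this card: the input is a mono float waveform already resampled to 16 kHz (the rate the base AST feature extractor expects, while ESC-50 audio is distributed at 44.1 kHz), and `model.config.id2label` carries the ESC-50 class names. `top_prediction` and `classify` are hypothetical helper names.

```python
import math

REPO_ID = "Adam-ousse/ast-esc50-finetuned-fold1"

def top_prediction(logits, id2label):
    """Softmax over a flat list of logits; return (label, probability) of the top class."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]  # shift by the max for numerical stability
    best = max(range(len(logits)), key=lambda i: logits[i])
    return id2label[best], exps[best] / sum(exps)

def classify(waveform, sampling_rate=16000):
    """Classify one mono waveform (1-D float array). Downloads the checkpoint on first use."""
    # Heavy imports are kept local so the helper above stays usable without them.
    import torch
    from transformers import AutoFeatureExtractor, ASTForAudioClassification

    feature_extractor = AutoFeatureExtractor.from_pretrained(REPO_ID)
    model = ASTForAudioClassification.from_pretrained(REPO_ID)
    model.eval()

    # Convert the raw waveform into the spectrogram features the model consumes.
    inputs = feature_extractor(waveform, sampling_rate=sampling_rate, return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits  # shape: (1, num_labels)
    return top_prediction(logits[0].tolist(), model.config.id2label)
```

Calling `classify(waveform)` returns a `(label, probability)` pair. For batch use, load the extractor and model once outside the function instead of on every call.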

Notes

This checkpoint covers a single split of ESC-50's 5-fold cross-validation (fold 1 held out for testing).
For publication-grade results, repeat training with each of the 5 folds held out in turn and report the mean accuracy across folds.
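
A minimal sketch of that aggregation step follows. `summarize_folds` is a hypothetical helper, and reporting the standard deviation alongside the mean is an added suggestion, not something this card specifies. Only the fold 1 result is known here, so the usage example deliberately shows the incomplete case instead of inventing numbers for the other folds.

```python
from statistics import mean, stdev

def summarize_folds(fold_accuracies, n_folds=5):
    """Mean and sample standard deviation of test accuracy over all folds.

    fold_accuracies maps a held-out test fold number to its test accuracy.
    """
    missing = sorted(set(range(1, n_folds + 1)) - set(fold_accuracies))
    if missing:
        raise ValueError(f"missing results for folds: {missing}")
    values = [fold_accuracies[f] for f in range(1, n_folds + 1)]
    return {"mean": mean(values), "std": stdev(values)}

# With only this checkpoint's result, a cross-validated score cannot be computed yet.
try:
    summarize_folds({1: 0.9350})
except ValueError as err:
    print(err)  # missing results for folds: [2, 3, 4, 5]
```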

Checkpoint

Format: Safetensors
Model size: 86.2M params
Tensor type: F32
