SmolLM Torrent Metadata Extractor

Fine-tuned SmolLM-360M-Instruct for extracting structured metadata from torrent names.

Model Description

This model extracts title, artist, and year information from torrent filenames for audio and video content.

Input Format

<|im_start|>user
<|extract|>[content_type] torrent_name<|im_end|>
<|im_start|>assistant
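
The template above can be assembled programmatically. The following is a minimal sketch, not part of the released code: the function name `build_prompt` is hypothetical, and it assumes the prompt ends with a newline after the assistant tag, as in the llama.cpp usage example.

```python
def build_prompt(content_type: str, torrent_name: str) -> str:
    """Fill the chat template with a content type and a torrent name.

    Hypothetical helper: the name and signature are illustrative only.
    """
    return (
        "<|im_start|>user\n"
        f"<|extract|>[{content_type}] {torrent_name}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )


if __name__ == "__main__":
    print(build_prompt("video/movie", "The.Matrix.1999.1080p.BluRay.x264-GROUP"))
```

The content type goes inside the square brackets, immediately after the `<|extract|>` marker, and the torrent name follows after a single space.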

Content Types

audio/album → outputs: album | artist | year
audio/track → outputs: track | artist | year
video/movie → outputs: title | year
video/episode → outputs: series_title
video/season → outputs: series_title
video/series → outputs: series_title
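
Since each content type has a fixed field order, the pipe-separated output can be mapped back to named fields. The sketch below is one possible way to do this. `FIELDS` and `parse_output` are hypothetical names, and splitting from the right is an assumption, made so that a stray `|` inside a title stays with the first field.

```python
# Field order per content type, taken from the list above.
FIELDS = {
    "audio/album": ("album", "artist", "year"),
    "audio/track": ("track", "artist", "year"),
    "video/movie": ("title", "year"),
    "video/episode": ("series_title",),
    "video/season": ("series_title",),
    "video/series": ("series_title",),
}


def parse_output(content_type: str, text: str) -> dict:
    """Turn a raw model completion into a dict of named fields (hypothetical helper)."""
    fields = FIELDS[content_type]
    # Split from the right so any extra "|" remains in the first field.
    parts = [p.strip() for p in text.strip().rsplit("|", len(fields) - 1)]
    if len(parts) != len(fields):
        raise ValueError(f"expected {len(fields)} fields, got {len(parts)}: {text!r}")
    return dict(zip(fields, parts))


if __name__ == "__main__":
    print(parse_output("audio/album", "The Dark Side of the Moon | Pink Floyd | 1973"))
```

Years are kept as strings here. Converting them to integers is left to the caller, since the model may emit a malformed value.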

Examples

Input: <|extract|>[audio/album] Pink Floyd - The Dark Side of the Moon (1973) [FLAC]
Output: The Dark Side of the Moon | Pink Floyd | 1973

Input: <|extract|>[video/movie] The.Matrix.1999.1080p.BluRay.x264-GROUP
Output: The Matrix | 1999

Training

Base model: SmolLM-360M-Instruct
Method: LoRA fine-tuning (r=16, alpha=32)
Training data: ~200k samples from Spotify catalog validation
Checkpoint: 11500 steps

Files

File	Size	Description
`smollm-f32.gguf`	1.4GB	Full precision GGUF
`smollm-q4_k_m.gguf`	259MB	Q4_K_M quantized (recommended for inference)

Usage with llama.cpp

./llama-cli -m smollm-q4_k_m.gguf -p "<|im_start|>user
<|extract|>[audio/album] Radiohead - OK Computer (1997) [FLAC]<|im_end|>
<|im_start|>assistant
" -n 32
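
The same invocation can be driven from a script. This is a rough sketch using only the `-m`, `-p`, and `-n` flags from the command above. The names `build_command` and `run_extraction` are hypothetical, and the default binary path `./llama-cli` is an assumption about where llama.cpp was built. The returned stdout may contain text besides the completion, so it probably needs post-processing.

```python
import subprocess


def build_command(model_path, content_type, torrent_name,
                  max_tokens=32, binary="./llama-cli"):
    """Build the argv list for llama-cli (hypothetical helper)."""
    prompt = (
        "<|im_start|>user\n"
        f"<|extract|>[{content_type}] {torrent_name}<|im_end|>\n"
        "<|im_start|>assistant\n"
    )
    # Passing a list avoids shell quoting problems with brackets and pipes.
    return [binary, "-m", model_path, "-p", prompt, "-n", str(max_tokens)]


def run_extraction(model_path, content_type, torrent_name):
    """Run llama-cli and return its raw stdout; raises if the process fails."""
    result = subprocess.run(
        build_command(model_path, content_type, torrent_name),
        capture_output=True, text=True, check=True,
    )
    return result.stdout


if __name__ == "__main__":
    print(run_extraction("smollm-q4_k_m.gguf", "audio/album",
                         "Radiohead - OK Computer (1997) [FLAC]"))
```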

License

MIT
