Distilling 100B+ Models 40x Faster with TRL
TRL distillation for 100B+ teachers, 40x faster
pip install --upgrade trl
HuggingFaceM4/FineVision_full_shuffled