nanowhale-100m-base / configuration_deepseek_v4.py

Commit History

Upload SmolDeepSeek-V4 100M pretrained model (5000 steps on FineWeb-Edu)
6e9a78e
verified

cmpatino HF Staff committed on