ViL-DLM-0.6B / README.md

Commit History

Add timestep-aware sparse KD weighting
25e4efd

omar-ah committed on

Force cached KD masks during Stage 3b
9defebb

omar-ah committed on

Guarantee masked assistant tokens in diffusion training
02b453d

omar-ah committed on

Implement stage-aware real-run training pipeline
0d77b0a

omar-ah committed on

Add comprehensive README with architecture details and training recipe
1e0c38c
verified

omar-ah committed on