AI & ML interests
I like to fine-tune the small models of the Doge series.
Organizations
Article: Trainable Dynamic Mask Sparse Attention: Bridging Efficiency and Effectiveness in Long-Context Language Models