HypeNet

chen-yingfa's Collections

updated 11 days ago

Models for the paper "Hybrid Linear Attention Done Right: Efficient Distillation and Effective Architectures for Extremely Long Contexts".