SeerAttention-Qwen3-8B-AttnGates

Experimental SeerAttention AttnGates for Qwen3-8B, trained on the RedPajama-1T-Sample dataset.

Model tree for jiwonsong/SeerAttention-Qwen3-8B-AttnGates

Base model: Qwen/Qwen3-8B (this model is an adapter)