GQA?
#1
by ct-2 - opened
Hello, do the 7B, 13B, and 65B models use grouped query attention? Thanks!
No, we haven't used GQA.
ChloeAuYeung changed discussion status to closed
I see, thank you for the reply!