Amazing, some feedback
I always thought MoE models were worse than dense models, but I decided to try this one anyway and it's really good. I guess I never really understood how they work.
I tried several different quants. The ones below iq4ks aren't good, but iq4ks and higher are.
Idk how you do it, but your models work best for me out of all the ones I've tried. I'll leave some feedback for all of them here, to the best of my ability at least. If I'd kept notes I'm sure it would be better... but here it is off the top of my head:
GLM steam 1 iq4ks (limited testing)
Pros- follows prompts/character card/lore books amazingly, remembers small details, coherent, very creative, good response format
Cons- lots of LLM slop, likes to repeat what I say, though if I edit it out enough it stops repeating me.
Anubis 1.2 q4km (moderate testing)
Pros- better roleplay vs Anubis 1.1. Follows lorebooks/character cards/prompts (not as well as GLM), decent speech/narrative balance, doesn't speak for me
Cons- tends to have shorter replies. I do prefer shorter replies (~300 tokens) over storybook replies, but these are a little too short.
Cydonia 4.3 q8 (extensive testing)
Pros - better roleplay than any other 24b I've tried, to the point that any other 24b is unusable. Doesn't speak for me, remembers small details, great speech/narrative balance, reads between the lines to know my intentions.
Cons- shies away from intimacy, steers the story toward a good and wholesome one, doesn't seem to follow lorebooks if they are nsfw.
All in all, the larger the model, the more personality and realism it has, but of course the speed and context I can handle go down. My PC has 48 GB VRAM and 96 GB RAM.
Hope this helps in some way.
Anyway, keep up the good work.