Game model

#1
by ncsta - opened
This comment has been hidden (marked as Resolved)

Sure, this sounds interesting. Can you elaborate a little more on what you're doing?
What parameter count are you targeting for the 15 MB model?
How does the character architecture work? Are the character cards/system prompts baked into the model, or are they injected during inference?
I'm also curious what inference speed you're getting on CPU, and which device you ran it on.
I have a private NPC eval dataset that I would be willing to test with, if you are interested in seeing the model's performance beyond TinyStories.

This comment has been hidden (marked as Resolved)

Thanks for the update; this sounds pretty interesting, especially the real-time CPU inference aspect. I would be happy to test it out once it's further along.
A couple of questions:
Are the 13,000 characters the full training dataset?
What is the context length you are targeting?
Again, are the character cards baked into the weights or injected during inference?

This comment has been hidden (marked as Resolved)

Thank you for your interest and response! We would like to see what the final ternary Mamba model looks like, and we would be interested in testing it.
As for PIPPA, we used it to train our Gemma3NPC models as well, and I think the PIPPA license (Apache 2.0) allows commercial use.
Is there any other way we could communicate besides here?

This comment has been hidden (marked as Resolved)
ncsta changed discussion title from Ternary Mumba to Game model
ncsta changed discussion status to closed
chimbiwide changed discussion status to open

Sorry, can you please leave your email again? I did not get a chance to save it.
