Very tiny chat models
The models in the LH-Tech AI Pin Series are very small models that were trained on starhopp3r/TinyChat.
| Model | Parameters | Training iters | Final Train Loss | Quality | Example Chat |
|---|---|---|---|---|---|
| Pin-5M | 5.37M | 1000 | 3.170788 | Very Poor | Yes, a bright day is shining and makes everything have a good day a lot. |
| Pin-10M | 10.06M | 1500 | 2.562048 | Very Poor | That sounds nice, I agree, it is nice to talk about new ideas. |
| Pin-15M | 14.84M | 1500 | 2.358367 | Low | It is hard to see your plans when you want to enjoy the day. |
| Pin-20M | 21.03M | 1500 | 2.217588 | Medium | Yes, sunny days are wonderful! I love hearing about the sunshine and the sun's shining on. |
| Pin-25M | 26.76M | 1500 | 2.139837 | Medium | Sunny days make everything look brighter, especially with a nice friend who cares. |
| Pin-Ultra-25M | 26.76M | 8000 | ... | ... | Coming soon... |
* All models were prompted with "What is the weather like today?".
We recommend using Pin-Ultra-25M.
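To make the loss column easier to compare, the final train losses can be converted to perplexity. This is a minimal sketch that assumes the reported loss is the mean cross-entropy per token in nats (the usual convention, but not confirmed by the training code); `perplexity` is a hypothetical helper, and Pin-Ultra-25M is left out because its loss is not reported yet.

```python
import math

# Final train losses copied from the table above.
FINAL_TRAIN_LOSS = {
    "Pin-5M": 3.170788,
    "Pin-10M": 2.562048,
    "Pin-15M": 2.358367,
    "Pin-20M": 2.217588,
    "Pin-25M": 2.139837,
}


def perplexity(loss_nats: float) -> float:
    """Convert a mean cross-entropy loss (nats per token) to perplexity.

    Assumption: the loss in the table is a natural-log cross-entropy.
    """
    return math.exp(loss_nats)


for name, loss in FINAL_TRAIN_LOSS.items():
    print(f"{name}: loss {loss:.3f} -> perplexity {perplexity(loss):.1f}")
```

Under that assumption, perplexity falls from roughly 24 for Pin-5M to roughly 8.5 for Pin-25M. These are train-set numbers, so they only describe how well each model fits TinyChat itself.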
We trained on starhopp3r/TinyChat and used the gpt-2 tokenizer.
You can find the full training code for the Pin Model Series in this repo.
Tip: If you want to train one of these models yourself, make sure to adjust the model config like this:
| Model | n_layer | n_head | n_embd | n_inner |
|---|---|---|---|---|
| Pin-5M | 4 | 8 | 96 | 384 |
| Pin-10M | 6 | 8 | 160 | 640 |
| Pin-15M | 8 | 8 | 208 | 832 |
| Pin-20M | 10 | 8 | 256 | 1024 |
| Pin-25M | 12 | 12 | 288 | 1152 |
| Pin-Ultra-25M | 12 | 12 | 288 | 1152 |
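As a sanity check on the config table, the parameter counts can be reproduced with plain arithmetic. This is a minimal sketch, not the training code: it assumes a standard GPT-2 layout (tied input and output embeddings, learned position embeddings, biases on all linear layers), the GPT-2 vocabulary of 50,257 tokens, and a context length of 1,024. The context length and the layout are assumptions, and `count_gpt2_params` is a hypothetical helper.

```python
# Assumptions (not confirmed by the training code): standard GPT-2 block layout,
# tied token embedding / LM head, and 1024 learned position embeddings.
GPT2_VOCAB_SIZE = 50257
N_POSITIONS = 1024

# Values copied from the config table above.
PIN_CONFIGS = {
    "Pin-5M": dict(n_layer=4, n_head=8, n_embd=96, n_inner=384),
    "Pin-10M": dict(n_layer=6, n_head=8, n_embd=160, n_inner=640),
    "Pin-15M": dict(n_layer=8, n_head=8, n_embd=208, n_inner=832),
    "Pin-20M": dict(n_layer=10, n_head=8, n_embd=256, n_inner=1024),
    "Pin-25M": dict(n_layer=12, n_head=12, n_embd=288, n_inner=1152),
    "Pin-Ultra-25M": dict(n_layer=12, n_head=12, n_embd=288, n_inner=1152),
}


def count_gpt2_params(n_layer, n_head, n_embd, n_inner,
                      vocab_size=GPT2_VOCAB_SIZE, n_positions=N_POSITIONS):
    """Count trainable parameters of a GPT-2 style model with tied embeddings."""
    if n_embd % n_head != 0:
        raise ValueError("n_embd must be divisible by n_head")
    d = n_embd
    # Token and position embeddings (the LM head reuses the token embedding).
    embeddings = vocab_size * d + n_positions * d
    # Fused QKV projection plus output projection, each with a bias.
    attention = (d * 3 * d + 3 * d) + (d * d + d)
    # Two-layer MLP: d -> n_inner -> d, each with a bias.
    mlp = (d * n_inner + n_inner) + (n_inner * d + d)
    # Two LayerNorms per block, each with a weight and a bias vector.
    layer_norms = 2 * 2 * d
    final_layer_norm = 2 * d
    return embeddings + n_layer * (attention + mlp + layer_norms) + final_layer_norm


if __name__ == "__main__":
    for name, cfg in PIN_CONFIGS.items():
        print(f"{name}: {count_gpt2_params(**cfg) / 1e6:.2f}M parameters")
```

Under these assumptions the sketch gives 5.37M, 10.06M, 14.84M, 21.03M, and 26.76M, which agrees with the Parameters column in the first table. Most of each model's budget sits in the token embedding, which is why the smallest models gain little from extra layers alone.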
Have fun :D
We trained all these models in ~30 minutes on a single T4 GPU in a Kaggle session.
So you can easily recreate all of the Pin models without having to launch an 8xH100 cluster.
You can easily use your favorite model of the Pin series. To do so, adjust these two lines in use.py from this repo:
`user_query = "What is the weather like today?"  # insert your prompt here`
`answer = run_pin_inference(user_query, model_id="LH-Tech-AI/Pin", subfolder="Pin-25M")  # use your favorite model here, e.g. "Pin-25M" or "Pin-15M"`
...to: