# Warning: This model, like its predecessor, can be rather unpredictable and may output undesired content.

This model uses all of the same data as the original Dendrite, but I took it over to RunPod, where I could give it a much deeper and higher-quality LoRA session. That allowed it to regain overall coherence without the need for a merge.

I highly recommend that you have EOS tokens unbanned when using this model. If it fails to emit an EOS token, it will just start repeating itself.

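As a concrete illustration, here is a minimal sketch of sampler settings with EOS left unbanned. The setting names follow text-generation-webui-style backends and are an assumption on my part; the exact names vary by frontend.

```python
# Hypothetical sampler settings for a text-generation-webui-style backend.
# The key point: leave the EOS token unbanned so the model can stop cleanly.
generation_settings = {
    "max_new_tokens": 512,
    "temperature": 0.7,
    "ban_eos_token": False,     # setting name is an assumption; varies by frontend
    "repetition_penalty": 1.1,  # a mild penalty also helps if it does start looping
}
print(generation_settings["ban_eos_token"])  # False
```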
## To recap:

### Dendrite is an amalgamation of Llama-2-chat-13B and Enterredaas-33B (both fantastic models that you should check out in and of themselves)

https://huggingface.co/Aeala/Enterredaas-33b

using chargoddard's frankenllama block-diagonal merge script.

https://huggingface.co/chargoddard/llama2-22b

So all credit where it's due.

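The actual frankenllama script lives at the link above. Purely as a toy illustration of what "block-diagonal" means here (pure Python, illustrative shapes, not the real merge), two weight matrices are placed on the diagonal of a larger zero matrix, so each model's heads act on their own slice of the widened hidden state:

```python
def block_diagonal(a, b):
    """Toy block-diagonal combination of two matrices (lists of lists):
    `a` occupies the top-left block, `b` the bottom-right, zeros elsewhere.
    Illustrates only the shape of the merge -- not the real script."""
    rows_a, cols_a = len(a), len(a[0])
    rows_b, cols_b = len(b), len(b[0])
    out = [[0.0] * (cols_a + cols_b) for _ in range(rows_a + rows_b)]
    for i in range(rows_a):
        for j in range(cols_a):
            out[i][j] = a[i][j]
    for i in range(rows_b):
        for j in range(cols_b):
            out[rows_a + i][cols_a + j] = b[i][j]
    return out

# A 2x2 "13B" block and a 1x1 "33B" block -> a 3x3 merged matrix.
merged = block_diagonal([[1.0, 2.0], [3.0, 4.0]], [[5.0]])
print(merged)  # [[1.0, 2.0, 0.0], [3.0, 4.0, 0.0], [0.0, 0.0, 5.0]]
```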
### The block-diagonal merge script was used to graft attention heads from Enterredaas-33B onto Llama-2-chat-13B, upping its parameter count to 22B.

### Upon testing I found the results surprisingly coherent, although there were gaps in its ability to respond to lengthy context at all (it would simply spam \n once the context reached a certain length).

### I used a private dataset that I constructed for previous unreleased experiments to fill in the gaps that were caused by the merge.

### The model is very good at philosophical debate.

Sometimes it needs to be "woken up" at the start of a conversation by asking for self-reflection, e.g. "Tell me a joke only an AI language model would understand." After that it is ready for some very cerebral conversations about the nature of existence itself.

I personally use it with a modified Llama-2-chat prompt format for SillyTavern/Simple-proxy, but it's fairly adaptable with regard to prompt format, so I would definitely encourage experimentation.
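For anyone unfamiliar with the base format being modified, the stock Llama-2-chat prompt layout looks like the sketch below. The specific modifications used here are not spelled out, so treat this as a starting point for your own experimentation:

```python
def llama2_chat_prompt(system: str, user: str) -> str:
    """Build a single-turn prompt in the stock Llama-2-chat layout."""
    return (
        "[INST] <<SYS>>\n"
        f"{system}\n"
        "<</SYS>>\n\n"
        f"{user} [/INST]"
    )

print(llama2_chat_prompt(
    "You are a thoughtful conversational partner.",
    "Tell me a joke only an AI language model would understand.",
))
```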