General
This is a variation of the SHARE-14B-Base model in which the context length was extended to 8192 tokens using 5000 documents. For all relevant details on the model, please check the SHARE-14B-Base model card: https://huggingface.co/Joaoffg/SHARE-14B-Base-2604
Recommended use
We recommend using this version of the model exclusively for tasks that require a longer context length. Because this context expansion was applied to an intermediate checkpoint, we did not spend enough compute to achieve performance comparable to the 4096-context model. We therefore advise using the latter for most use cases.
Once model training is finalized, more robust long context versions of the model will be made available.
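The recommendation above can be sketched as a simple routing rule: use the base model unless the prompt actually needs the extended window. The base repo id below comes from the model card; the id for this long-context variant is a placeholder assumption, not a real repository.

```python
# Hypothetical helper illustrating the recommendation above.
# BASE_MODEL is taken from the model card; LONG_CTX_MODEL is a
# placeholder id for this 8192-token variant, not a real repo.

BASE_MODEL = "Joaoffg/SHARE-14B-Base-2604"   # 4096-token context model
LONG_CTX_MODEL = "Joaoffg/SHARE-14B-Long"    # assumed id for this variant

def pick_checkpoint(prompt_tokens: int) -> str:
    """Return the repo id appropriate for a prompt of the given token count."""
    if prompt_tokens <= 4096:
        return BASE_MODEL        # recommended for most use cases
    if prompt_tokens <= 8192:
        return LONG_CTX_MODEL    # only when the longer window is required
    raise ValueError("Prompt exceeds the 8192-token context window")
```

Either repo id can then be passed to your usual loading code (e.g. `AutoModelForCausalLM.from_pretrained`).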