General
This is a variation of the SHARE-14B-Base model in which the context length was extended to 8192 tokens using 5000 documents. For all relevant details on the model, please check the SHARE-14B-Base model card: https://huggingface.co/Joaoffg/SHARE-14B-Base-2604
Recommended use
We recommend using this version of the model exclusively for tasks that require a longer context length. Because this context expansion was applied to an intermediate checkpoint, we did not spend enough compute to achieve performance comparable to the 4096-context model. We therefore advise using the latter for most use cases.
Once model training is finalized, more robust long context versions of the model will be made available.
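The recommendation above can be sketched as a simple routing rule: use the base model unless the prompt actually needs the extended window. The base repo id below comes from the model card; the id for this long-context variant is a placeholder assumption, not a real repository.

```python
# Hypothetical helper illustrating the recommendation above.
# BASE_MODEL is taken from the model card; LONG_CTX_MODEL is a
# placeholder id for this 8192-token variant, not a real repo.

BASE_MODEL = "Joaoffg/SHARE-14B-Base-2604"   # 4096-token context model
LONG_CTX_MODEL = "Joaoffg/SHARE-14B-Long"    # assumed id for this variant

def pick_checkpoint(prompt_tokens: int) -> str:
    """Return the repo id appropriate for a prompt of the given token count."""
    if prompt_tokens <= 4096:
        return BASE_MODEL        # recommended for most use cases
    if prompt_tokens <= 8192:
        return LONG_CTX_MODEL    # only when the longer window is required
    raise ValueError("Prompt exceeds the 8192-token context window")
```

Either repo id can then be passed to your usual loading code (e.g. `AutoModelForCausalLM.from_pretrained`).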