Czech Semantic Embeddings (GGUF, Non-Commercial)

GGUF conversions of Seznam's Czech semantic embedding models, distributed under non-commercial license terms.

Included Models

  • Seznam__dist-mpnet-czeng-cs-en.f16.gguf
  • Seznam__dist-mpnet-czeng-cs-en.q8_0.gguf
  • Seznam__simcse-dist-mpnet-czeng-cs-en.f16.gguf
  • Seznam__simcse-dist-mpnet-czeng-cs-en.q8_0.gguf

Upstream Sources

  • Seznam/dist-mpnet-czeng-cs-en
  • Seznam/simcse-dist-mpnet-czeng-cs-en

Citation

If you use these models, please cite the original Seznam paper:

@inproceedings{bednavr2024some,
  title={Some Like It Small: Czech Semantic Embedding Models for Industry Applications},
  author={Bedn{\'a}{\v{r}}, Ji{\v{r}}{\'\i} and N{\'a}plava, Jakub and Baran{\v{c}}{\'\i}kov{\'a}, Petra and Lisick{\`y}, Ond{\v{r}}ej},
  booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},
  volume={38},
  number={21},
  pages={22734--22742},
  year={2024}
}

Usage (llama.cpp)

llama-server -m Seznam__dist-mpnet-czeng-cs-en.q8_0.gguf --embedding --pooling cls
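
Once the server is running, embeddings can be requested over HTTP. The following is a minimal sketch, assuming the server listens on llama-server's default address (http://127.0.0.1:8080) and serves the OpenAI-compatible /v1/embeddings endpoint. The helper names (embed, cosine_similarity, demo) and the sample sentences are illustrative and not part of this package.

```python
import json
import math
import urllib.request

# Assumption: llama-server's default host and port; adjust if started with --host/--port.
SERVER_URL = "http://127.0.0.1:8080/v1/embeddings"


def embed(texts, url=SERVER_URL):
    """Request one embedding per input text from a running llama-server."""
    payload = json.dumps({"input": texts}).encode("utf-8")
    request = urllib.request.Request(
        url, data=payload, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    # OpenAI-style response: {"data": [{"embedding": [...]}, ...]}.
    # Assumption: entries come back in the same order as the inputs.
    return [item["embedding"] for item in body["data"]]


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def demo():
    """Compare a Czech sentence with an English paraphrase (needs a running server)."""
    czech, english = embed(["Dnes je hezké počasí.", "The weather is nice today."])
    print(f"cs-en similarity: {cosine_similarity(czech, english):.3f}")
```

Calling demo() while the server from the command above is running prints a single similarity score. Since the models are bilingual (cs-en), a Czech sentence and its English translation would be expected to score higher than an unrelated pair.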

File Integrity

SHA256 checksums are in checksums.txt.
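
A download can be checked against that file before use. The sketch below assumes checksums.txt follows the usual sha256sum layout (a hex digest, whitespace, then the file name, one entry per line) and sits next to the model files. Both the layout and the helper names are assumptions.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Hash a file in chunks so a large GGUF file is never loaded into memory at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_checksums(checksum_file="checksums.txt"):
    """Return {file name: True/False} for each entry listed in the checksum file.

    Assumes sha256sum-style lines: "<hex digest>  <file name>".
    Files are resolved relative to the checksum file; missing files map to False.
    """
    base_dir = Path(checksum_file).resolve().parent
    results = {}
    for line in Path(checksum_file).read_text(encoding="utf-8").splitlines():
        if not line.strip():
            continue  # skip blank lines
        expected, name = line.split(maxsplit=1)
        name = name.strip().lstrip("*")  # sha256sum marks binary mode with '*'
        target = base_dir / name
        results[name] = target.is_file() and sha256_of(target) == expected.lower()
    return results
```

Running verify_checksums() in the download directory returns a dictionary that maps each listed file name to True (digest matches) or False (mismatch or missing file). If the file does use this layout, sha256sum -c checksums.txt from GNU coreutils performs the same check.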

License

This package contains converted checkpoints from upstream models. Respect the original models' licenses and terms:

  • dist-mpnet-czeng-cs-en: CC-BY-NC-SA-4.0
  • simcse-dist-mpnet-czeng-cs-en: CC-BY-NC-SA-4.0

Non-commercial use only. Attribution to Seznam and the original model cards is required.

Model Details

  • Format: GGUF
  • Model size: 24.3M params
  • Architecture: bert