AI & ML interests
Transformers, LLMs, Neural Nets
Organizations
None yet
Article: Binary and Scalar Embedding Quantization for Significantly Faster & Cheaper Retrieval


Upvoted an article almost 2 years ago: A Gentle Introduction to 8-bit Matrix Multiplication for transformers at scale using transformers, accelerate and bitsandbytes