---
title: README
emoji: π
colorFrom: gray
colorTo: purple
sdk: static
pinned: false
---
# Join the Pruna AI community!

**Make AI models faster, cheaper, smaller, greener!**
Pruna AI makes AI models faster, cheaper, smaller, and greener with the pruna package.

- It supports various models, including CV, NLP, audio, and graph models for predictive and generative AI.
- It supports various hardware, including GPUs, CPUs, and edge devices.
- It supports various compression algorithms, including quantization, pruning, distillation, caching, recovery, compilation, and factorization, among others.
- You can combine algorithms to find the optimal configuration and smash/compress your model.
- You can evaluate quality and efficiency metrics of your base model against its smashed/compressed version.
Set it up in minutes and compress your first models in a few lines of code!
## How to get started
You can smash your own models by installing pruna with pip:

```shell
pip install pruna
```
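Once installed, compressing a model takes only a few lines. The sketch below follows the pruna docs' `SmashConfig`/`smash` pattern; the model checkpoint and the specific algorithm names (`deepcache`, `stable_fast`) are illustrative, so check the current documentation for the options available on your hardware.

```python
# Illustrative sketch, assuming the pruna SmashConfig/smash API;
# the checkpoint and algorithm names below are example choices.
from diffusers import StableDiffusionPipeline
from pruna import SmashConfig, smash

# Load any supported base model (here: a diffusers pipeline).
pipe = StableDiffusionPipeline.from_pretrained("runwayml/stable-diffusion-v1-5")

# Pick and combine compression algorithms via a SmashConfig.
smash_config = SmashConfig()
smash_config["cacher"] = "deepcache"      # cache intermediate activations
smash_config["compiler"] = "stable_fast"  # compile for faster inference

# Compress ("smash") the model in one call, then use it as usual.
smashed_pipe = smash(model=pipe, smash_config=smash_config)
image = smashed_pipe("a photo of a cat").images[0]
```

The same pattern applies to other modalities: load the model, declare the algorithms you want in the config, and call `smash`.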
To experience the efficiency gains first-hand, you can start with these simple notebooks:
| Use Case | Free Notebooks |
|---|---|
| 3x faster Stable Diffusion models | Smash for free |
| Making your LLMs 4x smaller | Smash for free |
| Smash your model with a CPU only | Smash for free |
| Transcribe 2 hours of audio in less than 2 minutes with Whisper | Smash for free |
| 100% faster Whisper transcription | Smash for free |
| Run your Flux model without an A100 | Smash for free |
| 2x smaller Sana in action | Smash for free |
For more details on installation and free tutorials, check the Pruna AI documentation.
## Test our Performance Models
Want to use our optimized models right away? Try them via our API for fast, easy access to Pruna-powered inference.