GPU Setup
What GPU setup do you recommend for running Apertus ideally, and what setup might still do the trick?
Apertus is part of an R&D initiative, and the technical deployment of our models is largely a matter of third-party software and hardware choices, so there is no general guidance on system requirements. Speaking from my own experience, the 8B model runs quite well on consumer-level hardware: 12 GB of VRAM is advisable, but quantized versions exist that fit into much smaller amounts of memory. To run the full 70B version of Apertus, you will ideally need an industrial-strength graphics card. Again, a lot depends on the performance you expect, how many users you will serve, and so on. There are some reviews online that may provide relevant performance benchmarks (tokens per second). I would suggest contacting some of the providers that deploy Apertus and running your own experiments. Please share any useful advice you come across.
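For a first back-of-the-envelope check before buying or renting hardware, a rough sketch like the one below may help. It estimates only the memory taken by the model weights (parameter count times bits per weight), so treat the result as a lower bound: the KV cache, activations and runtime overhead come on top and depend on context length and the number of concurrent users. The parameter counts are nominal values read off the model names, the precision labels are generic assumptions rather than specific quantization formats, and the function names are made up for illustration.

```python
# Weights-only memory estimate. All inputs are assumptions for illustration:
# nominal parameter counts taken from the model names, generic bit widths.
NOMINAL_PARAMS = {"Apertus 8B": 8e9, "Apertus 70B": 70e9}
BITS_PER_WEIGHT = {"16-bit": 16, "8-bit": 8, "4-bit": 4}


def weight_memory_gb(num_params: float, bits_per_weight: int) -> float:
    """Memory needed for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_weight / 8 / 1e9


def fits_in_vram(num_params: float, bits_per_weight: int, vram_gb: float) -> bool:
    """True if the weights alone fit; says nothing about KV cache or overhead."""
    return weight_memory_gb(num_params, bits_per_weight) <= vram_gb


def estimate_table() -> dict:
    """Map (model, precision) to the estimated weight memory in GB."""
    return {
        (model, precision): weight_memory_gb(params, bits)
        for model, params in NOMINAL_PARAMS.items()
        for precision, bits in BITS_PER_WEIGHT.items()
    }


if __name__ == "__main__":
    for (model, precision), gb in estimate_table().items():
        print(f"{model} at {precision}: about {gb:.0f} GB for weights")
```

Under these assumptions the 8B model needs roughly 16 GB for 16-bit weights, 8 GB at 8-bit and 4 GB at 4-bit, while the 70B model needs roughly 140, 70 and 35 GB respectively. Real quantized files usually carry some extra overhead, so actual measurements on your own setup remain the better guide.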