Anybody running this version on DGX Spark?
#2
by dionode - opened
I found this quantization ideal for running on the DGX Spark. However, the most recent image in the NVIDIA container registry still ships vLLM 15.
I tried to build vLLM from source but ran into multiple out-of-memory and dependency errors.
Is anyone already running this version on the Spark? For now, I think I'll wait for updates to the container registry.