kostakoff posted an update 29 days ago
Just for fun, let's run the Alibaba MNN benchmark on a DGX Spark!

From time to time, I look for something new or unusual in the AI world, and recently I stumbled upon MNN — a direct competitor to llama.cpp.

I found this project intriguing and set a small goal for myself: to run it on my DGX Spark. I was glad to see that MNN is open-source under the Apache 2.0 license, meaning I was free to fork and modify it.

However, MNN had a few issues out of the box:
- No support for CUDA 13.0
- No support for the Blackwell architecture (sm_12)
- No built-in support for CUDA benchmarking

I tackled these issues one by one and successfully compiled MNN on the DGX Spark. The benchmark results are quite low for now, but at least it works! The patch file is available here: https://github.com/alibaba/MNN/issues/4289#issuecomment-4093931887
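Before patching, it helps to confirm what toolchain the build will actually see. A small sketch (assumes nvcc and nvidia-smi are on PATH; the sed pattern is mine, not from the post):

```shell
# Report the installed CUDA toolkit and the GPU's compute capability.
# On a DGX Spark the toolkit is 13.x while the GPU reports a 12.x
# capability, which is exactly the combination stock MNN did not handle.
cuda_major=$(nvcc --version | sed -n 's/.*release \([0-9]*\)\..*/\1/p')
echo "CUDA toolkit major version: $cuda_major"
nvidia-smi --query-gpu=compute_cap --format=csv,noheader
```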

Here is the step-by-step guide on how I built MNN:
mkdir mnn && cd mnn
# Get the code
git clone https://github.com/alibaba/MNN.git
cd MNN

# Reset repo to a specific commit
git reset --hard b1d06d68b3366183d157f0703d7b8a8b61ae55b3

# Apply the CUDA 13.0 / Blackwell patch (saved locally from the issue linked above)
git apply ../my_changes.patch

mkdir build && cd build
# Configure the project
cmake .. \
  -DMNN_CUDA=ON \
  -DMNN_BUILD_LLM=ON \
  -DMNN_SUPPORT_TRANSFORMER_FUSE=ON \
  -DCMAKE_BUILD_TYPE=Release

# Build libraries and executable binaries
cmake --build . --config Release -j$(nproc)
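Two quick sanity checks I'd suggest around these steps, sketched under the assumption that the patch file and build directory are laid out as above:

```shell
# Dry-run the patch before applying it; git reports any conflicts
# without modifying the working tree.
git apply --check ../my_changes.patch && echo "patch applies cleanly"

# After the build, confirm the benchmark binary was actually produced
# in the build directory.
test -x ./llm_bench && echo "llm_bench built"
```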


How to run the benchmark:
- Download the MNN model: taobao-mnn/Qwen3-30B-A3B-MNN
- Run the benchmark:
./MNN/build/llm_bench -m /path/to/qwen/config.json -a cuda -c 2 -p 512 -n 128 -kv true -rep 3
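To see how throughput scales with context, the same invocation can be swept over several prompt lengths. A sketch; every flag besides -p is just copied from the command above:

```shell
# Hypothetical sweep over prompt lengths, reusing the flags from the post.
for p in 128 256 512; do
  ./MNN/build/llm_bench -m /path/to/qwen/config.json -a cuda -c 2 \
    -p "$p" -n 128 -kv true -rep 3
done
```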

It works!
Hopefully, the MNN developers will add official CUDA 13 support.
