v0.1.6: Real-time Metrics & Blackwell-Optimized Docker (Recommended)
This model is fully compatible with the DGX-Spark-llama.cpp-Bench. Experience the state-of-the-art inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.
Key Features (v0.1.6)
- Real-time Performance Metrics: Now visualizes `Input TPS` and `Output TPS` during streaming.
- Improved Reasoning UI: Seamlessly renders and stabilizes the model's Chain-of-Thought (CoT).
- Blackwell Optimization: Native support for ARM64/SM121 and CUDA 13.0 FP4.
Quick Start
# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.6
For more details, visit our GitHub Repository.
v0.1.5: Real-time Metrics & Blackwell-Optimized Docker
This model is fully compatible with the DGX-Spark-llama.cpp-Bench. Experience the state-of-the-art inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.
Key Features (v0.1.5)
- Real-time Performance Metrics: Now visualizes `Input TPS` and `Output TPS` during streaming.
- Improved Reasoning UI: Seamlessly renders and stabilizes the model's Chain-of-Thought (CoT).
- Blackwell Optimization: Native support for ARM64/SM121 and CUDA 13.0 FP4.
Quick Start
# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.5
For more details, visit our GitHub Repository.
v0.1.4: Quick Start with Blackwell-Optimized Docker
This model is fully compatible with the DGX-Spark-llama.cpp-Bench. Experience the best performance on NVIDIA Blackwell (DGX Spark) hardware with our optimized inference engine.
Key Features (v0.1.4)
- Blackwell Optimized: Native support for ARM64/SM121 and CUDA 13.0 FP4.
- Intelligent Reasoning UI: Automatic extraction and visualization of reasoning processes (CoT).
- One-Click Deployment: Standardized environment via GHCR Docker image.
How to Run
# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.4
# Follow the instructions in our repo to serve this model
# GitHub: https://github.com/sowilow/DGX-Spark-llama.cpp-Bench
Quick Start with Docker (Recommended)
You can easily run this model using the DGX-Spark-llama.cpp-Bench inference engine. It's pre-configured for high-performance inference on NVIDIA hardware (especially Blackwell/DGX Spark).
1. Pull the Docker Image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:latest
2. Run the Inference Server
For detailed configuration and usage, visit the GitHub Repository.
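The repository is the authoritative source for the run command; as a placeholder, here is a minimal sketch of step 2, assuming the image bundles llama.cpp's `llama-server` and that the mount path, port, and model filename shown are illustrative rather than confirmed by the project:

```shell
# Run the container with GPU access, publishing the server port and
# mounting a local directory that holds the GGUF weights.
# NOTE: the entrypoint, port, and paths below are assumptions, not
# documented behavior of this image -- check the GitHub repo.
docker run --rm --gpus all \
  -p 8080:8080 \
  -v "$PWD/models:/models" \
  ghcr.io/sowilow/dgx-spark-llama.cpp-bench:latest

# If the image drops you into a shell instead, llama.cpp's server can
# be started manually, e.g.:
#   llama-server -m /models/LFM2.5-VL-1.6B-Q8_0.gguf --host 0.0.0.0 --port 8080
```

If the bundled engine is llama.cpp's `llama-server`, it exposes an OpenAI-compatible endpoint (e.g. `/v1/chat/completions`) on the published port.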
LFM-2.5-1.6B-DGX-Spark-GGUF
This repository contains GGUF-quantized weights for LFM-1.6B-VL, specifically optimized for NVIDIA Blackwell (DGX Spark) hardware.
Key Features
- Hardware Optimized: Built with CUDA 13.0 and SM121 (Blackwell) native acceleration.
- Quantization: Q8_0 (8-bit quantization) for high precision and fast inference.
- Base Model Integration: Linked directly to the original liquidai/LFM-1.6B-VL.
License & Attribution
This model is a quantized version of the original liquidai/LFM-1.6B-VL and is subject to the Liquid AI Community License.
By using this model, you agree to comply with Liquid AI's licensing terms.
Files Included
- `LFM2.5-VL-1.6B-Q8_0.gguf`: Main model weights.
- `mmproj-F16.gguf`: Multimodal vision projector.
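For local use outside Docker, the two files can be fetched and served with a recent llama.cpp build. A hedged sketch, assuming a Hub repo id of the form shown (`<user>` is a placeholder, not the actual account name) and a llama.cpp build with multimodal (`--mmproj`) support:

```shell
# Download both files from the Hub.
# <user>/LFM-2.5-1.6B-DGX-Spark-GGUF is an assumed repo id -- substitute
# the real repository path for this model card.
huggingface-cli download <user>/LFM-2.5-1.6B-DGX-Spark-GGUF \
  LFM2.5-VL-1.6B-Q8_0.gguf mmproj-F16.gguf --local-dir .

# Serve with llama.cpp, pairing the quantized weights with the
# vision projector so image inputs are handled.
llama-server -m LFM2.5-VL-1.6B-Q8_0.gguf \
  --mmproj mmproj-F16.gguf \
  --host 0.0.0.0 --port 8080
```

Pairing the `--mmproj` file with the main weights is what enables the vision path; text-only inference works with the Q8_0 file alone.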
Created using DGX-Spark-llama.cpp-Bench