πŸš€ v0.1.6: Real-time Metrics & Blackwell-Optimized Docker (Recommended)

This model is fully compatible with DGX-Spark-llama.cpp-Bench, an inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.

🌟 Key Features (v0.1.6)

  • Real-time Performance Metrics: Now visualizes Input TPS and Output TPS during streaming.
  • Improved Reasoning UI: Renders the model's Chain-of-Thought (CoT) seamlessly and keeps it stable while streaming.
  • Blackwell Optimization: Native support for ARM64/SM121 and CUDA 13.0 FP4.
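The TPS metrics above boil down to a simple throughput ratio (tokens divided by elapsed time). A minimal sketch of that arithmetic, using hypothetical token counts and timings rather than real benchmark output:

```shell
# Output TPS = tokens generated / elapsed wall-clock time.
# Both numbers below are hypothetical placeholders; the benchmark
# derives the real values from the streaming response.
tokens=256        # tokens generated in the response
elapsed_ms=3200   # generation time in milliseconds
tps=$(( tokens * 1000 / elapsed_ms ))   # integer tokens per second
echo "${tps} tok/s"
```

Input TPS is the same ratio computed over the prompt-processing phase.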

🐳 Quick Start

# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.6

For more details, visit our GitHub Repository.



πŸš€ v0.1.5: Real-time Metrics & Blackwell-Optimized Docker (Recommended)

This model is fully compatible with the DGX-Spark-llama.cpp-Bench. Experience the state-of-the-art inference engine optimized for NVIDIA Blackwell (DGX Spark) hardware.

🌟 Key Features (v0.1.5)

  • Real-time Performance Metrics: Now visualizes Input TPS and Output TPS during streaming.
  • Improved Reasoning UI: Seamlessly renders and stabilizes the model's Chain-of-Thought (CoT).
  • Blackwell Optimization: Native support for ARM64/SM121 and CUDA 13.0 FP4.

🐳 Quick Start

# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.5

For more details, visit our GitHub Repository.



πŸš€ v0.1.4: Quick Start with Blackwell-Optimized Docker (Recommended)

This model is fully compatible with the DGX-Spark-llama.cpp-Bench. Experience the best performance on NVIDIA Blackwell (DGX Spark) hardware with our optimized inference engine.

🌟 Key Features (v0.1.4)

  • Blackwell Optimized: Native support for ARM64/SM121 and CUDA 13.0 FP4.
  • Intelligent Reasoning UI: Automatic extraction and visualization of reasoning processes (CoT).
  • One-Click Deployment: Standardized environment via GHCR Docker image.
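The reasoning-UI behaviour described above amounts to splitting the raw reply into a thinking span and a final answer. The sketch below assumes the common `<think>…</think>` tag convention; the tag names and the `sed`-based parsing are illustrative stand-ins, not the tool's actual implementation:

```shell
# Hypothetical raw model reply; many reasoning models wrap their
# chain of thought in <think>...</think> tags.
reply='<think>First add 2 and 2.</think>The answer is 4.'

# Extract the CoT span and strip it from the visible answer.
cot=$(printf '%s' "$reply" | sed -n 's/.*<think>\(.*\)<\/think>.*/\1/p')
answer=$(printf '%s' "$reply" | sed 's/<think>.*<\/think>//')

echo "CoT: $cot"
echo "Answer: $answer"
```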

🐳 How to Run

# Pull the latest optimized image
docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:v0.1.4

# Follow the instructions in our repo to serve this model
# GitHub: https://github.com/sowilow/DGX-Spark-llama.cpp-Bench


πŸš€ Quick Start with Docker (Recommended)

You can easily run this model using the DGX-Spark-llama.cpp-Bench inference engine. It's pre-configured for high-performance inference on NVIDIA hardware (especially Blackwell/DGX Spark).

1. Pull the Docker Image

docker pull ghcr.io/sowilow/dgx-spark-llama.cpp-bench:latest

2. Run the Inference Server
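A serving invocation typically looks like the sketch below; the port mapping, volume layout, and absence of extra entrypoint arguments are assumptions, so check the repository for the flags the image actually supports.

```shell
# Hypothetical serving command. --gpus all requires the NVIDIA Container
# Toolkit; the host port and model directory are placeholders, not the
# image's documented defaults.
docker run --rm --gpus all \
  -p 8080:8080 \
  -v "$PWD/models:/models" \
  ghcr.io/sowilow/dgx-spark-llama.cpp-bench:latest
```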

For detailed configuration and usage, visit the GitHub Repository.


LFM-2.5-1.6B-DGX-Spark-GGUF

This repository contains GGUF-quantized weights for LFM-1.6B-VL, specifically optimized for NVIDIA Blackwell (DGX Spark) hardware.

πŸš€ Key Features

  • Hardware Optimized: Built with CUDA 13.0 and SM121 (Blackwell) native acceleration.
  • Quantization: Q8_0 (8-bit), balancing high precision with fast inference.
  • Base Model Integration: Linked directly to the original liquidai/LFM-1.6B-VL.
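Q8_0 stores weights as 8-bit integers sharing a per-block scale (scale d = max|w| / 127, q = round(w / d), over blocks of 32 weights). The toy sketch below shrinks that idea to a single three-weight "block", so the block size and storage layout are simplified:

```shell
# Toy Q8_0-style quantization of one tiny block of weights.
# scale d = max|w| / 127; each weight becomes q = round(w / d).
qs=$(echo "0.4 -1.0 0.25" | awk '{
  max = 0
  for (i = 1; i <= NF; i++) { a = ($i < 0 ? -$i : $i); if (a > max) max = a }
  d = max / 127
  out = ""
  for (i = 1; i <= NF; i++) {
    q = int($i / d + ($i < 0 ? -0.5 : 0.5))   # round half away from zero
    out = out (i > 1 ? " " : "") q
  }
  print out
}')
echo "$qs"   # the largest-magnitude weight maps to -127
```

Dequantization is simply q * d per block, which is why Q8_0 stays close to full precision.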

βš–οΈ License & Attribution

This model is a quantized version of the original liquidai/LFM-1.6B-VL and is subject to the Liquid AI Community License.

By using this model, you agree to comply with Liquid AI's licensing terms.

πŸ“‚ Files Included

  • LFM2.5-VL-1.6B-Q8_0.gguf: Main model weights.
  • mmproj-F16.gguf: Multimodal vision projector.

Created using DGX-Spark-llama.cpp-Bench

GGUF · Model size: 1B params · Architecture: lfm2