LFM2-VL 450M – GGUF

LFM2-VL-450M by Liquid AI, quantized to GGUF format for llama.cpp. Optimized for low-latency on-device vision inference and packaged for use with the RunAnywhere SDK.

Files:

  • LFM2-VL-450M-Q4_0.gguf – Language model Q4_0 (~209 MB)
  • LFM2-VL-450M-Q8_0.gguf – Language model Q8_0 (~361 MB)
  • mmproj-LFM2-VL-450M-Q8_0.gguf – Vision encoder (~99 MB)
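
If you want to sanity-check the artifacts before bundling them, each file resolves at Hugging Face's standard download URL. A minimal TypeScript sketch (the /resolve/main/ path is Hugging Face convention; nothing below is part of the RunAnywhere SDK):

// Sketch: HEAD-request each artifact and print its reported size.
// The /resolve/main/ path is Hugging Face's standard file-download scheme.
const REPO = 'https://huggingface.co/runanywhere/LFM2-VL-450M-GGUF/resolve/main';
const FILES = [
  'LFM2-VL-450M-Q4_0.gguf',
  'LFM2-VL-450M-Q8_0.gguf',
  'mmproj-LFM2-VL-450M-Q8_0.gguf',
];

for (const file of FILES) {
  const res = await fetch(`${REPO}/${file}`, { method: 'HEAD' });
  console.log(file, res.headers.get('content-length'), 'bytes');
}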

Usage with RunAnywhere SDK

Swift (iOS / macOS)

import RunAnywhere

RunAnywhere.registerModel(
    id: "lfm2-vl-450m-q4_0",
    name: "LFM2-VL 450M Q4_0",
    repo: "runanywhere/LFM2-VL-450M-GGUF",
    files: ["LFM2-VL-450M-Q4_0.gguf", "mmproj-LFM2-VL-450M-Q8_0.gguf"],
    framework: .llamaCpp,
    modality: .multimodal,
    memoryRequirement: 500_000_000
)

// Low-latency VLM inference
let result = try await RunAnywhere.generateVLM(
    prompt: "What do you see?",
    image: imageData,
    modelId: "lfm2-vl-450m-q4_0"
)
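
The 500 MB memoryRequirement used throughout these snippets is a working estimate: the Q4_0 weights (~209 MB) and the vision encoder (~99 MB) account for roughly 310 MB, with the remainder left as headroom for the KV cache and inference buffers.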

Kotlin (Android / JVM)

import com.runanywhere.sdk.RunAnywhere
import com.runanywhere.sdk.models.*

RunAnywhere.registerModel(
    id = "lfm2-vl-450m-q4_0",
    name = "LFM2-VL 450M Q4_0",
    repo = "runanywhere/LFM2-VL-450M-GGUF",
    files = listOf("LFM2-VL-450M-Q4_0.gguf", "mmproj-LFM2-VL-450M-Q8_0.gguf"),
    framework = InferenceFramework.LLAMA_CPP,
    modality = ModelCategory.MULTIMODAL,
    memoryRequirement = 500_000_000L
)

val result = RunAnywhere.generateVLM(
    prompt = "What do you see?",
    image = imageData,
    modelId = "lfm2-vl-450m-q4_0"
)

Web (TypeScript)

import { RunAnywhere, LLMFramework, ModelCategory } from '@anthropic/runanywhere-web';

RunAnywhere.registerModels([{
  id: 'lfm2-vl-450m-q4_0',
  name: 'LFM2-VL 450M Q4_0',
  repo: 'runanywhere/LFM2-VL-450M-GGUF',
  files: ['LFM2-VL-450M-Q4_0.gguf', 'mmproj-LFM2-VL-450M-Q8_0.gguf'],
  framework: LLMFramework.LlamaCpp,
  modality: ModelCategory.Multimodal,
  memoryRequirement: 500_000_000,
}]);

await RunAnywhere.downloadModel('lfm2-vl-450m-q4_0');
await RunAnywhere.loadModel('lfm2-vl-450m-q4_0');

const result = await RunAnywhere.generateVLM('What do you see?', imageData, 'lfm2-vl-450m-q4_0');
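
The snippets here pass an opaque imageData value without defining it. On the web, one way to obtain raw image bytes is a plain fetch; whether the SDK expects a Uint8Array, an ArrayBuffer, or a base64 string is an assumption to verify against the SDK docs:

// Hypothetical helper: fetch an image and return its raw bytes.
// The Uint8Array return type is an assumption; check what generateVLM
// actually expects before relying on it.
async function loadImageBytes(url: string): Promise<Uint8Array> {
  const response = await fetch(url);
  if (!response.ok) throw new Error(`fetch ${url} failed: ${response.status}`);
  return new Uint8Array(await response.arrayBuffer());
}

const imageData = await loadImageBytes('/photo.jpg');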

React Native (TypeScript)

import { RunAnywhere } from 'runanywhere-react-native';

RunAnywhere.registerModel({
  id: 'lfm2-vl-450m-q4_0',
  name: 'LFM2-VL 450M Q4_0',
  repo: 'runanywhere/LFM2-VL-450M-GGUF',
  files: ['LFM2-VL-450M-Q4_0.gguf', 'mmproj-LFM2-VL-450M-Q8_0.gguf'],
  framework: 'llamaCpp',
  modality: 'multimodal',
  memoryRequirement: 500_000_000,
});

const result = await RunAnywhere.generateVLM('What do you see?', imageData, 'lfm2-vl-450m-q4_0');
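
For app code, it can help to wrap the whole flow in one guarded call. A hedged sketch: downloadModel and loadModel are shown above for the Web SDK, and whether the React Native package exposes them under the same names is an assumption to verify:

// Hedged sketch: one call that downloads, loads, and queries the model.
// downloadModel/loadModel mirroring the Web API, the base64-string image
// type, and the string result are all assumptions; verify against the SDK.
async function describeImage(imageData: string): Promise<string | null> {
  try {
    await RunAnywhere.downloadModel('lfm2-vl-450m-q4_0');
    await RunAnywhere.loadModel('lfm2-vl-450m-q4_0');
    return await RunAnywhere.generateVLM('What do you see?', imageData, 'lfm2-vl-450m-q4_0');
  } catch (err) {
    console.warn('VLM inference failed:', err);
    return null;
  }
}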

Flutter (Dart)

import 'package:runanywhere_flutter/runanywhere_flutter.dart';

RunAnywhere.registerModel(
  id: 'lfm2-vl-450m-q4_0',
  name: 'LFM2-VL 450M Q4_0',
  repo: 'runanywhere/LFM2-VL-450M-GGUF',
  files: ['LFM2-VL-450M-Q4_0.gguf', 'mmproj-LFM2-VL-450M-Q8_0.gguf'],
  framework: InferenceFramework.llamaCpp,
  modality: ModelCategory.multimodal,
  memoryRequirement: 500000000,
);

final result = await RunAnywhere.generateVLM('What do you see?', imageData, 'lfm2-vl-450m-q4_0');

Model Details

Property          Value
Base Model        LFM2-VL-450M (Liquid AI)
Architecture      lfm2
Parameters        450M
Quantizations     Q4_0 (~209 MB), Q8_0 (~361 MB)
Vision Encoder    mmproj Q8_0 (~99 MB)
Runtime           llama.cpp (with multimodal/mtmd support)
Optimized For     Low-latency edge inference
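
As a rough rule of thumb for GGUF quantizations, Q8_0 is near-lossless while Q4_0 roughly halves the footprint at some cost in output quality; on a model this small the quality gap can be more noticeable, so it is worth comparing both variants on your task.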

Attribution

Original model and GGUF conversion by Liquid AI.
