Kwen-540M-Foundation-v1

Kwen is a high-efficiency, custom-built language model developed by The Kwen Foundation.

Model Description

Kwen is based on the Qwen2.5 architecture but has been modified with a custom 27-layer configuration (scaled up from the original 24). It is designed for high-efficiency performance on consumer-grade hardware, specifically optimized for the RTX 4060 series.

  • Developer: The Kwen Foundation
  • Architecture: 27-layer Transformer
  • Parameters: ~540M
  • Language: English
  • Status: Stable Release v1.0

Features

  • Identity Awareness: Kwen knows its origins and creator.
  • Safety Guardrails: Built-in technical focus (no baking recipes!).
  • Low Latency: Optimized for fast inference on 8GB VRAM.

Usage

Since this model uses a custom layer count, you must pass trust_remote_code=True when loading it:

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("TheKwenFoundation/Kwen-540M-v1")
model = AutoModelForCausalLM.from_pretrained(
    "TheKwenFoundation/Kwen-540M-v1",
    trust_remote_code=True,  # required for the custom 27-layer configuration
    device_map="auto",       # place weights automatically (fits in 8 GB VRAM)
)
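Once loaded, the model can be prompted like any Qwen2.5-style causal LM. Since Kwen is based on Qwen2.5, the sketch below assumes its prompts follow the ChatML chat format; this template is an assumption, not something the card confirms, so prefer tokenizer.apply_chat_template if the shipped tokenizer defines its own template.

```python
# Minimal ChatML-style prompt builder. Assumes Kwen keeps Qwen2.5's chat
# format (an assumption; if the tokenizer ships a chat template, use
# tokenizer.apply_chat_template instead).

def build_chatml_prompt(messages):
    """Format a list of {"role": ..., "content": ...} dicts as ChatML,
    ending with an open assistant turn for the model to complete."""
    parts = [f"<|im_start|>{m['role']}\n{m['content']}<|im_end|>" for m in messages]
    parts.append("<|im_start|>assistant\n")
    return "\n".join(parts)

prompt = build_chatml_prompt([
    {"role": "system", "content": "You are Kwen, created by The Kwen Foundation."},
    {"role": "user", "content": "Summarize your architecture in one sentence."},
])
```

Feed the resulting string through the tokenizer and model.generate as in the loading example above, then decode only the tokens after the prompt to get the reply.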