Nekochan-Molo32-Qwen3-1.7B

A MoLo-enhanced variant of Qwen3-1.7B.

Overview

Nekochan-Molo32-Qwen3-1.7B is a custom model built on top of Qwen3-1.7B-Base, augmented with a MoLo (Mixture-of-LoRA-Experts) architecture of my own design. This model blends Qwen’s strong base capabilities with a lightweight expert-routing system, allowing it to adopt different style “modes” depending on the input.

This release focuses on:

  • Lightweight MoE-style behavior using LoRA-based experts
  • Fast inference
  • Smooth stylistic adaptation driven by a learned router
  • Small footprint while still offering multi-expert flavor

⚠️ Note:
The full architecture implementation (MoLoModel.py and MoLo layers) will be published soon (maybe) and later packaged into a dedicated library for easy import.

Model Features

  • Architecture: Qwen3-1.7B with MoLo extension
  • Experts: 32 LoRA-based experts (router-controlled)
  • Type: Causal Language Model
  • Context Length: 32,768
  • Training: Custom data with style-based expert specialization

This model is primarily designed for text generation with unique stylistic blending enabled through the MoLo expert routing system.
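The MoLo source code is not public yet, so the exact implementation is unknown. As a rough illustration only, a router-controlled mixture of LoRA experts on top of a frozen linear layer might look like the sketch below; every name, dimension, and the top-k routing choice here is an assumption, not the released design:

```python
import numpy as np

rng = np.random.default_rng(0)

D_IN, D_OUT = 64, 64               # layer dimensions (illustrative only)
N_EXPERTS, RANK, TOP_K = 32, 8, 2  # 32 LoRA experts; rank and top-k are assumptions

# Frozen base weight, standing in for one of Qwen3's original linear layers.
W = rng.standard_normal((D_OUT, D_IN)) * 0.02

# Each expert is a low-rank LoRA pair: A projects down, B projects back up.
A = rng.standard_normal((N_EXPERTS, RANK, D_IN)) * 0.02
B = rng.standard_normal((N_EXPERTS, D_OUT, RANK)) * 0.01

# Router: a learned linear map from the input to one logit per expert.
W_router = rng.standard_normal((N_EXPERTS, D_IN)) * 0.02

def molo_linear(x):
    """y = W x + sum over the top-k experts of gate_e * (B_e @ A_e @ x)."""
    logits = W_router @ x
    top = np.argsort(logits)[-TOP_K:]             # indices of the top-k experts
    gates = np.exp(logits[top] - logits[top].max())
    gates /= gates.sum()                          # softmax over the selected experts
    y = W @ x
    for g, e in zip(gates, top):
        y += g * (B[e] @ (A[e] @ x))              # add the gated LoRA delta
    return y

x = rng.standard_normal(D_IN)
y = molo_linear(x)
print(y.shape)  # (64,)
```

Because only low-rank A/B pairs and a small router are added per layer, the parameter overhead stays far below a conventional 32-expert MoE.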

Quickstart

Not available yet, as loading requires the custom MoLo model structure, which hasn't been released.

Deployment

Nekochan-Molo32-Qwen3-1.7B can be served through any framework that supports Hugging Face-style causal language models, including:

  • SGLang
  • vLLM
  • Ollama / LM Studio / llama.cpp
  • KTransformers

Once the MoLo architecture library is released, these platforms will also support full MoLo expert routing inference without additional configuration.

Best Practices

To get the best results from Nekochan-Molo32-Qwen3-1.7B:

  1. Sampling Recommendations

    • Temperature: 0.6–0.9
    • Top-p: 0.85–0.95
    • Top-k: 20–40
  2. Long-Context Usage

    • Use up to 32k tokens for extended reasoning or long-form generation.
  3. Style Control via Experts

    • Different prompts may trigger different MoLo experts, leading to varied stylistic outputs.
  4. Router Stability

    • If results seem overly uniform, reduce temperature slightly to encourage more controlled expert mixture.
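The sampling knobs recommended above (temperature, top-p, top-k) can be sketched as plain logit filtering. This is a generic illustration of what those settings do during decoding, not code from this repository:

```python
import numpy as np

def sample_token(logits, temperature=0.7, top_p=0.9, top_k=20, rng=None):
    """Apply temperature, then top-k, then top-p (nucleus) filtering; sample one token id."""
    rng = rng or np.random.default_rng()
    logits = np.asarray(logits, dtype=np.float64) / temperature

    # top-k: keep only the k highest logits
    kth = np.sort(logits)[-top_k] if top_k < logits.size else -np.inf
    logits = np.where(logits >= kth, logits, -np.inf)

    # softmax over the survivors
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()

    # top-p: keep the smallest set of tokens whose cumulative probability >= top_p
    order = np.argsort(probs)[::-1]
    cum = np.cumsum(probs[order])
    cutoff = np.searchsorted(cum, top_p) + 1
    mask = np.zeros_like(probs)
    mask[order[:cutoff]] = probs[order[:cutoff]]
    mask /= mask.sum()

    return int(rng.choice(probs.size, p=mask))

logits = np.random.default_rng(0).standard_normal(100)
token = sample_token(logits, temperature=0.7, top_p=0.9, top_k=20)
print(token)
```

Lower temperature sharpens the distribution (fewer surviving tokens after top-p), which is why tightening it can steady the expert mixture.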

Architecture Availability

The MoLo architecture source code:

  • MoLoModel.py
  • MoLoLinear
  • Router
  • Expert manager
  • Configs

will be uploaded soon and later bundled into a dedicated pip-installable library for simple usage:

pip install molo-neko

Once published, users will be able to load the model like this:

from molo_neko import MoLoModel
model = MoLoModel.from_pretrained("leeminwaan/Nekochan-Molo32-Qwen3-1.7B")

Stay tuned!

PS: I'll train the model first, then release the library later, just to test the efficiency.
PS-1.1: Only the adapter will be pushed.
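Since only the adapter will be pushed, loading ultimately comes down to the standard LoRA merge: folding the low-rank update back into the base weights. A generic sketch of that operation (not this repo's code; the scaling factor alpha is an assumed default):

```python
import numpy as np

def merge_lora(W_base, A, B, alpha=16.0):
    """Fold a LoRA update into the base weight: W' = W + (alpha / r) * B @ A."""
    r = A.shape[0]                      # LoRA rank = rows of the down-projection A
    return W_base + (alpha / r) * (B @ A)

rng = np.random.default_rng(0)
W = rng.standard_normal((32, 32))       # base weight (d_out x d_in)
A = rng.standard_normal((4, 32))        # down-projection (r x d_in)
B = rng.standard_normal((32, 4))        # up-projection (d_out x r)

W_merged = merge_lora(W, A, B)
x = rng.standard_normal(32)

# The merged weight gives the same output as base + scaled adapter path.
y_merged = W_merged @ x
y_split = W @ x + (16.0 / 4) * (B @ (A @ x))
print(np.allclose(y_merged, y_split))  # True
```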
