# LiquidFormer (91M): The Liquid Neural Network Engine
LiquidFormer-91M is a high-performance language model based on the Liquid Neural Network (LNN) architecture. It replaces the standard static Feed-Forward Networks (FFN) found in traditional transformers with Liquid Time-Constant (LTC) cells, allowing for adaptive temporal dynamics and efficient context handling.
## Architecture Details
- Model Size: 91 million parameters
- Core Cell: Liquid Time-Constant (LTC) cells with ODE-based integration steps
- Attention: Grouped-Query Attention (GQA) with 8 query heads sharing 2 KV heads (a 4:1 ratio)
- Positional Embedding: Rotary Positional Embeddings (RoPE)
- Optimization: custom "Super Monkey Patch" for dual Tesla T4 GPU environments, reaching up to 18,000 tok/s
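The LTC cells swap the transformer's static FFN sublayer for an input-dependent ODE: the effective time constant of each hidden unit changes with the current input, which is what gives the model its adaptive temporal dynamics. A minimal sketch of one fused-solver LTC update, following the standard Hasani et al. formulation; all parameter names and shapes here are illustrative and are not taken from the LiquidFormer repository:

```python
import numpy as np

def ltc_step(h, x, W_in, W_rec, b, A, tau, dt=1.0):
    """One fused implicit-Euler update of a Liquid Time-Constant cell.

    The gate f is conditioned on the input x and state h, so the
    effective time constant (1/tau + f) adapts per token. Illustrative
    sketch only, not the LiquidFormer repository's actual API.
    """
    # Input-conditioned gate, kept in (0, 1) so the dynamics stay stable
    f = 1.0 / (1.0 + np.exp(-(x @ W_in + h @ W_rec + b)))
    # Fused solver step: h' = (h + dt * f * A) / (1 + dt * (1/tau + f))
    return (h + dt * f * A) / (1.0 + dt * (1.0 / tau + f))

# Tiny smoke run with a 4-dim hidden state and 3-dim input
rng = np.random.default_rng(0)
h = ltc_step(
    h=np.zeros(4),
    x=rng.normal(size=3),
    W_in=rng.normal(size=(3, 4)),
    W_rec=rng.normal(size=(4, 4)),
    b=np.zeros(4),
    A=np.ones(4),
    tau=1.0,
)
print(h.shape)  # (4,)
```

Because the denominator grows with the gate, the update is unconditionally stable for positive `tau` and `dt`, which is why LTC implementations favor this fused step over explicit Euler integration.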
## Training Data
The model underwent a dual-stage training regime to ensure both logical reasoning and conversational fluency:
- Stage 1: Logic & Code Foundation
  - Trained on a curated set of 20,000 Gemini-driven code tasks covering Python development, algorithmic logic, and system design.
- Stage 2: Conversational Alignment
  - Fine-tuned on the OpenAssistant Guanaco dataset to align with human instructions and maintain conversational context.
## Usage

You can run this model locally using the `LiquidFormer` class provided in the repository.

### Loading the Model
```python
from architecture.liquidformer import LiquidFormer

# Load from the vault (safetensors format)
model = LiquidFormer.from_pretrained("path/to/LiquidFormer-91M")
```
### Text Generation
```python
from inference.generator import TextGenerator
from tokenizer.tokenizer import LiquidTokenizer

tokenizer = LiquidTokenizer("path/to/tokenizer.json")
generator = TextGenerator(model, tokenizer)

response = generator.generate(
    prompt="[USER] Write a Python function for binary search.\n[ASSISTANT]",
    max_new_tokens=256,
    temperature=0.7,
)
print(response)
```
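The `temperature` argument controls how sharply the next token is sampled from the model's output distribution: values below 1 concentrate probability on the most likely tokens, values above 1 flatten it. A generic sketch of temperature sampling, which is an assumption about how a `generate()` loop like this typically works rather than the repository's actual `TextGenerator` internals:

```python
import numpy as np

def sample_next_token(logits, temperature=0.7, rng=None):
    """Sample one token id from raw logits with temperature scaling.

    Generic illustration; the repository's TextGenerator may add
    extras such as top-k filtering or repetition penalties.
    """
    rng = rng or np.random.default_rng()
    scaled = logits / max(temperature, 1e-8)
    scaled = scaled - scaled.max()   # subtract max for numerical stability
    probs = np.exp(scaled)
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

# At a low temperature the sample concentrates on the argmax token
logits = np.array([0.2, 3.0, 1.1])
print(sample_next_token(logits, temperature=0.1))
```

A full generation loop would repeat this step `max_new_tokens` times, appending each sampled id to the prompt before the next forward pass.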
## Performance
LiquidFormer is designed for efficiency. Even on a modest GTX 1650 (4 GB VRAM), it maintains a small memory footprint and delivers low-latency inference, thanks to its optimized GQA and LTC dynamics.
Built with ❤️ for advanced LNN research.