Model Card for ethicalabs/Echo-DSRN-114M-v0.1.2-Base

Echo-DSRN(N) (Dual State Recurrent Neural Network; short name Echo-DSRN, also known as echo) is a novel architecture designed as a viable alternative for low-resource tasks that are currently handled inefficiently by oversized Large Language Models (LLMs) 🌱

⚠️ Important Notice

This is a research prototype and demo model.

Not production-ready
Will hallucinate and give incorrect answers
Do not use for any real-world decisions
Intended for architecture experimentation only

What Works

Text generation is fluent
Memory usage is constant, O(1), with respect to sequence length
Runs on CPUs, NPUs, and GPUs (tested on AMD ROCm and Apple MPS)
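
To illustrate what constant memory means for a recurrent model, here is a minimal toy sketch. It is not the Echo-DSRN update rule: the names `step`, `run`, and `decay` are hypothetical, and the update is a generic leaky blend chosen only to show that a fixed-size state does not grow with the number of tokens, whereas a cache that stores every past token does.

```python
import math

def step(state, x, decay=0.9):
    """Toy recurrent update (hypothetical): blend the input into a fixed-size state."""
    return [math.tanh(decay * s + (1.0 - decay) * x) for s in state]

def run(tokens, state_size=4):
    """Process a token stream with a fixed-size state and, for contrast, a growing cache."""
    state = [0.0] * state_size   # size is independent of sequence length
    cache = []                   # grows by one entry per token
    for tok in tokens:
        state = step(state, float(tok))
        cache.append(tok)
    return state, cache

state, cache = run(range(1000))
print(len(state), len(cache))
```

The state keeps the same length however many tokens are processed, while the cache length tracks the sequence length.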

What Doesn't Work

Factual accuracy
Instruction following
Common sense reasoning

🏗️ Architecture Details

Property	Value
Model Type	echo_dsrn
Layers	8
Hidden Dim	512
Attention Heads	4
MLP Ratio	8.0
Vocab Size	32011
Hybrid Attention	True
RMSNorm	True
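
As a rough sketch, the table values imply a few derived dimensions. This assumes the conventional definitions (per-head width = hidden dim / heads, MLP inner width = hidden dim × MLP ratio) and a plain two-matrix MLP with biases; the model card does not confirm either, and the dictionary keys below are hypothetical names, not the model's actual config fields.

```python
# Values copied from the architecture table; key names are illustrative only.
config = {
    "layers": 8,
    "hidden_dim": 512,
    "attention_heads": 4,
    "mlp_ratio": 8.0,
    "vocab_size": 32011,
}

# Assumed conventional definitions (not confirmed by the model card).
head_dim = config["hidden_dim"] // config["attention_heads"]
mlp_inner_dim = int(config["hidden_dim"] * config["mlp_ratio"])

# Parameter count of a two-matrix MLP with biases under that assumption.
mlp_params = (
    config["hidden_dim"] * mlp_inner_dim + mlp_inner_dim      # up projection
    + mlp_inner_dim * config["hidden_dim"] + config["hidden_dim"]  # down projection
)

print(head_dim, mlp_inner_dim, mlp_params)
```

Under these assumptions the MLP comes to 4,198,912 parameters, which is in line with the 4.20M listed per layer in the block structure table.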

📊 Parameter Breakdown

Component	Parameters	% of Total
Total	114.69M (114,687,488)	100%
Embeddings	16.39M	14.29%
DSRN Blocks (Aggregate)	81.91M	71.42%
LM Head	16.39M	14.29%
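
The breakdown above can be checked with simple arithmetic. The sketch below assumes the embedding matrix and the LM head are untied and each hold vocab size × hidden dim weights with no bias, which matches the 16.39M figures; the block total is taken as whatever remains of the stated total.

```python
vocab_size = 32011
hidden_dim = 512
total = 114_687_488  # stated total parameter count

# Assumption: untied embedding and LM head, each vocab_size x hidden_dim, no bias.
embeddings = vocab_size * hidden_dim
lm_head = vocab_size * hidden_dim
blocks = total - embeddings - lm_head

def pct(n):
    """Share of the total parameter count, in percent, rounded to two decimals."""
    return round(100 * n / total, 2)

for name, n in [("Embeddings", embeddings), ("DSRN Blocks", blocks), ("LM Head", lm_head)]:
    print(f"{name}\t{n:,}\t{pct(n)}%")
```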

🧩 Internal Block Structure (Per Layer)

Sub-Component	Parameters	Description
MLP (Feed-Forward)	4.20M	Upscaled hidden layers
DSRN Slow State	3.15M	Constant-time memory gates
GRU Fast State	1.58M	Recurrent fast path
Surprise Gating	264,192	Dynamic focus mechanism
Normalization	1,024	LayerNorm / RMSNorm
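
A quick sanity check compares the listed sub-components with the average per-layer budget implied by the aggregate figure. The sketch assumes all 8 layers are identical and uses the rounded values from the table, so the result is approximate.

```python
layers = 8
blocks_total = 114_687_488 - 2 * (32011 * 512)  # total minus embeddings and LM head
per_layer = blocks_total // layers

# Rounded per-layer values as listed in the table above.
listed = {
    "MLP (Feed-Forward)": 4_200_000,
    "DSRN Slow State": 3_150_000,
    "GRU Fast State": 1_580_000,
    "Surprise Gating": 264_192,
    "Normalization": 1_024,
}

listed_sum = sum(listed.values())
remainder = per_layer - listed_sum

print(f"per layer: {per_layer:,}  listed: {listed_sum:,}  not itemized: {remainder:,}")
```

Under these assumptions, roughly 1.04M parameters per layer are not itemized in the table. One possible reading, which the model card does not confirm, is that the remainder belongs to the hybrid attention projections.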