As the flwrlabs community gathered in London for their Summit, we released ethicalabs/FlowerTune-Echo-DSRN-114M-Finance-PEFT on the Flower Hub: a federated PEFT adapter for financial sentiment, built on a novel architecture called Echo-DSRN, a project I started two years ago.
The core problem we set out to solve: financial data (ledgers, earnings calls, tick streams) blows up the memory footprint of standard Transformers.
KV-Cache scaling makes federated training on the edge increasingly difficult. You cannot preserve data privacy if your decentralized nodes keep running out of memory.
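To put numbers on the KV-cache problem, here is a back-of-the-envelope estimate. The function name and the model dimensions are illustrative choices of mine (an 8-layer, 512-dim Transformer, matching Echo-DSRN's scale), not figures from the release:

```python
def kv_cache_bytes(layers: int, hidden_dim: int, seq_len: int, bytes_per_val: int = 2) -> int:
    # Keys and values each store hidden_dim values per token per layer;
    # bytes_per_val=2 assumes fp16 storage.
    return 2 * layers * hidden_dim * seq_len * bytes_per_val

# A modest Transformer at 32k context:
cache = kv_cache_bytes(8, 512, 32_000)
print(cache / 2**20)  # → 500.0 (MiB), and it grows linearly with context
```

Half a gigabyte of cache for a single long sequence, before counting weights or activations, is exactly what pushes edge nodes out of memory.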
Echo-DSRN addresses this at the architectural level. It uses a dual recurrent state design: a GRU fast path for short-range dynamics, and a surprise-gated slow memory whose write intensity is modulated by prediction error.
The result is O(1) memory regardless of context length. It runs on CPU, AMD ROCm, Apple MPS, and NVIDIA GPUs.
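A minimal NumPy sketch of the surprise-gating idea, under my own assumptions (random stand-in weights, a scalar gate from mean squared prediction error; the released implementation may differ): the slow memory is written to more aggressively when it fails to predict the incoming embedding, and its size never grows with stream length.

```python
import numpy as np

rng = np.random.default_rng(0)
dim = 512

# Hypothetical parameters (random here; learned in the real model).
W_pred = rng.standard_normal((dim, dim)) * 0.02   # slow state -> predicted input
W_write = rng.standard_normal((dim, dim)) * 0.02  # input -> candidate write

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def step(x, m_slow):
    """One surprise-gated write into the fixed-size slow memory."""
    err = x - m_slow @ W_pred            # prediction error = "surprise"
    gate = sigmoid(np.mean(err ** 2))    # scalar write intensity in (0, 1)
    # High surprise -> stronger overwrite; low surprise -> memory persists.
    m_new = (1.0 - gate) * m_slow + gate * np.tanh(x @ W_write)
    return m_new, gate

m = np.zeros(dim)
for _ in range(1_000):                   # stream of token embeddings
    x = rng.standard_normal(dim)
    m, gate = step(x, m)
# The state is still a single (512,) vector: O(1) memory for any stream length.
```

This is the property that lets the model stream indefinitely on CPU: the working state is one fixed vector, not a cache that grows per token.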
Combined with the Flower federated learning framework, financial institutions can now run local fine-tuning on proprietary data without it ever leaving their infrastructure.
Results on standard financial sentiment benchmarks:
- FPB: 70.2%
- TFNS: 70.2%
- FIQA: 63.8%
This is a 114M baseline. The next step is scaling.
The surprise gating mechanism independently converged on what Google described in their Titans paper. No working open implementation existed. This one does.
Echo-DSRN-114M: A Constant-Memory O(1) Semantic Compressor
While traditional models target general conversational reasoning, Echo-DSRN is a specialized structural prototype.
It is a dual-state recurrent neural network engineered strictly for low-latency semantic compression and continuous text streaming with a permanent O(1) memory footprint.
⚙️ Echo-DSRN (114M Parameters) manages context via continuous structural compression:
- 8 Layers | 512 Hidden Dim
- Transformer Fast State + DSRN/GRU Recurrent Slow State + Surprise Gating
Initial pre-training on a single AMD Instinct MI300X, followed by localized refinement across AMD Radeon PRO GPUs and an AMD Ryzen AI Max+ 395 (Strix Halo).
🖥️ A Hugging Face Space showcasing the architecture is currently running on the free shared CPU tier.
- The Compressor: Ingest a long document and crush it into a fixed 2048-dimensional .npy state vector.
- Vector Similarity: Upload two compressed .npy states to instantly calculate cosine similarity for ultra-lightweight RAG pre-filtering.
- The CPU Streamer: Continuous, fluent text generation running on raw CPU compute.
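The similarity pre-filter is plain cosine similarity over the saved state vectors. A sketch of the round trip (the vectors below are random stand-ins for real compressor output, and the 0.8 threshold is an arbitrary example):

```python
import io
import numpy as np

rng = np.random.default_rng(0)
state_a = rng.standard_normal(2048)                    # stand-in compressed state
state_b = state_a + 0.1 * rng.standard_normal(2048)    # a similar document's state

# Round-trip through the .npy format, as the Space does with uploaded files.
buf = io.BytesIO()
np.save(buf, state_a)
buf.seek(0)
loaded = np.load(buf)

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

sim = cosine(loaded, state_b)
keep = sim > 0.8  # RAG pre-filter: only similar candidates reach the heavy retriever
```

Because each document collapses to one 2048-d vector, the filter costs a single dot product per candidate, which is why it stays lightweight on shared CPU.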
⚠️ Disclaimer: This is a structural prototype. It has internalized formatting and conversational syntax, but it possesses zero world knowledge. It will confidently hallucinate. Use it for streaming transcription, style mimicry, and local semantic hashing, not for factual reasoning.