Proprietary Invention Package: Ternary-Quantized Transformer Optimization
Inventor: Konstantin Vladimirovich Grabko
Email: grabko@cmsmanhattan.com
Date: December 22, 2025
Overview: This package contains documentation for a novel, proprietary method enabling efficient LLM inference on AMD ROCm hardware using ternary quantization, BRE, and SWA fusion.
Contents:
- license.md
- NDA.md
- invention_description.md
- claims.md
- performance_data.md
- [Diagrams and attachments]
Confidential: All materials are proprietary. Contact the inventor for licensing discussions.
JiRack Ternary MoE 405B: Ultra-Efficient Frontier-Scale Intelligence
Introducing JiRack Ternary MoE 405B: a revolutionary 405-billion-parameter language model that fuses ternary quantization (weights constrained to {-1, 0, +1} for extreme efficiency) with a powerful Mixture of Experts (MoE) architecture, drawing on BitNet-style paradigms and pushing the boundaries of brain-like compute.
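To make the ternary idea concrete, below is a minimal sketch of absmean ternary quantization in the style of BitNet b1.58: weights are scaled by their mean absolute value and snapped to {-1, 0, +1}. The exact quantizer used in JiRack is not disclosed in this package, so treat the function below as illustrative only.

```python
import torch

def ternary_quantize(w: torch.Tensor, eps: float = 1e-8):
    """Absmean ternary quantization (BitNet b1.58-style), illustrative only.

    Returns a {-1, 0, +1} tensor plus one per-tensor scale, so the original
    weights are approximated as scale * w_ternary.
    """
    scale = w.abs().mean().clamp(min=eps)         # per-tensor scaling factor
    w_ternary = (w / scale).round().clamp(-1, 1)  # snap to {-1, 0, +1}
    return w_ternary, scale
```

Because the quantized matrix holds only -1, 0, and +1, a matrix multiply collapses into additions, subtractions, and skipped zeros, followed by a single rescale, which is where the compute and energy savings come from.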
How JiRack achieves massive scale with unmatched efficiency:
Agentic AI Packed as Experts: The JiRack Agentic AI system is seamlessly embedded into the model as a dynamic collection of highly specialized experts. The MoE design allows JiRack Ternary 405B to support far more experts than a traditional dense model of equivalent active parameters, delivering enormous capacity while dramatically reducing compute, memory, and energy demands.
Foundation in Ternary 70B Experts: The journey begins with JiRack Ternary 70B, where individual experts are trained separately in a modular, ternary-quantized format. This separable pre-training phase creates highly capable, low-precision specialist modules from the ground up.
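As an illustration of how an individual ternary expert could be pre-trained, the sketch below shows a quantization-aware linear layer that uses a straight-through estimator. JiRack's actual layer definitions and training recipe are not published here, so all names and choices in this snippet are hypothetical.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TernaryLinear(nn.Module):
    """Quantization-aware linear layer for pre-training a ternary expert.

    The forward pass uses ternarized weights; a straight-through estimator
    routes gradients to the latent full-precision weights. Hypothetical
    sketch: JiRack's real training recipe is not published.
    """
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        self.weight = nn.Parameter(torch.randn(out_features, in_features) * 0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Absmean ternarization, as in the earlier sketch.
        scale = self.weight.abs().mean().clamp(min=1e-8)
        w_q = scale * (self.weight / scale).round().clamp(-1, 1)
        # Straight-through estimator: quantized weights in the forward pass,
        # gradients flow back as if the full-precision weights had been used.
        w_ste = self.weight + (w_q - self.weight).detach()
        return F.linear(x, w_ste)
```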
Expert Router Training: Once the experts are ready, we train a dedicated expert router (gating network) to intelligently dispatch each incoming request (token or query) to the most relevant experts. This dynamic routing ensures optimal specialization, load balancing, and efficiency, activating only a small subset of the total capacity per inference step.
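A minimal top-k gating network captures the idea: score every expert for each token, keep only the top k, and renormalize their weights. The real router's architecture, expert count, and value of k are assumptions in this sketch.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class TopKRouter(nn.Module):
    """Minimal top-k gating network: pick k experts per token.

    Illustrative only; JiRack's actual router architecture, expert count,
    and value of k are not specified in this document.
    """
    def __init__(self, d_model: int, num_experts: int, k: int = 2):
        super().__init__()
        self.gate = nn.Linear(d_model, num_experts, bias=False)
        self.k = k

    def forward(self, x: torch.Tensor):
        # x: (batch, seq, d_model) -> scores over all experts per token
        logits = self.gate(x)
        topk_logits, topk_idx = logits.topk(self.k, dim=-1)
        # Renormalize only over the selected experts; the rest stay inactive,
        # so per-token compute scales with k rather than the full expert count.
        weights = F.softmax(topk_logits, dim=-1)
        return topk_idx, weights
```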
The result? A hybrid architecture that mimics biological neural efficiency (ternary weights enable ultra-sparse, low-energy signaling) while unlocking frontier-level performance through smart, adaptive expert selection. JiRack Ternary MoE 405B isn't merely larger; it's engineered to think smarter, run leaner, and scale further than conventional dense or even standard MoE designs.
Key advantages at a glance:
~70-90% reduction in energy & memory vs. FP16 equivalents (see the back-of-envelope estimate after this list)
Massive effective parameter count via many lightweight ternary experts
Agentic behavior baked in through specialized, routable modules
Designed for real-world deployment on constrained hardware
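A rough, weight-only estimate shows where the headline savings come from, assuming ternary weights are packed at 2 bits each versus 16 bits for FP16 (activations, KV cache, and scale factors are ignored, so real deployments will differ):

```python
# Weight-only memory estimate: 2-bit packed ternary vs. FP16 (assumption;
# ignores activations, KV cache, and per-tensor scales).
params = 405e9
fp16_gb = params * 16 / 8 / 1e9      # ~810 GB of weights in FP16
ternary_gb = params * 2 / 8 / 1e9    # ~101 GB packed at 2 bits/weight
print(f"FP16: {fp16_gb:.0f} GB, ternary: {ternary_gb:.0f} GB "
      f"({1 - ternary_gb / fp16_gb:.1%} smaller)")
```

That works out to roughly 87% less weight memory; the ~70-90% range quoted above allows for the overheads this estimate ignores.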
JiRack is redefining what's possible at 405B scale: efficient, intelligent, and truly agentic by design.