Akicou/Qwen3-30B-A3B-Instruct-REAMINI
This repository contains a REAM'd (Router Expert Activation Pruning via Merging) version of Qwen3-30B-A3B-Instruct, a model in the Qwen3-MoE series. It is an experimental implementation of the REAM methodology, which is designed to optimize Mixture-of-Experts (MoE) architectures by reducing parameter redundancy without the catastrophic forgetting typically associated with standard pruning.
Methodology: REAM
Router Expert Activation Pruning via Merging (REAM) (originally proposed by bknyaz) is a compression technique for MoE models. Instead of simply dropping less-active experts, REAM:
- Analyzes router activation patterns to identify redundant or under-utilized experts.
- Merges these experts into high-priority "critical" experts.
- Preserves the model's knowledge base while significantly reducing the total parameter footprint.
- Implementation Repo: Akicou/ream
- Release Stage: Smoke-Test (Verifying compute graph integrity and weight coherence).
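The analyze-then-merge steps listed above can be illustrated with a toy sketch. This is not the code from Akicou/ream and not necessarily the algorithm proposed by bknyaz: the cosine-similarity grouping, the usage-weighted averaging, and the names `cosine` and `merge_experts` are all assumptions made for illustration, and experts are represented as flat Python lists rather than real tensors.

```python
# Toy sketch of usage-weighted expert merging. NOT the actual REAM algorithm:
# the similarity measure, grouping rule, and weighting are assumptions.
from math import sqrt


def cosine(a, b):
    """Cosine similarity between two flat weight vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def merge_experts(weights, usage, keep):
    """Merge the least-used experts into the `keep` most-used ones.

    weights: one flat weight vector per expert (hypothetical layout)
    usage:   router activation count per expert, e.g. from calibration data
    keep:    number of experts to retain
    Returns (merged_weights, kept_indices).
    """
    order = sorted(range(len(weights)), key=lambda i: usage[i], reverse=True)
    kept, dropped = order[:keep], order[keep:]

    # Each kept expert starts a group containing only itself.
    groups = {k: [k] for k in kept}
    for d in dropped:
        # Assign the dropped expert to the most similar kept expert.
        target = max(kept, key=lambda k: cosine(weights[d], weights[k]))
        groups[target].append(d)

    merged = []
    for k in kept:
        members = groups[k]
        total = sum(usage[m] for m in members)
        # Usage-weighted average; fall back to uniform if nothing was routed.
        coeffs = [usage[m] / total if total else 1 / len(members) for m in members]
        merged.append([
            sum(c * weights[m][j] for c, m in zip(coeffs, members))
            for j in range(len(weights[k]))
        ])
    return merged, kept


# Four toy experts reduced to two.
weights = [[1.0, 0.0], [0.0, 1.0], [0.9, 0.1], [0.1, 0.9]]
usage = [60, 30, 20, 10]
merged, kept = merge_experts(weights, usage, keep=2)
print(kept, merged)
```

In this toy run the two low-usage experts are folded into the high-usage expert they most resemble, so the expert count drops while each retained expert keeps a weighted trace of what was merged into it.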
Model Statistics
| Metric | Original (Qwen3-30B-A3B) | REAMINI (Post-Merge) | Change |
|---|---|---|---|
| Total Parameters | 30B | 23B | -23.3% |
| Active Parameters | 3B | 3B | 0% |
| Memory Footprint | ~60GB (BF16) | ~46GB (BF16) | -14GB |
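The figures in the table follow from simple arithmetic, sketched below. It assumes the rounded parameter counts from the table (30B and 23B), 2 bytes per parameter for BF16, and GB meaning 10^9 bytes; it covers weights only, not activations or KV cache. The helper name `bf16_footprint_gb` is hypothetical.

```python
BYTES_PER_PARAM_BF16 = 2  # bfloat16 stores each parameter in 16 bits


def bf16_footprint_gb(num_params):
    """Approximate weight memory in GB (10^9 bytes), weights only."""
    return num_params * BYTES_PER_PARAM_BF16 / 1e9


original, merged = 30e9, 23e9  # rounded totals from the table
change_pct = (merged - original) / original * 100

print(f"original: ~{bf16_footprint_gb(original):.0f} GB")
print(f"merged:   ~{bf16_footprint_gb(merged):.0f} GB")
print(f"parameter change: {change_pct:.1f}%")
```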
Hardware & Environment
The compression and merging process was conducted using the following infrastructure:
- Compute Provider: Prime Intellect
- GPU: 1 x NVIDIA A100 (80GB)
- Framework: PyTorch / [Akicou-REAM-Lib]
Why REAMINI?
While the original author (Boris/bknyaz) has not yet released a public implementation, the potential for MoE optimization via merging is too significant to ignore. This release serves as a functional proof-of-concept to demonstrate that merging experts based on router statistics can maintain a coherent compute graph without immediate retraining.
Note: This is a "Smoke-Test" release. While the model is functional, it has not been evaluated on standard benchmarks.
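As a rough illustration of what a structural smoke test might check after merging, the sketch below verifies that the router and expert counts agree and that no weight is NaN or Inf. These checks are an assumed reading of "compute graph integrity and weight coherence", not the checks actually run for this release; the function name `smoke_check` and the list-based layout are hypothetical.

```python
from math import isfinite


def smoke_check(router, experts, top_k):
    """Return a list of problems found; an empty list means the checks passed.

    router:  one row of routing weights per expert (hypothetical layout)
    experts: one flat weight vector per expert
    top_k:   number of experts activated per token
    """
    problems = []
    # After merging, the router must have exactly one row per remaining expert.
    if len(router) != len(experts):
        problems.append(
            f"router has {len(router)} rows but there are {len(experts)} experts"
        )
    # The router cannot select more experts than exist.
    if top_k > len(experts):
        problems.append(f"top_k={top_k} exceeds expert count {len(experts)}")
    # Every weight must be a finite number.
    for name, rows in (("router", router), ("expert", experts)):
        for i, row in enumerate(rows):
            if not all(isfinite(v) for v in row):
                problems.append(f"{name} {i} contains NaN or Inf")
    return problems


print(smoke_check([[0.1], [0.2]], [[1.0, 2.0], [3.0, 4.0]], top_k=1))
```

Passing checks like these says nothing about output quality; they only rule out the most basic structural breakage.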
Output Verification (Smoke Test)
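The following sample generation shows that the merged model still produces fluent, structured text on a technical prompt after the transformation. The output is reproduced as generated and has not been checked for mathematical correctness.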
User: What is the Heyting Algebra thingamajig?
Model Output:
A **Heyting algebra** (also called a **bounded Heyting algebra**) is a structure in mathematical logic and order theory that generalizes the notion of a Boolean algebra to capture the essence of intuitionistic logic. It is named after the Dutch mathematician Arend Heyting, who introduced it in the 1930s.
### Definition
A **Heyting algebra** is a bounded distributive lattice $ (H, \vee, \wedge, 0, 1) $ with an additional operation denoted by $ \rightarrow $ such that:
- $ (x \vee y) = (x \wedge y) $ (distributivity)
- There exists a unary operation $ x \mapsto x \rightarrow y $ such that for all $ x, y \in H $:
- $ x \leq y \iff x \rightarrow y = 1 $
- $ x \rightarrow y \leq x \rightarrow y = 1 $
This last condition can also be expressed as:
For all elements $ x, y \in H $, we have:
### Intuitionistic Logic
The key idea behind Heyting algebras is their connection to **intuitionistic propositional logic**, where the implication operation corresponds to the logical connective $ \rightarrow $. In classical logic, this would be equivalent to a conditional statement $ A \rightarrow B $, but in intuitionistic logic, it's more subtle because not every proposition can be proved or disproved -- instead, one must show that assuming $ A $, we get $ B $, so we define $ A \rightarrow B $ to mean that if $ A $ holds then $ B $ holds, which is exactly what the implication operation does.
In particular, the operation $ \rightarrow $ satisfies certain identities like:
and it must satisfy the identity:
Also, there are two special elements in any Heyting algebra:
- $ 0 $: The bottom element (or zero element)
- $ 1 $: The top element (or one element)
These elements correspond to false and true respectively under the interpretation of propositions in intuitionistic logic.
### Example
Let's consider a simple example [... cut off due to max token limit]
Citation & Credits
Credit to the original Qwen team:
@misc{qwen3technicalreport,
title={Qwen3 Technical Report},
author={Qwen Team},
year={2025},
eprint={2505.09388},
archivePrefix={arXiv}
}