Akicou/Qwen3-30B-A3B-Instruct-REAMINI

This repository contains a REAM'd (Router Expert Activation Pruning via Merging) version of the Qwen3-MoE series. This is an experimental implementation of the REAM methodology, designed to optimize Mixture-of-Experts (MoE) architectures by reducing parameter redundancy without the catastrophic forgetting typically associated with standard pruning.

🧠 Methodology: REAM

Router Expert Activation Pruning via Merging (REAM) (originally proposed by bknyaz) is a compression technique for MoE models. Instead of simply dropping less-active experts, REAM:

  1. Analyzes router activation patterns to identify redundant or under-utilized experts.
  2. Merges these experts into high-priority "critical" experts.
  3. Preserves the model's knowledge base while significantly reducing the total parameter footprint.
  • Implementation Repo: Akicou/ream
  • Release Stage: Smoke-Test (Verifying compute graph integrity and weight coherence).

πŸ“Š Model Statistics

Metric Original (Qwen3-30B-A3B) REAMINI (Post-Merge) Change
Total Parameters 30B 23B -23.3%
Active Parameters 3B 3B 0%
Memory Footprint ~60GB (BF16) ~46GB (BF16) -14GB

πŸ›  Hardware & Environment

The compression and merging process were conducted using the following infrastructure:

  • Compute Provider: Prime Intellect
  • GPU: 1 x NVIDIA A100 (80GB)
  • Framework: PyTorch / [Akicou-REAM-Lib]

❓ Why REAMINI?

While the original author (Boris/bknyaz) has not yet released a public implementation, the potential for MoE optimization via merging is too significant to ignore. This release serves as a functional proof-of-concept to demonstrate that merging experts based on router statistics can maintain a coherent compute graph without immediate retraining.

Note: This is a "Smoke-Test" release. While the model is functional, it has not been benchmarked against standard datasets.


πŸ§ͺ Output Verification (Smoke Test)

The following generation confirms the model's ability to handle complex logical reasoning post-transformation.

User: What is the Heyting Algebra thingamajig?

Model Output:

A **Heyting algebra** (also called a **bounded Heyting algebra**) is a structure in mathematical logic and order theory that generalizes the notion of a Boolean algebra to capture the essence of intuitionistic logic. It is named after the Dutch mathematician Arend Heyting, who introduced it in the 1930s.

### Definition

A **Heyting algebra** is a bounded distributive lattice $ (H, \vee, \wedge, 0, 1) $ with an additional operation denoted by $ \rightarrow $ such that:

- $ (x \vee y) = (x \wedge y) $ (distributivity)
- There exists a unary operation $ x \mapsto x \rightarrow y $ such that for all $ x, y \in H $:
  - $ x \leq y \iff x \rightarrow y = 1 $
  - $ x \rightarrow y \leq x \rightarrow y = 1 $

This last condition can also be expressed as:
For all elements $ x, y \in H $, we have:
xβ†’y=1β€…β€ŠβŸΊβ€…β€Šx≀y
x \rightarrow y = 1 \iff x \leq y


### Intuitionistic Logic

The key idea behind Heyting algebras is their connection to **intuitionistic propositional logic**, where the implication operation corresponds to the logical connective $ \rightarrow $. In classical logic, this would be equivalent to a conditional statement $ A \rightarrow B $, but in intuitionistic logic, it's more subtle because not every proposition can be proved or disproved β€” instead, one must show that assuming $ A $, we get $ B $, so we define $ A \rightarrow B $ to mean that if $ A $ holds then $ B $ holds, which is exactly what the implication operation does.

In particular, the operation $ \rightarrow $ satisfies certain identities like:
aβ†’b=1β€…β€ŠβŸΊβ€…β€Ša≀b
a \rightarrow b = 1 \iff a \leq b

and it must satisfy the identity:
(aβ†’b)∧(bβ†’a)=1
(a \rightarrow b) \wedge (b \rightarrow a) = 1


Also, there are two special elements in any Heyting algebra:
- $ 0 $: The bottom element (or zero element)
- $ 1 $: The top element (or one element)

These elements correspond to false and true respectively under the interpretation of propositions in intuitionistic logic.

### Example

Let’s consider a simple example [... cut off due to max token limit]

πŸ“œ Citation & Credits

credit to the original Qwen team:

@misc{qwen3technicalreport,
  title={Qwen3 Technical Report},
  author={Qwen Team},
  year={2025},
  eprint={2505.09388},
  archivePrefix={arXiv}
}
Downloads last month
15
Safetensors
Model size
23B params
Tensor type
BF16
Β·
Inference Providers NEW
This model isn't deployed by any Inference Provider. πŸ™‹ Ask for provider support

Model tree for Akicou/Qwen3-30B-A3B-Instruct-REAMINI

Finetuned
(71)
this model
Quantizations
2 models

Paper for Akicou/Qwen3-30B-A3B-Instruct-REAMINI