Space: ianshank/MangoMAS

ianshank committed on Feb 20

Commit ca65210 (verified) · 1 parent: cd6510c

Update model_card.md

Files changed (1)

model_card.md +112 -112

@@ -1,112 +1,112 @@
----
-language: en
-license: mit
-library_name: pytorch
-tags:
-  - mixture-of-experts
-  - multi-agent
-  - neural-routing
-  - cognitive-architecture
-  - reinforcement-learning
-pipeline_tag: text-classification
----
-# MangoMAS-MoE-7M
-A ~7 million parameter **Mixture-of-Experts** (MoE) neural routing model for multi-agent task orchestration.
-## Model Architecture
-```
-Input (64-dim feature vector from featurize64())
-         │
-    ┌─────┴─────┐
-    │   GATE    │  Linear(64→512) → ReLU → Linear(512→16) → Softmax
-    └─────┬─────┘
-          │
-    ╔═══════════════════════════════════════════════════╗
-    ║     16 Expert Towers (parallel)                    ║
-    ║  Each: Linear(64→512) → ReLU → Linear(512→512)   ║
-    ║        → ReLU → Linear(512→256)                    ║
-    ╚═══════════════════════════════════════════════════╝
-          │
-    Weighted Sum (gate_weights × expert_outputs)
-          │
-    Classifier Head: Linear(256→N_classes)
-          │
-       Output Logits
-```
-### Parameter Count
-| Component | Parameters |
-|-----------|-----------|
-| Gate Network | 64×512 + 512 + 512×16 + 16 = ~41K |
-| 16 Expert Towers | 16 × (64×512 + 512 + 512×512 + 512 + 512×256 + 256) = ~6.9M |
-| Classifier Head | 256×10 + 10 = ~2.6K |
-| **Total** | **~6.95M** |
-## Input: 64-Dimensional Feature Vector
-The model consumes a 64-dimensional feature vector produced by `featurize64()`:
-- **Dims 0-31**: Hash-based sinusoidal encoding (content fingerprint)
-- **Dims 32-47**: Domain tag detection (code, security, architecture, etc.)
-- **Dims 48-55**: Structural signals (length, punctuation, questions)
-- **Dims 56-59**: Sentiment polarity estimates
-- **Dims 60-63**: Novelty/complexity scores
-## Training
-- **Optimizer**: AdamW (lr=1e-4, weight_decay=0.01)
-- **Updates**: Online learning from routing feedback
-- **Minimum reward threshold**: 0.1
-- **Device**: CPU / MPS / CUDA (auto-detected)
-## Usage
-```python
-import torch
-from moe_model import MixtureOfExperts7M, featurize64
-# Create model
-model = MixtureOfExperts7M(num_classes=10, num_experts=16)
-# Extract features
-features = featurize64("Design a secure REST API with authentication")
-x = torch.tensor([features], dtype=torch.float32)
-# Forward pass
-logits, gate_weights = model(x)
-print(f"Expert weights: {gate_weights}")
-print(f"Top expert: {gate_weights.argmax().item()}")
-```
-## Intended Use
-This model is part of the **MangoMAS** multi-agent orchestration platform. It routes incoming tasks to the most appropriate expert agents based on the task's semantic content.
-**Primary use cases:**
-- Multi-agent task routing
-- Expert selection for cognitive cell orchestration
-- Research demonstration of MoE architectures
-## Interactive Demo
-Try the model live on the [MangoMAS HuggingFace Space](https://huggingface.co/spaces/ianshank/MangoMAS).
-## Citation
-```bibtex
-@software{mangomas2026,
-  title={MangoMAS: Multi-Agent Cognitive Architecture},
-  author={Shanker, Ian},
-  year={2026},
-  url={https://github.com/ianshank/MangoMAS}
-}
-```
-## Author
-Built by [Ian Shanker](https://huggingface.co/ianshank) — MangoMAS Engineering

+---
+language: en
+license: mit
+library_name: pytorch
+tags:
+  - mixture-of-experts
+  - multi-agent
+  - neural-routing
+  - cognitive-architecture
+  - reinforcement-learning
+pipeline_tag: text-classification
+---
+# MangoMAS-MoE-7M
+A ~7 million parameter **Mixture-of-Experts** (MoE) neural routing model for multi-agent task orchestration.
+## Model Architecture
+```
+Input (64-dim feature vector from featurize64())
+         │
+    ┌─────┴─────┐
+    │   GATE    │  Linear(64→512) → ReLU → Linear(512→16) → Softmax
+    └─────┬─────┘
+          │
+    ╔═══════════════════════════════════════════════════╗
+    ║     16 Expert Towers (parallel)                    ║
+    ║  Each: Linear(64→512) → ReLU → Linear(512→512)   ║
+    ║        → ReLU → Linear(512→256)                    ║
+    ╚═══════════════════════════════════════════════════╝
+          │
+    Weighted Sum (gate_weights × expert_outputs)
+          │
+    Classifier Head: Linear(256→N_classes)
+          │
+       Output Logits
+```
+### Parameter Count
+| Component | Parameters |
+|-----------|-----------|
+| Gate Network | 64×512 + 512 + 512×16 + 16 = ~41K |
+| 16 Expert Towers | 16 × (64×512 + 512 + 512×512 + 512 + 512×256 + 256) = ~6.9M |
+| Classifier Head | 256×10 + 10 = ~2.6K |
+| **Total** | **~6.95M** |
+## Input: 64-Dimensional Feature Vector
+The model consumes a 64-dimensional feature vector produced by `featurize64()`:
+- **Dims 0-31**: Hash-based sinusoidal encoding (content fingerprint)
+- **Dims 32-47**: Domain tag detection (code, security, architecture, etc.)
+- **Dims 48-55**: Structural signals (length, punctuation, questions)
+- **Dims 56-59**: Sentiment polarity estimates
+- **Dims 60-63**: Novelty/complexity scores
+## Training
+- **Optimizer**: AdamW (lr=1e-4, weight_decay=0.01)
+- **Updates**: Online learning from routing feedback
+- **Minimum reward threshold**: 0.1
+- **Device**: CPU / MPS / CUDA (auto-detected)
+## Usage
+```python
+import torch
+from moe_model import MixtureOfExperts7M, featurize64
+# Create model
+model = MixtureOfExperts7M(num_classes=10, num_experts=16)
+# Extract features
+features = featurize64("Design a secure REST API with authentication")
+x = torch.tensor([features], dtype=torch.float32)
+# Forward pass
+logits, gate_weights = model(x)
+print(f"Expert weights: {gate_weights}")
+print(f"Top expert: {gate_weights.argmax().item()}")
+```
+## Intended Use
+This model is part of the **MangoMAS** multi-agent orchestration platform. It routes incoming tasks to the most appropriate expert agents based on the task's semantic content.
+**Primary use cases:**
+- Multi-agent task routing
+- Expert selection for cognitive cell orchestration
+- Research demonstration of MoE architectures
+## Interactive Demo
+Try the model live on the [MangoMAS HuggingFace Space](https://huggingface.co/spaces/ianshank/MangoMAS).
+## Citation
+```bibtex
+@software{mangomas2026,
+  title={MangoMAS: Multi-Agent Cognitive Architecture},
+  author={Cruickshank, Ian},
+  year={2026},
+  url={https://github.com/ianshank/MangoMAS}
+}
+```
+## Author
+Built by [Ian Cruickshank](https://huggingface.co/ianshank) — MangoMAS Engineering
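
Review note on the Parameter Count table: the per-component formulas can be checked with plain arithmetic. The sketch below is not part of the commit. It assumes every `Linear` layer carries a bias term, as the table's formulas imply, and uses `N_classes = 10` as in the classifier row. The helper name `linear_params` is hypothetical. Evaluated this way, the expert towers come to about 6.84M and the total to about 6.88M, slightly below the rounded ~6.9M and ~6.95M figures in the table.

```python
def linear_params(n_in: int, n_out: int) -> int:
    """Parameter count of a fully connected layer: weights plus biases."""
    return n_in * n_out + n_out


NUM_EXPERTS = 16
NUM_CLASSES = 10  # value used in the table's classifier row

# Gate: Linear(64->512) -> ReLU -> Linear(512->16) -> Softmax
gate = linear_params(64, 512) + linear_params(512, 16)

# One expert tower: Linear(64->512) -> Linear(512->512) -> Linear(512->256)
one_expert = (
    linear_params(64, 512)
    + linear_params(512, 512)
    + linear_params(512, 256)
)
experts = NUM_EXPERTS * one_expert

# Classifier head: Linear(256->N_classes)
head = linear_params(256, NUM_CLASSES)

total = gate + experts + head

print(f"gate:    {gate:,}")
print(f"experts: {experts:,}")
print(f"head:    {head:,}")
print(f"total:   {total:,}")
```

ReLU and Softmax contribute no parameters, so only the `Linear` layers are counted.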