meridianal committed
Commit 77cfc79 · verified · 1 Parent(s): c01a1f8

Add proper model card

Files changed (1)
  1. README.md +164 -0
README.md ADDED
@@ -0,0 +1,164 @@
---
license: mit
language:
- en
tags:
- finance
- text-generation
- mixture-of-experts
- continual-learning
- financial-nlp
- custom-architecture
library_name: transformers
pipeline_tag: text-generation
---

# Meridian.AI – Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including free GitHub Actions runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (with tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |

---

## Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

- **Sparse MoE FFN** – 8 experts per MoE layer, top-2 routing. Only 2 of 8 experts activate per token, keeping compute low while retaining capacity. MoE layers alternate every 2nd transformer layer.
- **Grouped Query Attention (GQA)** – 12 query heads, 4 key/value heads. Reduces memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** – `rope_theta=500,000` for length generalisation.
- **SwiGLU FFN** – the activation function used in dense layers and expert FFNs.
- **RMSNorm** – replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** – a learned 64-dim embedding for numeric tokens to improve precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** – prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** – input embeddings and `lm_head` share weights, saving ~197M parameters.

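To make the top-2 routing concrete, here is a minimal self-contained sketch. The dimensions and class name are illustrative, not the model's actual code, and the experts are plain SiLU MLPs rather than the SwiGLU FFNs used in Meridian.AI:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Illustrative sparse MoE FFN: 8 experts, top-2 routing (hypothetical sizes)."""
    def __init__(self, d_model=64, d_ff=128, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalise over the 2 chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):             # only the selected experts run
            for e in idx[:, k].unique().tolist():
                mask = idx[:, k] == e
                out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

moe = Top2MoE()
y = moe(torch.randn(5, 64))
```

Each token pays the compute cost of 2 expert FFNs while the layer retains the capacity of all 8, which is what keeps CPU inference feasible.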
---

## How to Use

> The model weights are stored under the `checkpoint/` subfolder in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

### Prompt format

All training examples use this instruction/response format:

```
### Instruction:
<your question or task>

### Response:
<answer>
```

Classification tasks are also formatted this way with a short label-only response.

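The template can be wrapped in a small helper (a sketch; `build_prompt` is our own name, not a function shipped with the repo):

```python
def build_prompt(instruction: str) -> str:
    """Wrap a question or task in the model's instruction/response template."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("What does a high price-to-earnings ratio indicate about a stock?")
```

Keeping the trailing `### Response:\n` intact matters: the model was trained to start its answer immediately after that marker.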
### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |

If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.

---

## Training Data

Training streams finance datasets from the FinanceMTEB family:

- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15 MB-per-source cap to stay within GitHub Actions memory limits.

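A per-source byte cap on a streamed dataset could be enforced along these lines. This is a sketch of the idea only: `take_until_cap` and the UTF-8 size accounting are our assumptions, not the pipeline's actual code (in practice the iterable would come from `datasets.load_dataset(..., streaming=True)`):

```python
def take_until_cap(records, max_bytes=15 * 1024 * 1024):
    """Yield streamed records until their cumulative UTF-8 size would exceed the cap."""
    used = 0
    for rec in records:
        size = len(rec["text"].encode("utf-8"))
        if used + size > max_bytes:
            break          # stop before crossing the per-source budget
        used += size
        yield rec

# Tiny demonstration with a fake stream of 100-byte records and a 350-byte cap:
fake_stream = ({"text": "x" * 100} for _ in range(10))
sample = list(take_until_cap(fake_stream, max_bytes=350))
```

Because the source is never fully materialised, peak memory stays bounded regardless of how large the upstream dataset is.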
---

## Continual Learning

The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:

- **EWC regularisation** – a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** – training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** – Adafactor optimizer state is discarded before upload to keep checkpoint size small.
- **Auto-recovery** – each run pulls the latest checkpoint from this repo before training, resuming from where the last run left off.

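The EWC regulariser has the standard form: a quadratic penalty, weighted per parameter by the Fisher information, pulling weights back toward the values learned on earlier data. A minimal sketch of that term (function and variable names are ours, not the repo's):

```python
import torch

def ewc_penalty(params, old_params, fisher, lam=1.0):
    """EWC term: (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2."""
    loss = torch.tensor(0.0)
    for name, p in params.items():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# Toy check: only the second weight is protected (F = [0, 1]) and it drifted by 2,
# so the penalty is 0.5 * 1 * (0 + 1 * 4) = 2.0.
theta = {"w": torch.tensor([1.0, 2.0])}
theta_star = {"w": torch.tensor([1.0, 0.0])}
fish = {"w": torch.tensor([0.0, 1.0])}
penalty = ewc_penalty(theta, theta_star, fish)
```

During training this term would simply be added to the language-modelling loss, so weights the Fisher matrix marks as important resist being overwritten by new data.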
---

## Limitations

- Experimental model – outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but is not guaranteed to be accurate.

---

## Source Code

Training pipeline, architecture, and CI workflows:
[github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)