meridianal committed
Commit 77cfc79 · verified · 1 Parent(s): c01a1f8

Add proper model card

Files changed (1)
  1. README.md +164 -0
README.md ADDED
@@ -0,0 +1,164 @@
---
license: mit
language:
- en
tags:
- finance
- text-generation
- mixture-of-experts
- continual-learning
- financial-nlp
- custom-architecture
library_name: transformers
pipeline_tag: text-generation
---

# Meridian.AI – Finance Language Model

Meridian.AI is a custom sparse Mixture-of-Experts (MoE) language model continually trained on finance data. It is designed to run on commodity CPU hardware (including free GitHub Actions runners) and improves automatically via scheduled training runs.

> **Not financial advice.** This is an experimental research model.

---

## Model Details

| Property | Value |
|---|---|
| Architecture | Custom SMoE + GQA + RoPE + SwiGLU + Numeracy Encoding |
| Total parameters | ~479M (with tied embeddings) |
| Unique parameters | ~283M |
| Experts | 8 total, top-2 active per token |
| Tokenizer | `Qwen/Qwen2.5-0.5B` (151k vocab) |
| Context length | 2048 tokens |
| Training method | Continual learning with EWC (Elastic Weight Consolidation) |
| License | MIT |

---

## Architecture

Meridian.AI is a fully custom transformer built from scratch with the following components:

- **Sparse MoE FFN** – 8 experts per MoE layer, top-2 routing. Only 2 of 8 experts activate per token, keeping compute low while retaining capacity. MoE layers alternate every 2nd transformer layer.
- **Grouped Query Attention (GQA)** – 12 query heads, 4 key/value heads. Reduces memory bandwidth during inference.
- **Rotary Position Embeddings (RoPE)** – `rope_theta=500,000` for length generalisation.
- **SwiGLU FFN** – the activation function used in dense layers and expert FFNs.
- **RMSNorm** – replaces LayerNorm for faster normalisation.
- **Financial Numeracy Encoding** – a learned 64-dim embedding for numeric tokens to improve precision on quantitative finance tasks.
- **Elastic Weight Consolidation (EWC)** – prevents catastrophic forgetting across continual training runs.
- **Tied word embeddings** – input embeddings and `lm_head` share weights, saving ~197M parameters.

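To make the top-2 routing concrete, here is a minimal self-contained sketch. The dimensions and class name are illustrative, not the model's actual code, and the experts are plain SiLU MLPs rather than the SwiGLU FFNs used in Meridian.AI:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class Top2MoE(nn.Module):
    """Illustrative sparse MoE FFN: 8 experts, top-2 routing (hypothetical sizes)."""
    def __init__(self, d_model=64, d_ff=128, n_experts=8, top_k=2):
        super().__init__()
        self.router = nn.Linear(d_model, n_experts, bias=False)
        self.experts = nn.ModuleList(
            nn.Sequential(nn.Linear(d_model, d_ff), nn.SiLU(), nn.Linear(d_ff, d_model))
            for _ in range(n_experts)
        )
        self.top_k = top_k

    def forward(self, x):                       # x: (tokens, d_model)
        logits = self.router(x)                 # (tokens, n_experts)
        weights, idx = logits.topk(self.top_k, dim=-1)
        weights = F.softmax(weights, dim=-1)    # renormalise over the 2 chosen experts
        out = torch.zeros_like(x)
        for k in range(self.top_k):             # only the selected experts run
            for e in idx[:, k].unique().tolist():
                mask = idx[:, k] == e
                out[mask] += weights[mask, k:k+1] * self.experts[e](x[mask])
        return out

moe = Top2MoE()
y = moe(torch.randn(5, 64))
```

Each token pays the compute cost of 2 expert FFNs while the layer retains the capacity of all 8, which is what keeps CPU inference feasible.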
---

## How to Use

> The model weights are stored under the `checkpoint/` subfolder in this repo.

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "meridianal/FinAI"

tokenizer = AutoTokenizer.from_pretrained(repo_id, subfolder="checkpoint")
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    subfolder="checkpoint",
    trust_remote_code=True,
    torch_dtype=torch.float32,
    low_cpu_mem_usage=True,
)
model.eval()

prompt = """### Instruction:
What does a high price-to-earnings ratio indicate about a stock?

### Response:
"""

inputs = tokenizer(prompt, return_tensors="pt")
with torch.no_grad():
    out = model.generate(
        **inputs,
        max_new_tokens=200,
        do_sample=True,
        temperature=0.8,
        top_p=0.92,
        repetition_penalty=1.3,
        no_repeat_ngram_size=3,
        pad_token_id=tokenizer.pad_token_id,
        eos_token_id=tokenizer.eos_token_id,
    )

print(tokenizer.decode(out[0], skip_special_tokens=True))
```

### Prompt format

All training examples use this instruction/response format:

```
### Instruction:
<your question or task>

### Response:
<answer>
```

Classification tasks are also formatted this way with a short label-only response.

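The template can be wrapped in a small helper (a sketch; `build_prompt` is our own name, not a function shipped with the repo):

```python
def build_prompt(instruction: str) -> str:
    """Wrap a question or task in the model's instruction/response template."""
    return f"### Instruction:\n{instruction}\n\n### Response:\n"

prompt = build_prompt("What does a high price-to-earnings ratio indicate about a stock?")
```

Keeping the trailing `### Response:\n` intact matters: the model was trained to start its answer immediately after that marker.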
### Generation tips

Continual training can introduce mild repetition. Recommended settings:

| Parameter | Range |
|---|---|
| `temperature` | 0.7 – 0.95 |
| `top_p` | 0.85 – 0.95 |
| `repetition_penalty` | 1.2 – 1.4 |
| `no_repeat_ngram_size` | 3 |

If you see repeated phrases, increase `repetition_penalty` and lower `temperature`.

---

## Training Data

Training streams finance datasets from the FinanceMTEB family:

- Financial sentiment analysis (FinancialPhraseBank, etc.)
- ESG and sustainability classification
- FOMC statement analysis
- Fraud and financial complaint datasets
- Financial QA pairs
- Earnings call and filing excerpts

Datasets are loaded in streaming mode with a 15 MB-per-source cap to stay within GitHub Actions memory limits.

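A per-source byte cap on a streamed dataset could be enforced along these lines. This is a sketch of the idea only: `take_until_cap` and the UTF-8 size accounting are our assumptions, not the pipeline's actual code (in practice the iterable would come from `datasets.load_dataset(..., streaming=True)`):

```python
def take_until_cap(records, max_bytes=15 * 1024 * 1024):
    """Yield streamed records until their cumulative UTF-8 size would exceed the cap."""
    used = 0
    for rec in records:
        size = len(rec["text"].encode("utf-8"))
        if used + size > max_bytes:
            break          # stop before crossing the per-source budget
        used += size
        yield rec

# Tiny demonstration with a fake stream of 100-byte records and a 350-byte cap:
fake_stream = ({"text": "x" * 100} for _ in range(10))
sample = list(take_until_cap(fake_stream, max_bytes=350))
```

Because the source is never fully materialised, peak memory stays bounded regardless of how large the upstream dataset is.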
---

## Continual Learning

The model trains automatically via GitHub Actions on a scheduled hourly cron. Key features:

- **EWC regularisation** – a Fisher information matrix computed from recent data protects previously learned weights from being overwritten.
- **RAM-safe checkpointing** – training halts and saves before hitting memory limits (`MAX_RAM_GB=13`).
- **Optimizer-free saves** – Adafactor optimizer state is discarded before upload to keep checkpoint size small.
- **Auto-recovery** – each run pulls the latest checkpoint from this repo before training, resuming from where the last run left off.

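The EWC regulariser has the standard form: a quadratic penalty, weighted per parameter by the Fisher information, pulling weights back toward the values learned on earlier data. A minimal sketch of that term (function and variable names are ours, not the repo's):

```python
import torch

def ewc_penalty(params, old_params, fisher, lam=1.0):
    """EWC term: (lam / 2) * sum_i F_i * (theta_i - theta_star_i) ** 2."""
    loss = torch.tensor(0.0)
    for name, p in params.items():
        loss = loss + (fisher[name] * (p - old_params[name]) ** 2).sum()
    return 0.5 * lam * loss

# Toy check: only the second weight is protected (F = [0, 1]) and it drifted by 2,
# so the penalty is 0.5 * 1 * (0 + 1 * 4) = 2.0.
theta = {"w": torch.tensor([1.0, 2.0])}
theta_star = {"w": torch.tensor([1.0, 0.0])}
fish = {"w": torch.tensor([0.0, 1.0])}
penalty = ewc_penalty(theta, theta_star, fish)
```

During training this term would simply be added to the language-modelling loss, so weights the Fisher matrix marks as important resist being overwritten by new data.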
---

## Limitations

- Experimental model – outputs may be incorrect, hallucinated, or outdated.
- Not intended for production financial applications.
- Continual training without human evaluation gates means quality can regress between runs.
- Numeric reasoning is improved by the numeracy encoder but is not guaranteed to be accurate.

---

## Source Code

Training pipeline, architecture, and CI workflows:
[github.com/MeridianAlgo/FinAI](https://github.com/MeridianAlgo/FinAI)