yuzhe committed
Commit abe49dc · 1 Parent(s): 9d565aa

Correct README base model to Qwen3.5-4B

Files changed (1)
  1. README.md +7 -7
README.md CHANGED
@@ -19,7 +19,7 @@ metrics:
 pipeline_tag: text-generation
 inference: false
 base_model:
-- Qwen/Qwen3.5-27B
+- Qwen/Qwen3.5-4B
 ---


@@ -27,7 +27,7 @@ base_model:
 <tr>
 <td bgcolor="#EEF6FF" style="padding: 14px 16px; border-left: 6px solid #2563EB;">
 <strong>Update Notice</strong><br>
-This release has been retrained on the same dataset with the base model upgraded to <strong>Qwen3.5-27B</strong>. If any section below has not yet been fully refreshed, this notice takes precedence.
+This release has been retrained on the same dataset with the base model upgraded to <strong>Qwen3.5-4B</strong>. If any section below has not yet been fully refreshed, this notice takes precedence.
 </td>
 </tr>
 </table>
@@ -65,12 +65,12 @@ The DMind lineage was born from a singular conviction that decentralized finance

 * **Model Name:** DMind-3-mini
 * **Organization:** DMind
-* **Base Architecture:** Qwen3.5-27B (Customized Transformer w/ RoPE)
-* **Parameter Count:** 27 Billion
+* **Base Architecture:** Qwen3.5-4B (Customized Transformer w/ RoPE)
+* **Parameter Count:** 4.2 Billion
 * **Precision:** **BF16 (Native)**
 * *⚠️ Note: We strictly advise against 4-bit quantization for financial logic tasks to preserve numerical precision in APY/IL calculations.*
 * **Context Window:** 128k tokens
-* **Hardware Requirement:** BF16 deployment is recommended on a single **80GB-class GPU** (such as A100/H100) or an equivalent multi-GPU setup.
+* **Hardware Requirement:** GPU with ≥ **12GB VRAM** (recommended: NVIDIA RTX 4070 Ti or better, Apple M3/M4 Pro/Max).

 ## 3. 🔬 Methodology: C³-SFT

@@ -140,10 +140,10 @@ Evaluated on three key benchmarks: DMind Benchmark (Web3 Native Logic), FinanceQ

 ![Figure 3: Performance Benchmarks](./Figures/Figure3.png)

-The evaluation compares the current DMind-3-mini release, now based on Qwen3.5-27B, against top-tier frontier models (GPT-5.1, Claude Sonnet 4.5) and other strong baselines. This version prioritizes deeper reasoning capacity and domain performance over ultra-lightweight deployment, especially in specialized financial tasks.
+The evaluation compares DMind-3-mini (4B) against top-tier frontier models (GPT-5.1, Claude Sonnet 4.5) and other efficient models. Despite its compact size, the Mini model demonstrates exceptional efficiency, particularly in specialized domain tasks where it outperforms significantly larger generalist models.

 ## 7. ⚖️ Limitations & Disclaimer

-* **High Hardware Barrier:** Due to the decision to retain BF16 precision for financial accuracy, straightforward deployment typically requires **80GB-class GPU memory or an equivalent multi-GPU setup**. It is not suitable for standard office laptops.
+* **High Hardware Barrier:** Due to the decision to retain BF16 precision for financial accuracy, this model **requires ≥ 12GB VRAM**. It is not suitable for standard office laptops.
 * **Knowledge Cutoff:** While the logic is robust, specific protocol data is limited to the training cutoff. Use with RAG for real-time data.
 * **Legal Disclaimer:** This model is an **analytical tool**, not a financial advisor. The output (NFA) should never be the sole basis for investment decisions. The developers assume no liability for financial losses.
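
The 12GB VRAM figure in the updated hardware requirement is consistent with a quick back-of-the-envelope check: at BF16 (2 bytes per parameter), 4.2B parameters occupy about 8.4 GB for the weights alone, before KV cache and activation overhead. A minimal sketch of that estimate follows; the 1.3× runtime-overhead factor is an illustrative assumption, not a measured value.

```python
# Back-of-the-envelope VRAM estimate for a 4.2B-parameter model held in BF16.
# The 1.3x overhead factor (KV cache, activations, framework buffers) is an
# illustrative assumption, not a measured value.

BYTES_PER_PARAM_BF16 = 2        # bfloat16 = 16 bits = 2 bytes per parameter
PARAMS = 4.2e9                  # 4.2 billion parameters

weights_gb = PARAMS * BYTES_PER_PARAM_BF16 / 1e9   # weights only: 8.4 GB
estimated_gb = weights_gb * 1.3                    # with assumed runtime overhead

print(f"weights:         {weights_gb:.1f} GB")
print(f"estimated total: {estimated_gb:.1f} GB")   # ~10.9 GB, within a 12GB budget
```

Under these assumptions the model fits a 12GB card with a little headroom, whereas a 4-bit quantized copy (~0.5 bytes/parameter) would fit far smaller devices, which is exactly the trade-off the precision note declines in favor of numerical accuracy.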