vishesh-t27 committed on
Commit 7a36923 · verified · 1 Parent(s): 7f282bf

Updated Readme.md

  1. README.md +0 -10
README.md CHANGED
@@ -24,7 +24,6 @@ base_model:
 
  Nandi-Mini-150M-Instruct is a compact, efficient multilingual language model designed for strong performance in resource-constrained environments. It is pre-trained from scratch on 525 billion tokens and further enhanced through instruction tuning and Direct Preference Optimization (DPO). The model supports English and 10 Indic languages.
 
- We do not employ any benchmaxing tricks; the model is designed to be genuinely strong and highly effective for fine-tuning on downstream tasks.
 
  Nandi-Mini-150M-Instruct focuses on maximizing performance per parameter through architectural efficiency rather than scale. It is optimized for edge devices, on-prem deployments, and low-latency applications, making it ideal for resource-constrained environments.
  Nandi-Mini-150M-Instruct brings the following key features:
@@ -42,7 +41,6 @@ We’re just getting started with the Nandi series 🚀
  - **Nandi-Mini-500M (Base + Instruct)** — Pre-Training Going On
  - **Nandi-Mini-1B (Base + Instruct)** — Pre-Training Going On
 
- We are actively working on expanding the Nandi family to cover a wider range of use cases—from lightweight edge deployments to more capable instruction-tuned systems.
 
  📢 **Blogs & technical deep-dives coming soon**, where we’ll share:
  - Architecture decisions and design trade-offs
@@ -51,14 +49,6 @@ We are actively working on expanding the Nandi family to cover a wider range of
 
  Stay tuned!
 
- **This repo contains the instruct Nandi-Mini-150M model**, which has the following features:
-
- - Type: Causal Language Model
- - Training Stage: Pretraining (from scratch)
- - Architecture: Transformer decoder with RoPE, RMSNorm, SwiGLU, GQA, tied embeddings, **factorize embeddings**
- - Number of Layers: 16*2 [Layer Sharing, effective layer =32]
- - Context Length: 2,048 tokens
- - Vocabulary Size: 131,072
 
  ## 🌍 Supported Languages
 
 