Update README.md
README.md
CHANGED
@@ -16,7 +16,7 @@ datasets:
 # TinyWay-1.1.0
 
 **TinyWay-1.1.0** is a lightweight **decoder-only Transformer language model** trained **from scratch** on limited compute.
-The project demonstrates that meaningful language modeling behavior can emerge from modest-scale models trained in constrained environments such as
+The project demonstrates that meaningful language modeling behavior can emerge from modest-scale models trained in constrained environments such as.
 
 > **Core idea:** *Understanding LLM training mechanics end-to-end by building, training, debugging, and deploying a Transformer LM without relying on pretrained weights.*
 
@@ -56,7 +56,7 @@ The project demonstrates that meaningful language modeling behavior can emerge f
 * Gradient accumulation: enabled
 * Gradient clipping: enabled
 * Mixed precision training (AMP)
-* Training performed entirely on **
+* Training performed entirely on **GPU environment**
 
 ### Checkpoints
 
@@ -158,5 +158,5 @@ ITM Gwalior, India
 ## Acknowledgements
 
 * Hugging Face Transformers
-*
+* GPU resources
 * Open research community for open-source inspiration