Update README.md
README.md CHANGED
@@ -10,8 +10,31 @@ tags:
 - from-scratch
 pipeline_tag: text-generation
 ---
+# 4 April 2026 Update
+
+We have some news.
+
+After we thought about how to go on with LaaLM, we realized we would just be spending compute.
+
+We had originally planned to at least deliver it with some fine-tuning of Qwen 2.5 0.5B, but we have decided that LaaLM has had enough models.
+
+So we are announcing that we are officially stopping the LaaLM lineage here.
+
+We know it's pretty early, with just 2 released models (LaaLM-v1 and LaaLM-exp-v1) and a bunch of failed LaaLM-v2 snapshots that we lost (in the literal storage sense: we no longer have access to the copies), but we can't see any novel or promising direction for this.
+
+Go try prompting Claude or something like that to act as a Linux terminal, and it'll do much better than LaaLM.
+
+It's actually pretty bad: the model LaaLM-exp-v1 was trained on, Qwen 2.5 3B, literally performed better than LaaLM-exp-v1 itself when we gave it LaaLM-exp-v1's system prompt.
+
+Or maybe just install actual Linux.
+
+We are thinking of another new model to continue our adventure in predicting computers with neural networks, but we do not want to say what it is here, to keep some mystery.
+
+And to make sure we don't announce something that won't make it.
+
+Goodbye.
+
+# LaaLM-v2 (Deprecated. Everything you'll read from here on has been either lost or stopped.)
 
 **A small language model trained from scratch to be a Linux terminal. That's it.**
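The "prompt Claude to act as a Linux terminal" suggestion above can be sketched as a provider-agnostic chat request. This is a hypothetical illustration: the `SYSTEM_PROMPT` wording, the `terminal_request` helper, and the payload shape are assumptions, not anything LaaLM or any particular API ships.

```python
# Hypothetical sketch of the "act as a Linux terminal" prompt. The prompt
# wording and the payload shape are illustrative assumptions only.
SYSTEM_PROMPT = (
    "You are a Linux terminal. The user types shell commands and you reply "
    "with only the command's output: no explanations, no markdown."
)

def terminal_request(command, history=None):
    """Build one chat turn asking a model to behave like a terminal."""
    messages = list(history or [])  # prior turns, copied so we don't mutate
    messages.append({"role": "user", "content": command})
    return {"system": SYSTEM_PROMPT, "messages": messages}

payload = terminal_request("uname -s")
```

The point of the system prompt is to suppress the assistant's usual explanatory style, which is the behaviour LaaLM tried to train in from scratch.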
@@ -142,7 +165,7 @@ We haven't tried the new data generator because of budget problems.
 
 But we thought in the meantime that Transformers are too heavy for what LaaLM does.
 
-So we are going to use
+So we are going to use an LSTM-based architecture.
 
 LSTMs are not designed for big models. But LaaLM is very simple, so an LSTM is better for it.
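The LSTM plan above can be illustrated with a single-unit cell in plain Python. The gate equations are the standard LSTM formulation; the scalar weights in `W` are made-up toy values, since the planned LaaLM-v2 architecture was never published.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, W):
    """One step of a single-unit LSTM cell (standard gate equations).

    W maps, per gate, an input weight (w*), a recurrent weight (u*) and a
    bias (b*) to scalar values. Toy illustration only.
    """
    i = sigmoid(W["wi"] * x + W["ui"] * h_prev + W["bi"])    # input gate
    f = sigmoid(W["wf"] * x + W["uf"] * h_prev + W["bf"])    # forget gate
    o = sigmoid(W["wo"] * x + W["uo"] * h_prev + W["bo"])    # output gate
    g = math.tanh(W["wg"] * x + W["ug"] * h_prev + W["bg"])  # candidate cell
    c = f * c_prev + i * g   # new cell state: keep some old, add some new
    h = o * math.tanh(c)     # new hidden state, bounded in (-1, 1)
    return h, c

# Run the cell over a short input sequence with arbitrary fixed weights.
W = {k: 0.5 for k in ("wi", "ui", "bi", "wf", "uf", "bf",
                      "wo", "uo", "bo", "wg", "ug", "bg")}
h, c = 0.0, 0.0
for x in [1.0, -1.0, 0.5]:
    h, c = lstm_step(x, h, c, W)
```

Unlike a Transformer, the per-step state here is just the pair `(h, c)` rather than attention over the whole history, which is the README's argument for why a task as simple as terminal emulation might not need a Transformer.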
@@ -180,7 +203,7 @@ Our reason is that we have better projects to spend our compute on than a bash p
 
 LaaLM has definitely been a fun experience for us, but we can't just spend precious compute on something this experimental and non-useful.
 
-Maybe future models will keep coming but we definitely recommend you
+Maybe future models will keep coming, but we recommend that you stop expecting new models after LaaLM-v2. (Well, this turned out to be correct.)
 
 ---
@@ -242,9 +265,7 @@ We thought of basic bash, but we don't know if we can handle it.
 
 ## Status
 
-
-
-Check back later or watch the repo for updates.
+Deprecated.
 
 ---