Update README.md
README.md
CHANGED
|
@@ -29,7 +29,7 @@ This is a course-homework model repo containing both checkpoints and derived dat
|
|
| 29 |
|
| 30 |
### Architecture
|
| 31 |
|
| 32 |
-
The checkpoints use the
|
| 33 |
|
| 34 |
- dimension: 1024
|
| 35 |
- feed-forward dimension: 4096
|
|
@@ -42,22 +42,10 @@ These values are also stored in `params.json`.
|
|
| 42 |
|
| 43 |
### Training Summary
|
| 44 |
|
| 45 |
-
- `model_base.pth` is the pretrained base checkpoint exported from the 1.
|
| 46 |
-
- `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning
|
| 47 |
|
| 48 |
-
These
|
| 49 |
-
|
| 50 |
-
### Intended Use
|
| 51 |
-
|
| 52 |
-
- educational experiments
|
| 53 |
-
- homework reproduction
|
| 54 |
-
- lightweight chat fine-tuning exercises
|
| 55 |
-
|
| 56 |
-
### Limitations
|
| 57 |
-
|
| 58 |
-
- this is a homework model, not a production model
|
| 59 |
-
- outputs can be repetitive, unstable, or factually incorrect
|
| 60 |
-
- the chat-tuned model was trained on a filtered subset of UltraChat-derived data
|
| 61 |
|
| 62 |
## Data Card
|
| 63 |
|
|
@@ -68,14 +56,10 @@ These files are intended for loading with the homework `LLM` implementation and
|
|
| 68 |
|
| 69 |
### Included Data Files
|
| 70 |
|
| 71 |
-
- `ultrachat_short.json`:
|
| 72 |
- `ultrachat_dpo_pos.json`: preferred responses
|
| 73 |
- `ultrachat_dpo_neg.json`: dispreferred responses
|
| 74 |
|
| 75 |
-
### Data Notes
|
| 76 |
-
|
| 77 |
-
These data files are included here for homework reproducibility. They are derived artifacts prepared locally for the assignment workflow rather than canonical upstream dataset exports.
|
| 78 |
-
|
| 79 |
## File Format Notes
|
| 80 |
|
| 81 |
- `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries
|
|
|
|
| 29 |
|
| 30 |
### Architecture
|
| 31 |
|
| 32 |
+
The checkpoints use the Homework 5 transformer architecture with:
|
| 33 |
|
| 34 |
- dimension: 1024
|
| 35 |
- feed-forward dimension: 4096
|
|
|
|
| 42 |
|
| 43 |
### Training Summary
|
| 44 |
|
| 45 |
+
- `model_base.pth` is the pretrained base checkpoint exported from the ~1.1T-token FineWeb-Edu run.
|
| 46 |
+
- `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning on a subset of the UltraChat 200k dataset.
|
| 47 |
|
| 48 |
+
These files are intended for use with the homework's basic exercises.
|
| 49 |
|
| 50 |
## Data Card
|
| 51 |
|
|
|
|
| 56 |
|
| 57 |
### Included Data Files
|
| 58 |
|
| 59 |
+
- `ultrachat_short.json`: a set of short chat-tuning responses selected from UltraChat 200k
|
| 60 |
- `ultrachat_dpo_pos.json`: preferred responses
|
| 61 |
- `ultrachat_dpo_neg.json`: dispreferred responses
|
| 62 |
|
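The README does not describe the schema of the two DPO files, so the following is only a sketch of how they might be combined into preference pairs. It assumes, without confirmation from the repo, that each file is a JSON list and that entry `i` in `ultrachat_dpo_pos.json` corresponds to entry `i` in `ultrachat_dpo_neg.json`. The function name `load_preference_pairs` and the `chosen`/`rejected` keys are hypothetical.

```python
import json
from pathlib import Path


def load_preference_pairs(pos_path, neg_path):
    """Pair preferred and dispreferred entries by index.

    ASSUMPTION: both files hold JSON lists of equal length, aligned by
    position. Check the actual files before relying on this.
    """
    pos = json.loads(Path(pos_path).read_text(encoding="utf-8"))
    neg = json.loads(Path(neg_path).read_text(encoding="utf-8"))
    if len(pos) != len(neg):
        raise ValueError(
            f"length mismatch: {len(pos)} preferred vs {len(neg)} dispreferred"
        )
    # "chosen"/"rejected" are illustrative key names, not taken from the repo.
    return [{"chosen": p, "rejected": n} for p, n in zip(pos, neg)]
```

If the files turn out to be keyed by prompt rather than aligned by index, the pairing step would need to join on that key instead.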
| 63 |
## File Format Notes
|
| 64 |
|
| 65 |
- `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries
|
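Since the Architecture values are also stored in `params.json`, a quick consistency check can catch a mismatched download. This is a minimal sketch: the README gives the values (dimension 1024, feed-forward dimension 4096) but not the key names, so `dim` and `ffn_dim` below are hypothetical and should be replaced with whatever keys the file actually uses.

```python
import json
from pathlib import Path

# Values come from the README's Architecture section.
# ASSUMPTION: the key names "dim" and "ffn_dim" are placeholders.
EXPECTED = {"dim": 1024, "ffn_dim": 4096}


def check_params(path, expected=EXPECTED):
    """Return {key: (found, expected)} for every value that does not match."""
    params = json.loads(Path(path).read_text(encoding="utf-8"))
    return {
        key: (params.get(key), want)
        for key, want in expected.items()
        if params.get(key) != want
    }
```

An empty result means every listed value matched. A missing key shows up as `(None, expected)`.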