zkolter/Chat-Tuning-Homework

Text Generation


zkolter committed 11 days ago

Commit 37a1d2b · verified · 1 parent: 8309256

Update README.md

Files changed (1)

README.md +5 -21

README.md CHANGED

@@ -29,7 +29,7 @@ This is a course-homework model repo containing both checkpoints and derived dat
 ### Architecture
-The checkpoints use the homework transformer architecture with:
 - dimension: 1024
 - feed-forward dimension: 4096
@@ -42,22 +42,10 @@ These values are also stored in `params.json`.
 ### Training Summary
-- `model_base.pth` is the pretrained base checkpoint exported from the 1.1M-step FineWebEDU run.
-- `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning in the homework notebook workflow.
-These files are intended for loading with the homework `LLM` implementation and the corresponding `load_weights(...)` function.
-### Intended Use
-- educational experiments
-- homework reproduction
-- lightweight chat fine-tuning exercises
-### Limitations
-- this is a homework model, not a production model
-- outputs can be repetitive, unstable, or factually incorrect
-- the chat-tuned model was trained on a filtered subset of UltraChat-derived data
 ## Data Card
@@ -68,14 +56,10 @@ These files are intended for loading with the homework `LLM` implementation and
 ### Included Data Files
-- `ultrachat_short.json`: shortened chat-tuning corpus
 - `ultrachat_dpo_pos.json`: preferred responses
 - `ultrachat_dpo_neg.json`: dispreferred responses
-### Data Notes
-These data files are included here for homework reproducibility. They are derived artifacts prepared locally for the assignment workflow rather than canonical upstream dataset exports.
 ## File Format Notes
 - `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries

 ### Architecture
+The checkpoints use the Homework 5 transformer architecture with:
 - dimension: 1024
 - feed-forward dimension: 4096
 ### Training Summary
+- `model_base.pth` is the pretrained base checkpoint exported from the ~1.1T-token FineWebEDU run.
+- `model_chat.pth` is the chat-tuned checkpoint saved after supervised chat tuning on a subset of the ultrachat200k dataset.
+These files are intended for use with the homework's basic exercises.
 ## Data Card
 ### Included Data Files
+- `ultrachat_short.json`: set of short chat-tuning responses selected from Ultrachat 200k
 - `ultrachat_dpo_pos.json`: preferred responses
 - `ultrachat_dpo_neg.json`: dispreferred responses
 ## File Format Notes
 - `model_base.pth` and `model_chat.pth` are PyTorch checkpoint dictionaries
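
Since the README describes `model_base.pth` and `model_chat.pth` as PyTorch checkpoint dictionaries, a quick way to sanity-check a download is to load one on CPU and list what it contains. The following is a minimal sketch, not the homework's own loader: the helper names `load_checkpoint` and `summarize_state` are hypothetical, and it assumes the dictionary maps names to tensors (any non-tensor entries are skipped).

```python
import math
import os


def load_checkpoint(path):
    """Load a checkpoint dictionary onto the CPU (requires PyTorch)."""
    import torch  # imported lazily so the summary helper works without it

    return torch.load(path, map_location="cpu")


def summarize_state(state):
    """Return ([(name, shape, n_elements), ...], total_elements).

    Assumption: tensor-like values expose a `.shape`; other entries are ignored.
    """
    rows = []
    total = 0
    for name, value in state.items():
        shape = getattr(value, "shape", None)
        if shape is None:
            continue  # not a tensor (e.g. a step counter), skip it
        shape = tuple(shape)
        count = math.prod(shape)
        rows.append((name, shape, count))
        total += count
    return rows, total


if __name__ == "__main__":
    path = "model_base.pth"  # file name taken from the README
    if os.path.exists(path):
        rows, total = summarize_state(load_checkpoint(path))
        for name, shape, count in rows:
            print(f"{name}: {shape} ({count} values)")
        print(f"total parameters: {total}")
```

If the printed shapes include the 1024 and 4096 dimensions listed under Architecture, the file is likely the expected checkpoint; the actual key names depend on the homework implementation and are not assumed here.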