Commit cf1c14c · verified · 1 parent: 4aa4948
mihainadas committed

Upload README.md with huggingface_hub

Files changed (1): README.md (+5 −5)
```diff
@@ -18,18 +18,18 @@ datasets:
 
 # TF3 Student: Distilled Romanian Language Model
 
-A compact **35.4M-parameter** Romanian language model distilled from the [TF3-50M teacher](https://huggingface.co/klusai/tf3-50m-base) using logit-based knowledge distillation. Part of the [TinyFabulist](https://arxiv.org/abs/2601.10410) research project.
+A compact **22.9M-parameter** Romanian language model distilled from the [TF3-50M teacher](https://huggingface.co/klusai/tf3-50m-base) using logit-based knowledge distillation. Part of the [TinyFabulist](https://arxiv.org/abs/2601.10410) research project.
 
 ## Model Details
 
 | Property | Value |
 |----------|-------|
-| Parameters | 35.4M |
+| Parameters | 22.9M (26.45M with untied embeddings) |
 | Architecture | LLaMA-style decoder-only Transformer |
-| Hidden size | 512 |
-| Attention heads | 8 (head dim 64) |
+| Hidden size | 384 |
+| Attention heads | 6 (head dim 64) |
 | Layers | 6 |
-| MLP intermediate | 1,380 |
+| MLP intermediate | 1,024 |
 | Vocab size | 32,000 (Unigram, Romanian-specific) |
 | Context length | 2,048 tokens |
 | Tied embeddings | Yes |
```
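As a sanity check on the updated table, the headline parameter count can be reproduced from the listed hyperparameters. This is a minimal sketch, assuming the standard LLaMA layout (no biases, RMSNorm weight vectors, a SwiGLU MLP with gate/up/down projections, rotary position embeddings, and full multi-head attention); the function name and its argument defaults are illustrative, not part of the model card.

```python
def llama_param_count(vocab=32_000, d=384, layers=6, d_ff=1024, tied=True):
    """Estimate the parameter count of a LLaMA-style decoder-only model.

    Assumes: no biases, RMSNorm (one weight vector each), SwiGLU MLP
    (gate/up/down), rotary embeddings (no learned positional table),
    and full multi-head attention (num_kv_heads == num_heads).
    """
    embed = vocab * d                        # token embedding matrix
    attn = 4 * d * d                         # q, k, v, o projections
    mlp = 3 * d * d_ff                       # gate, up, down projections
    norms = 2 * d                            # pre-attn and pre-MLP RMSNorm
    per_layer = attn + mlp + norms
    total = embed + layers * per_layer + d   # + final RMSNorm weight
    if not tied:
        total += vocab * d                   # separate LM head matrix
    return total

print(f"{llama_param_count() / 1e6:.1f}M")   # → 22.9M with tied embeddings
```

With hidden size 384, 6 layers, MLP intermediate 1,024, and a 32,000-entry tied vocabulary, this comes to 22,909,824 parameters, matching the 22.9M figure in the table.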