pathcosmos committed on
Commit 9d4531b · verified · 1 Parent(s): 0d1f450

Upload pretrain/README.md with huggingface_hub

Files changed (1)
  1. pretrain/README.md +60 -0
pretrain/README.md ADDED
@@ -0,0 +1,60 @@
+ ---
+ language:
+ - ko
+ - en
+ license: apache-2.0
+ tags:
+ - pretrained
+ - causal-lm
+ - korean
+ - llm
+ pipeline_tag: text-generation
+ ---
+
+ # EVAFRILL-Mo 3B — Pretrained Base
+
+ Raw pretrained language model, the foundation for all EVAFRILL-Mo downstream variants.
+
+ ## Training Stage
+
+ Pretraining from scratch on a mixed Korean/English corpus.
+
+ ## Key Details
+
+ - **Steps**: 319,772 (~93% of the Chinchilla-optimal token budget)
+ - **Tokens**: ~55B
+ - **Hardware**: 7× NVIDIA B200 GPUs (DDP)
+ - **Precision**: BF16
+ - **Architecture**: Transformer decoder, 3B parameters
+
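As a rough cross-check of the figures above, the sketch below recomputes the budget share and the average number of tokens per optimizer step. It assumes the common rule of thumb of about 20 training tokens per parameter for a Chinchilla-optimal budget, and it treats the card's rounded values (3B parameters, ~55B tokens) as exact. Neither assumption is stated in the card, so a small deviation from the reported ~93% is expected.

```python
# Rough cross-check of the Key Details figures.
# Assumption (not from the card): Chinchilla-optimal budget ~= 20 tokens per parameter.
TOKENS_PER_PARAM = 20

params = 3e9         # "3B parameters" (rounded on the card)
tokens_seen = 55e9   # "~55B tokens" (rounded on the card)
steps = 319_772      # optimizer steps reported on the card

optimal_tokens = TOKENS_PER_PARAM * params   # token budget under the 20x rule of thumb
budget_share = tokens_seen / optimal_tokens  # fraction of that budget consumed
tokens_per_step = tokens_seen / steps        # average global batch size in tokens

print(f"Chinchilla-optimal budget: {optimal_tokens / 1e9:.0f}B tokens")
print(f"Share of budget used:      {budget_share:.1%}")
print(f"Average tokens per step:   {tokens_per_step:,.0f}")
```

With these rounded inputs the share comes out near 92%, which is in line with the reported ~93% given the rounding, and the average step covers roughly 172k tokens.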
+ ## Metrics
+
+ | Metric | Value |
+ |--------|-------|
+ | Final train loss | — |
+ | Share of Chinchilla-optimal token budget | ~93% |
+
+ ## Notes
+
+ This is the **raw pretrained model** with no instruction tuning or alignment applied.
+ It is not suitable for direct chat or instruction use; use one of the fine-tuned variants below.
+
+ ## Variants
+
+ | Variant | Description |
+ |---------|-------------|
+ | [sft-v2](../sft-v2/) | Instruction-tuned (recommended starting point) |
+ | [slerp](../slerp/) | SLERP merge — best overall (recommended) |
+ | [dpo-r1](../dpo-r1/) | DPO alignment round 1 |
+
+ ## Main Model Card
+
+ See the [main README](../../README.md) for full project details, architecture, and training history.
+
+ ## Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ # "path/to/pretrain" is a placeholder for the directory that holds this checkpoint.
+ model = AutoModelForCausalLM.from_pretrained("path/to/pretrain", torch_dtype="bfloat16")  # load weights in BF16
+ tokenizer = AutoTokenizer.from_pretrained("path/to/pretrain")
+ ```
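Because this is a base model, the natural way to try it is plain text continuation rather than chat prompting. The helper below is a minimal sketch, assuming `model` and `tokenizer` were loaded as in the snippet above and that the checkpoint works with the standard `generate` API. The function name `complete`, the greedy decoding settings, and the example prompt are illustrative choices, not part of the original card.

```python
def complete(model, tokenizer, prompt, max_new_tokens=64):
    """Greedy text continuation for a base (non-chat) causal LM."""
    # Tokenize the prompt and move each tensor to the device the model lives on.
    inputs = tokenizer(prompt, return_tensors="pt")
    inputs = {name: tensor.to(model.device) for name, tensor in inputs.items()}
    # Greedy decoding; the returned sequence holds the prompt tokens followed by new tokens.
    output_ids = model.generate(**inputs, max_new_tokens=max_new_tokens, do_sample=False)
    return tokenizer.decode(output_ids[0], skip_special_tokens=True)


# Hypothetical usage, once `model` and `tokenizer` are loaded:
# print(complete(model, tokenizer, "대한민국의 수도는"))
```

Passing the model and tokenizer in as arguments keeps the helper independent of how the checkpoint was loaded, so the same function can be pointed at any of the fine-tuned variants for comparison.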