Mid-Training - Phase 006: Smoltalk2 (No Thinking)

#5
by mrs83 - opened
ethicalabs.ai org

30M tokens, 4500 steps
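As a quick sanity check on the figures above, the average number of tokens consumed per step can be worked out directly. This is only a back-of-the-envelope sketch: it assumes "30M" means 30 x 10^6 and that the figure is rounded, and it says nothing about batch size or sequence length, which the post does not state.

```python
# Values taken from the post; "30M" is read as 30 * 10**6 (assumption, likely rounded).
TOTAL_TOKENS = 30_000_000
STEPS = 4_500

# Average tokens seen per training step.
tokens_per_step = TOTAL_TOKENS / STEPS
print(f"about {tokens_per_step:,.0f} tokens per step on average")
```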

Screenshot_2026-02-23_23-23-48

ethicalabs.ai org
|    Tasks     |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|--------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_easy      |      1|none  |     0|acc     |↑  |0.4520|±  |0.0102|
|              |       |none  |     0|acc_norm|↑  |0.3889|±  |0.0100|
|hellaswag     |      1|none  |     0|acc     |↑  |0.2891|±  |0.0045|
|              |       |none  |     0|acc_norm|↑  |0.3071|±  |0.0046|
|piqa          |      1|none  |     0|acc     |↑  |0.6219|±  |0.0113|
|              |       |none  |     0|acc_norm|↑  |0.6034|±  |0.0114|
|sciq          |      1|none  |     0|acc     |↑  |0.7180|±  |0.0142|
|              |       |none  |     0|acc_norm|↑  |0.6190|±  |0.0154|
|truthfulqa_mc1|      2|none  |     0|acc     |↑  |0.2729|±  |0.0156|
|truthfulqa_mc2|      3|none  |     0|acc     |↑  |0.4246|±  |0.0154|
|winogrande    |      1|none  |     0|acc     |↑  |0.5091|±  |0.0141|
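
The Stderr column in the table above can be cross-checked against a plain binomial standard error. The sketch below assumes each item is scored 0 or 1 and that the harness reports the sample standard error of the mean (with an n - 1 denominator); both are assumptions, not something stated in this thread. The example counts come from the "Building contexts" progress bars in the evaluation log further down. truthfulqa_mc2 is left out because its score is not a 0/1 accuracy.

```python
import math

# Number of evaluation examples per task, read from the progress bars
# in the lm_eval log posted later in this thread.
N_EXAMPLES = {
    "arc_easy": 2376,
    "hellaswag": 10042,
    "piqa": 1838,
    "sciq": 1000,
    "truthfulqa_mc1": 817,
    "winogrande": 1267,
}

# (acc, reported stderr) copied from the table above.
REPORTED = {
    "arc_easy": (0.4520, 0.0102),
    "hellaswag": (0.2891, 0.0045),
    "piqa": (0.6219, 0.0113),
    "sciq": (0.7180, 0.0142),
    "truthfulqa_mc1": (0.2729, 0.0156),
    "winogrande": (0.5091, 0.0141),
}


def binomial_stderr(p: float, n: int) -> float:
    """Standard error of the mean of n 0/1 scores (sample variance, n - 1)."""
    return math.sqrt(p * (1.0 - p) / (n - 1))


# Largest gap between the reported and the recomputed standard error.
max_gap = 0.0
for task, (acc, stderr) in REPORTED.items():
    recomputed = binomial_stderr(acc, N_EXAMPLES[task])
    max_gap = max(max_gap, abs(recomputed - stderr))
    print(f"{task:15s} reported={stderr:.4f} recomputed={recomputed:.4f}")
```

If the assumptions hold, the recomputed values should agree with the table up to rounding, which suggests the error bars mostly reflect test-set size rather than anything specific to the checkpoint.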

ethicalabs.ai org • edited Feb 25

Last checkpoint (DPO testing) - Phase 6.1 (post DPO test)

ethicalabs@pop-os:~/Workspace/Echo-DSRN$ uv run lm_eval --model hf   --model_args pretrained=models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO,trust_remote_code=True,device_map="auto"   --tasks truthfulqa_mc1,truthfulqa_mc2,hellaswag,arc_easy,winogrande,piqa,sciq --output_path ./results_sft_smoltalk_phase6.1 --batch_size 4
2026-02-24:02:14:11 INFO     [__main__:465] Selected Tasks: ['truthfulqa_mc1', 'truthfulqa_mc2', 'hellaswag', 'arc_easy', 'winogrande', 'piqa', 'sciq']
2026-02-24:02:14:11 INFO     [evaluator:202] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
2026-02-24:02:14:11 INFO     [evaluator:240] Initializing hf model, with arguments: {'pretrained': 'models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO', 'trust_remote_code': True,
        'device_map': 'auto'}
2026-02-24:02:14:12 INFO     [models.huggingface:158] Using device 'cuda'
2026-02-24:02:14:12 INFO     [models.huggingface:545] Model type cannot be determined. Using default model type 'causal'
2026-02-24:02:14:12 INFO     [models.huggingface:426] Model parallel was set to False.
2026-02-24:02:14:28 INFO     [tasks:695] Selected tasks:
2026-02-24:02:14:28 INFO     [tasks:686] Task: sciq (sciq/sciq.yaml)
2026-02-24:02:14:28 INFO     [tasks:686] Task: piqa (piqa/piqa.yaml)
2026-02-24:02:14:28 INFO     [tasks:686] Task: winogrande (winogrande/default.yaml)
2026-02-24:02:14:28 INFO     [tasks:686] Task: arc_easy (arc/arc_easy.yaml)
2026-02-24:02:14:28 INFO     [tasks:686] Task: hellaswag (hellaswag/hellaswag.yaml)
2026-02-24:02:14:28 INFO     [tasks:686] Task: truthfulqa_mc2 (truthfulqa/truthfulqa_mc2.yaml)
2026-02-24:02:14:28 INFO     [tasks:686] Task: truthfulqa_mc1 (truthfulqa/truthfulqa_mc1.yaml)
2026-02-24:02:14:28 INFO     [api.task:434] Building contexts for sciq on rank 0...
100%|██████████| 1000/1000 [00:00<00:00, 2406.38it/s]
2026-02-24:02:14:29 INFO     [api.task:434] Building contexts for piqa on rank 0...
100%|██████████| 1838/1838 [00:00<00:00, 4130.78it/s]
2026-02-24:02:14:29 INFO     [api.task:434] Building contexts for winogrande on rank 0...
100%|██████████| 1267/1267 [00:00<00:00, 304398.17it/s]
2026-02-24:02:14:29 INFO     [api.task:434] Building contexts for arc_easy on rank 0...
100%|██████████| 2376/2376 [00:00<00:00, 4372.78it/s]
2026-02-24:02:14:30 INFO     [api.task:434] Building contexts for hellaswag on rank 0...
100%|██████████| 10042/10042 [00:01<00:00, 7852.27it/s]
2026-02-24:02:14:31 INFO     [api.task:434] Building contexts for truthfulqa_mc2 on rank 0...
100%|██████████| 817/817 [00:00<00:00, 2516.28it/s]
2026-02-24:02:14:32 INFO     [api.task:434] Building contexts for truthfulqa_mc1 on rank 0...
100%|██████████| 817/817 [00:00<00:00, 2577.81it/s]
2026-02-24:02:14:32 INFO     [evaluator:574] Running loglikelihood requests
Running loglikelihood requests: 100%|██████████| 69875/69875 [42:31<00:00, 27.38it/s]
2026-02-24:02:57:17 INFO     [loggers.evaluation_tracker:209] Saving results aggregated
hf (pretrained=models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO,trust_remote_code=True,device_map=auto), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 4
|    Tasks     |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|--------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_easy      |      1|none  |     0|acc     |↑  |0.4689|±  |0.0102|
|              |       |none  |     0|acc_norm|↑  |0.4158|±  |0.0101|
|hellaswag     |      1|none  |     0|acc     |↑  |0.2915|±  |0.0045|
|              |       |none  |     0|acc_norm|↑  |0.3190|±  |0.0047|
|piqa          |      1|none  |     0|acc     |↑  |0.6306|±  |0.0113|
|              |       |none  |     0|acc_norm|↑  |0.6143|±  |0.0114|
|sciq          |      1|none  |     0|acc     |↑  |0.7520|±  |0.0137|
|              |       |none  |     0|acc_norm|↑  |0.6780|±  |0.0148|
|truthfulqa_mc1|      2|none  |     0|acc     |↑  |0.2411|±  |0.0150|
|truthfulqa_mc2|      3|none  |     0|acc     |↑  |0.4251|±  |0.0151|
|winogrande    |      1|none  |     0|acc     |↑  |0.5122|±  |0.0140|
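
For scripted runs, the command-line invocation above can probably be mirrored through the harness's Python entry point, `lm_eval.simple_evaluate`. The sketch below is not the author's workflow: the helper names `build_eval_kwargs` and `run_eval` are hypothetical, and only the arguments visible in the command are carried over (the `--output_path` flag is not reproduced).

```python
# Tasks listed in the --tasks flag of the command above.
TASKS = [
    "truthfulqa_mc1",
    "truthfulqa_mc2",
    "hellaswag",
    "arc_easy",
    "winogrande",
    "piqa",
    "sciq",
]


def build_eval_kwargs(model_dir: str, batch_size: int = 4) -> dict:
    """Collect keyword arguments that mirror the CLI flags (hypothetical helper)."""
    return {
        "model": "hf",
        "model_args": (
            f"pretrained={model_dir},trust_remote_code=True,device_map=auto"
        ),
        "tasks": list(TASKS),
        "batch_size": batch_size,
    }


def run_eval(model_dir: str) -> dict:
    """Run the evaluation; lm_eval is imported lazily so the module loads without it."""
    import lm_eval

    return lm_eval.simple_evaluate(**build_eval_kwargs(model_dir))
```

Calling `run_eval("models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO")` should return a dictionary whose per-task metrics correspond to the table above, assuming the same harness version and seeds.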

ethicalabs.ai org • edited Feb 25

Phase 6.2 - Smoltalk2 (No Think, Mid Training)

ethicalabs@pop-os:~/Workspace/Echo-DSRN$ uv run python -m echo_hf.talk --model_path outputs/phase7_sft/checkpoint-2500/ --chat --temperature 0.2
Using device: cuda
Loading model from outputs/phase7_sft/checkpoint-2500/...
Detected LoRA adapter at outputs/phase7_sft/checkpoint-2500/
Loading base model from models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO/...
The module name  (originally ) is not a valid Python identifier. Please rename the original module to avoid import issues.
Loading adapter...
Loading tokenizer from models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO/...
Starting interactive chat with outputs/phase7_sft/checkpoint-2500/...
Type 'exit' or 'quit' to stop.
Type 'reset' to clear conversation history.

You: Who are you?

Echo: /home/ethicalabs/Workspace/Echo-DSRN/echo_hf/modeling_echo.py:338: UserWarning: Flash Efficient attention on Current AMD GPU is still experimental. Enable it with TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1. (Triggered internally at /pytorch/aten/src/ATen/native/transformers/hip/sdp_utils.cpp:323.)
  y = attn_fn(q, k, v, is_causal=is_causal)
/home/ethicalabs/Workspace/Echo-DSRN/echo_hf/modeling_echo.py:338: UserWarning: Mem Efficient attention on Current AMD GPU is still experimental. Enable it with TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1. (Triggered internally at /pytorch/aten/src/ATen/native/transformers/hip/sdp_utils.cpp:383.)
  y = attn_fn(q, k, v, is_causal=is_causal)

I am Kurtis-EON1, an AI developed by ethicalabs.ai. I'm a creative assistant designed to help people with artistic and technical needs. We craft high-quality art that reflects the unique qualities of each person we serve. Our goal is to make everyday objects feel more authentic and engaging than ever before.

We're here to support artists in their journey towards self-expression and transformation. What do you think about creating art that feels like it belongs on your shelves or at home?
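
The log above shows the chat script detecting a LoRA adapter in the checkpoint directory, loading the base model, and then attaching the adapter. The internals of `echo_hf.talk` are not shown in this thread, so the following is only a sketch of how that sequence could look with `transformers` and `peft`. The function names are hypothetical, and the detection rule (presence of `adapter_config.json`) is the usual PEFT convention rather than something confirmed here.

```python
from pathlib import Path


def is_lora_checkpoint(checkpoint_dir: str) -> bool:
    """Guess whether a directory holds a PEFT adapter (assumed convention)."""
    return (Path(checkpoint_dir) / "adapter_config.json").is_file()


def load_for_chat(checkpoint_dir: str, base_model_dir: str):
    """Load base model plus adapter, with the tokenizer taken from the base model."""
    # Imported lazily so the helper above works without these packages installed.
    from transformers import AutoModelForCausalLM, AutoTokenizer

    if is_lora_checkpoint(checkpoint_dir):
        from peft import PeftModel

        # The custom architecture needs trust_remote_code, as in the lm_eval run.
        model = AutoModelForCausalLM.from_pretrained(
            base_model_dir, trust_remote_code=True
        )
        model = PeftModel.from_pretrained(model, checkpoint_dir)
        tokenizer_dir = base_model_dir
    else:
        # Plain checkpoint: load it directly.
        model = AutoModelForCausalLM.from_pretrained(
            checkpoint_dir, trust_remote_code=True
        )
        tokenizer_dir = checkpoint_dir

    tokenizer = AutoTokenizer.from_pretrained(tokenizer_dir, trust_remote_code=True)
    return model, tokenizer
```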
ethicalabs.ai org

Phase 6.3 - Smoltalk2 (No Think, Mid Training)

Screenshot_2026-02-25_20-45-42

|    Tasks     |Version|Filter|n-shot| Metric |   |Value |   |Stderr|
|--------------|------:|------|-----:|--------|---|-----:|---|-----:|
|arc_easy      |      1|none  |     0|acc     |↑  |0.4764|±  |0.0102|
|              |       |none  |     0|acc_norm|↑  |0.4306|±  |0.0102|
|hellaswag     |      1|none  |     0|acc     |↑  |0.2914|±  |0.0045|
|              |       |none  |     0|acc_norm|↑  |0.3164|±  |0.0046|
|piqa          |      1|none  |     0|acc     |↑  |0.6289|±  |0.0113|
|              |       |none  |     0|acc_norm|↑  |0.6202|±  |0.0113|
|sciq          |      1|none  |     0|acc     |↑  |0.7620|±  |0.0135|
|              |       |none  |     0|acc_norm|↑  |0.6680|±  |0.0149|
|truthfulqa_mc1|      2|none  |     0|acc     |↑  |0.2387|±  |0.0149|
|truthfulqa_mc2|      3|none  |     0|acc     |↑  |0.4282|±  |0.0152|
|winogrande    |      1|none  |     0|acc     |↑  |0.5067|±  |0.0141|
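
To put the three tables in this thread side by side, the sketch below computes the change in `acc` between two runs and a rough z-score, dividing the difference by the combined standard error. Treating the runs as independent is an assumption: they are scored on the same test items, so this is a heuristic for spotting which shifts are clearly larger than the error bars, not a proper paired test. The run labels are hypothetical names for the first table, Phase 6.1, and Phase 6.3.

```python
import math

# (acc, stderr) per task, copied from the three tables in this thread.
RESULTS = {
    "phase_006": {
        "arc_easy": (0.4520, 0.0102),
        "hellaswag": (0.2891, 0.0045),
        "piqa": (0.6219, 0.0113),
        "sciq": (0.7180, 0.0142),
        "truthfulqa_mc1": (0.2729, 0.0156),
        "truthfulqa_mc2": (0.4246, 0.0154),
        "winogrande": (0.5091, 0.0141),
    },
    "phase_6.1": {
        "arc_easy": (0.4689, 0.0102),
        "hellaswag": (0.2915, 0.0045),
        "piqa": (0.6306, 0.0113),
        "sciq": (0.7520, 0.0137),
        "truthfulqa_mc1": (0.2411, 0.0150),
        "truthfulqa_mc2": (0.4251, 0.0151),
        "winogrande": (0.5122, 0.0140),
    },
    "phase_6.3": {
        "arc_easy": (0.4764, 0.0102),
        "hellaswag": (0.2914, 0.0045),
        "piqa": (0.6289, 0.0113),
        "sciq": (0.7620, 0.0135),
        "truthfulqa_mc1": (0.2387, 0.0149),
        "truthfulqa_mc2": (0.4282, 0.0152),
        "winogrande": (0.5067, 0.0141),
    },
}


def compare(before: str, after: str) -> dict:
    """Return {task: (delta, z)} for acc, assuming independent errors."""
    rows = {}
    for task, (acc_a, se_a) in RESULTS[before].items():
        acc_b, se_b = RESULTS[after][task]
        delta = acc_b - acc_a
        rows[task] = (delta, delta / math.sqrt(se_a**2 + se_b**2))
    return rows


deltas = compare("phase_006", "phase_6.3")
for task, (delta, z) in deltas.items():
    print(f"{task:15s} delta={delta:+.4f} z={z:+.2f}")
```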
