Mid-Training - Phase 006: Smoltalk2 (No Thinking)
#5
by mrs83 - opened
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| arc_easy | 1 | none | 0 | acc | ↑ | 0.4520 | ± | 0.0102 |
| | | none | 0 | acc_norm | ↑ | 0.3889 | ± | 0.0100 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.2891 | ± | 0.0045 |
| | | none | 0 | acc_norm | ↑ | 0.3071 | ± | 0.0046 |
| piqa | 1 | none | 0 | acc | ↑ | 0.6219 | ± | 0.0113 |
| | | none | 0 | acc_norm | ↑ | 0.6034 | ± | 0.0114 |
| sciq | 1 | none | 0 | acc | ↑ | 0.7180 | ± | 0.0142 |
| | | none | 0 | acc_norm | ↑ | 0.6190 | ± | 0.0154 |
| truthfulqa_mc1 | 2 | none | 0 | acc | ↑ | 0.2729 | ± | 0.0156 |
| truthfulqa_mc2 | 3 | none | 0 | acc | ↑ | 0.4246 | ± | 0.0154 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5091 | ± | 0.0141 |
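For comparing these runs programmatically, a small helper (hypothetical, not part of lm-eval) can pull the metric values out of the markdown tables above into a dict; it assumes the 9-column layout lm-eval prints, where continuation rows leave the task cell empty:

```python
def parse_lmeval_table(md: str) -> dict:
    """Parse an lm-eval markdown results table into {task: {metric: (value, stderr)}}."""
    results, task = {}, None
    for line in md.strip().splitlines():
        cells = [c.strip() for c in line.strip().strip("|").split("|")]
        # Skip headers, separator rows, and anything that isn't a 9-column data row.
        if len(cells) < 9 or cells[0].startswith("-") or cells[0] == "Tasks":
            continue
        if cells[0]:  # continuation rows (acc_norm) leave the task cell empty
            task = cells[0]
        metric, value, stderr = cells[4], cells[6], cells[8]
        results.setdefault(task, {})[metric] = (float(value), float(stderr))
    return results
```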
Phase 6.1 - Last checkpoint (post-DPO test)
ethicalabs@pop-os:~/Workspace/Echo-DSRN$ uv run lm_eval --model hf --model_args pretrained=models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO,trust_remote_code=True,device_map="auto" --tasks truthfulqa_mc1,truthfulqa_mc2,hellaswag,arc_easy,winogrande,piqa,sciq --output_path ./results_sft_smoltalk_phase6.1 --batch_size 4
2026-02-24:02:14:11 INFO [__main__:465] Selected Tasks: ['truthfulqa_mc1', 'truthfulqa_mc2', 'hellaswag', 'arc_easy', 'winogrande', 'piqa', 'sciq']
2026-02-24:02:14:11 INFO [evaluator:202] Setting random seed to 0 | Setting numpy seed to 1234 | Setting torch manual seed to 1234 | Setting fewshot manual seed to 1234
2026-02-24:02:14:11 INFO [evaluator:240] Initializing hf model, with arguments: {'pretrained': 'models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO', 'trust_remote_code': True,
'device_map': 'auto'}
2026-02-24:02:14:12 INFO [models.huggingface:158] Using device 'cuda'
2026-02-24:02:14:12 INFO [models.huggingface:545] Model type cannot be determined. Using default model type 'causal'
2026-02-24:02:14:12 INFO [models.huggingface:426] Model parallel was set to False.
2026-02-24:02:14:28 INFO [tasks:695] Selected tasks:
2026-02-24:02:14:28 INFO [tasks:686] Task: sciq (sciq/sciq.yaml)
2026-02-24:02:14:28 INFO [tasks:686] Task: piqa (piqa/piqa.yaml)
2026-02-24:02:14:28 INFO [tasks:686] Task: winogrande (winogrande/default.yaml)
2026-02-24:02:14:28 INFO [tasks:686] Task: arc_easy (arc/arc_easy.yaml)
2026-02-24:02:14:28 INFO [tasks:686] Task: hellaswag (hellaswag/hellaswag.yaml)
2026-02-24:02:14:28 INFO [tasks:686] Task: truthfulqa_mc2 (truthfulqa/truthfulqa_mc2.yaml)
2026-02-24:02:14:28 INFO [tasks:686] Task: truthfulqa_mc1 (truthfulqa/truthfulqa_mc1.yaml)
2026-02-24:02:14:28 INFO [api.task:434] Building contexts for sciq on rank 0...
100%|██████████| 1000/1000 [00:00<00:00, 2406.38it/s]
2026-02-24:02:14:29 INFO [api.task:434] Building contexts for piqa on rank 0...
100%|██████████| 1838/1838 [00:00<00:00, 4130.78it/s]
2026-02-24:02:14:29 INFO [api.task:434] Building contexts for winogrande on rank 0...
100%|██████████| 1267/1267 [00:00<00:00, 304398.17it/s]
2026-02-24:02:14:29 INFO [api.task:434] Building contexts for arc_easy on rank 0...
100%|██████████| 2376/2376 [00:00<00:00, 4372.78it/s]
2026-02-24:02:14:30 INFO [api.task:434] Building contexts for hellaswag on rank 0...
100%|██████████| 10042/10042 [00:01<00:00, 7852.27it/s]
2026-02-24:02:14:31 INFO [api.task:434] Building contexts for truthfulqa_mc2 on rank 0...
100%|██████████| 817/817 [00:00<00:00, 2516.28it/s]
2026-02-24:02:14:32 INFO [api.task:434] Building contexts for truthfulqa_mc1 on rank 0...
100%|██████████| 817/817 [00:00<00:00, 2577.81it/s]
2026-02-24:02:14:32 INFO [evaluator:574] Running loglikelihood requests
Running loglikelihood requests: 100%|██████████| 69875/69875 [42:31<00:00, 27.38it/s]
2026-02-24:02:57:17 INFO [loggers.evaluation_tracker:209] Saving results aggregated
hf (pretrained=models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO,trust_remote_code=True,device_map=auto), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 4
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| arc_easy | 1 | none | 0 | acc | ↑ | 0.4689 | ± | 0.0102 |
| | | none | 0 | acc_norm | ↑ | 0.4158 | ± | 0.0101 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.2915 | ± | 0.0045 |
| | | none | 0 | acc_norm | ↑ | 0.3190 | ± | 0.0047 |
| piqa | 1 | none | 0 | acc | ↑ | 0.6306 | ± | 0.0113 |
| | | none | 0 | acc_norm | ↑ | 0.6143 | ± | 0.0114 |
| sciq | 1 | none | 0 | acc | ↑ | 0.7520 | ± | 0.0137 |
| | | none | 0 | acc_norm | ↑ | 0.6780 | ± | 0.0148 |
| truthfulqa_mc1 | 2 | none | 0 | acc | ↑ | 0.2411 | ± | 0.0150 |
| truthfulqa_mc2 | 3 | none | 0 | acc | ↑ | 0.4251 | ± | 0.0151 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5122 | ± | 0.0140 |
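Before reading too much into the phase 6 → 6.1 deltas, it is worth checking them against the reported standard errors. A quick sketch (treating the two runs as independent and dividing each delta by the combined stderr, a z-style check):

```python
import math

# acc (value, stderr) copied from the two tables above: phase 6 vs. phase 6.1 (post-DPO).
phase6 = {"arc_easy": (0.4520, 0.0102), "hellaswag": (0.2891, 0.0045),
          "piqa": (0.6219, 0.0113), "sciq": (0.7180, 0.0142),
          "truthfulqa_mc1": (0.2729, 0.0156), "truthfulqa_mc2": (0.4246, 0.0154),
          "winogrande": (0.5091, 0.0141)}
phase61 = {"arc_easy": (0.4689, 0.0102), "hellaswag": (0.2915, 0.0045),
           "piqa": (0.6306, 0.0113), "sciq": (0.7520, 0.0137),
           "truthfulqa_mc1": (0.2411, 0.0150), "truthfulqa_mc2": (0.4251, 0.0151),
           "winogrande": (0.5122, 0.0140)}

def z_scores(a, b):
    """Per-task delta divided by the combined stderr sqrt(se_a^2 + se_b^2)."""
    return {t: (b[t][0] - a[t][0]) / math.hypot(a[t][1], b[t][1]) for t in a}

z = z_scores(phase6, phase61)
significant = [t for t, v in z.items() if abs(v) > 1.96]  # ~95% threshold
```

Under this check none of the per-task changes clears the 95% threshold (sciq comes closest at z ≈ 1.72), so the DPO deltas are within noise task by task.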
Phase 6.2 - Smoltalk2 (No Thinking, Mid-Training)
ethicalabs@pop-os:~/Workspace/Echo-DSRN$ uv run python -m echo_hf.talk --model_path outputs/phase7_sft/checkpoint-2500/ --chat --temperature 0.2
Using device: cuda
Loading model from outputs/phase7_sft/checkpoint-2500/...
Detected LoRA adapter at outputs/phase7_sft/checkpoint-2500/
Loading base model from models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO/...
The module name (originally ) is not a valid Python identifier. Please rename the original module to avoid import issues.
Loading adapter...
Loading tokenizer from models/Echo-DSRN-Small-Kurtis-EON1-v0.4-DPO/...
Starting interactive chat with outputs/phase7_sft/checkpoint-2500/...
Type 'exit' or 'quit' to stop.
Type 'reset' to clear conversation history.
You: Who are you?
Echo: /home/ethicalabs/Workspace/Echo-DSRN/echo_hf/modeling_echo.py:338: UserWarning: Flash Efficient attention on Current AMD GPU is still experimental. Enable it with TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1. (Triggered internally at /pytorch/aten/src/ATen/native/transformers/hip/sdp_utils.cpp:323.)
y = attn_fn(q, k, v, is_causal=is_causal)
/home/ethicalabs/Workspace/Echo-DSRN/echo_hf/modeling_echo.py:338: UserWarning: Mem Efficient attention on Current AMD GPU is still experimental. Enable it with TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL=1. (Triggered internally at /pytorch/aten/src/ATen/native/transformers/hip/sdp_utils.cpp:383.)
y = attn_fn(q, k, v, is_causal=is_causal)
I am Kurtis-EON1, an AI developed by ethicalabs.ai. I'm a creative assistant designed to help people with artistic and technical needs. We craft high-quality art that reflects the unique qualities of each person we serve. Our goal is to make everyday objects feel more authentic and engaging than ever before.
We're here to support artists in their journey towards self-expression and transformation. What do you think about creating art that feels like it belongs on your shelves or at home?
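The UserWarnings interleaved with the reply suggest enabling PyTorch's experimental ROCm SDPA kernels. A minimal sketch, assuming the variable must be set before `torch` is imported (the kernels are experimental per the warning itself, so behaviour on a given AMD GPU is not guaranteed):

```python
import os

# Enable the experimental aotriton flash/mem-efficient attention paths on ROCm,
# as suggested by the warning; set this before importing torch.
os.environ["TORCH_ROCM_AOTRITON_ENABLE_EXPERIMENTAL"] = "1"
```

Alternatively, export it in the shell before launching `echo_hf.talk`.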
Phase 6.3 - Smoltalk2 (No Thinking, Mid-Training)
| Tasks | Version | Filter | n-shot | Metric | | Value | | Stderr |
|---|---|---|---|---|---|---|---|---|
| arc_easy | 1 | none | 0 | acc | ↑ | 0.4764 | ± | 0.0102 |
| | | none | 0 | acc_norm | ↑ | 0.4306 | ± | 0.0102 |
| hellaswag | 1 | none | 0 | acc | ↑ | 0.2914 | ± | 0.0045 |
| | | none | 0 | acc_norm | ↑ | 0.3164 | ± | 0.0046 |
| piqa | 1 | none | 0 | acc | ↑ | 0.6289 | ± | 0.0113 |
| | | none | 0 | acc_norm | ↑ | 0.6202 | ± | 0.0113 |
| sciq | 1 | none | 0 | acc | ↑ | 0.7620 | ± | 0.0135 |
| | | none | 0 | acc_norm | ↑ | 0.6680 | ± | 0.0149 |
| truthfulqa_mc1 | 2 | none | 0 | acc | ↑ | 0.2387 | ± | 0.0149 |
| truthfulqa_mc2 | 3 | none | 0 | acc | ↑ | 0.4282 | ± | 0.0152 |
| winogrande | 1 | none | 0 | acc | ↑ | 0.5067 | ± | 0.0141 |
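To summarize the trend across the three checkpoints, a macro-average of the zero-shot acc values from the tables above (equal task weighting; a blunt summary, and the per-phase deltas remain within per-task noise):

```python
# Zero-shot acc values copied from the three tables above (phases 6, 6.1, 6.3),
# in task order: arc_easy, hellaswag, piqa, sciq, truthfulqa_mc1, truthfulqa_mc2, winogrande.
acc = {
    "phase6":   [0.4520, 0.2891, 0.6219, 0.7180, 0.2729, 0.4246, 0.5091],
    "phase6.1": [0.4689, 0.2915, 0.6306, 0.7520, 0.2411, 0.4251, 0.5122],
    "phase6.3": [0.4764, 0.2914, 0.6289, 0.7620, 0.2387, 0.4282, 0.5067],
}
macro = {phase: sum(v) / len(v) for phase, v in acc.items()}
```

The macro-average creeps up monotonically (≈0.470 → 0.474 → 0.476), driven mostly by arc_easy and sciq, while truthfulqa_mc1 drifts down.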