
# Training Status

Training is ongoing; runs are tracked in the W&B project `eren23/crucible-lewm`.

## Current Status

| Variant | Epoch | safetensors | LQ40 export | Benchmarked | Converged |
|---|---|---|---|---|---|
| baseline 192d/6e/6p | 100+ | available | ✓ (full, Q4 pred) | ✓ | ✓ |
| slim_96d/4e/4p | 1 | ✓ | ✓ (full, q4, f32) | ✓ | ✗ |
| hybrid_ALAL_64d/4e/4p | 1 | pending | ✓ (full) | partial | ✗ |
| slim_48d/2e/2p | 1 | pending | pending | ✗ | ✗ |
| slim_64d/3e/3p | 1 | pending | pending | ✗ | ✗ |
| slim_96d/2e/3p | 1 | pending | pending | ✗ | ✗ |
| slim_128d/4e/4p | 1 | pending | pending | ✗ | ✗ |
| slim_192d/4e/4p | 1 | pending | pending | ✗ | ✗ |
| elastic_fixed100 | 1 | pending | pending | ✗ | ✗ |

## Expected Quality Improvement

The baseline expert (100+ epochs) achieves cosine similarity 0.999 vs f32 at INT8+Q4. All slim variants are currently at epoch 1.

Conservative estimates for the slim variants at epoch 100:

- slim_96d/4e/4p: cos may improve from 0.9982 → 0.999+
- hybrid_ALAL: cos may improve from ~0.98 → ~0.995+
- WANDA variants: would benefit most from longer training + fine-tuning
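The cos numbers above compare a quantized model's output against the f32 reference on the same input. A minimal sketch of that metric (the function name is hypothetical, not part of the Synapse codebase; outputs are assumed to be flat float arrays):

```python
import numpy as np

def cos_vs_f32(f32_out: np.ndarray, quant_out: np.ndarray) -> float:
    """Cosine similarity between the f32 reference output and the
    quantized model's output, flattened to 1-D."""
    a = f32_out.ravel().astype(np.float64)
    b = quant_out.ravel().astype(np.float64)
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Identical outputs give 1.0; small quantization noise keeps it near 1.
ref = np.array([0.1, -0.5, 2.0, 0.3])
noisy = ref + 1e-4
print(cos_vs_f32(ref, noisy))
```

A value of 0.999 therefore means the quantized output is nearly colinear with the f32 output, not that per-element errors are below 0.001.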

## Benchmarking Checklist

When new checkpoints complete training:

- Download from W&B
- Convert to safetensors + config.json
- Benchmark f32 on Apple Silicon (encode, predict, 20-step rollout)
- Export to LQ40 (full, q4-pred, f32)
- Benchmark INT8+Q4 on Apple Silicon
- Benchmark on ESP32-P4 (if hardware available)
- Run WASM browser benchmark
- Compare cos vs baseline expert
- Update model README with results
- Push to HuggingFace
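The INT8+Q4 step in the checklist measures how much quantization perturbs the model relative to f32. As a toy illustration only (symmetric per-tensor INT8 round-trip; the actual LQ40 scheme is not reproduced here):

```python
import numpy as np

def int8_roundtrip(w: np.ndarray) -> np.ndarray:
    """Symmetric per-tensor INT8 quantize, then dequantize."""
    scale = np.abs(w).max() / 127.0
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q.astype(np.float32) * scale

rng = np.random.default_rng(0)
w = rng.standard_normal(1024).astype(np.float32)
w_q = int8_roundtrip(w)

# Cosine similarity between f32 and dequantized tensors, as in the table.
cos = float(w @ w_q / (np.linalg.norm(w) * np.linalg.norm(w_q)))
print(f"cos vs f32: {cos:.4f}")
```

Even this crude scheme lands very close to 1.0 on well-behaved tensors, which is why the per-variant deltas in the table above are reported to three or four decimal places.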

## How to Contribute Benchmarks

If you run these models on different hardware, please open an issue or PR with:

1. Hardware platform
2. Software version (Synapse commit)
3. Benchmark results
4. Any issues or observations
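For example, a report following the four points above might look like this (every value below is a placeholder, not a real result):

```markdown
## Benchmark report: slim_96d/4e/4p

1. Hardware platform: <board / SoC, RAM, OS>
2. Software version: Synapse commit <commit-sha>
3. Benchmark results:
   - f32 predict: <N> ms/step
   - INT8+Q4 predict: <N> ms/step
   - cos vs baseline expert: <value>
4. Issues / observations: <free text>
```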