# Resume

## Main conclusions from this chat

1. The main issue is data/sample construction, not just checkpoint choice.
   - `class_id` is token-level.
   - Labels are context-level and depend on the sampled `T_cutoff`.
   - Balanced token classes do not imply balanced future outcomes.
   - The model can easily learn an over-negative prior if the cache is built naively.
2. Cache balancing must happen at cache generation time.
   - Train-time weighting is too late to fix disk waste or missing context diversity.
   - The cache should control:
     - class balance
     - good/bad context balance within each class
3. Validation needs token isolation, not just class balancing.
   - The same token appearing in train and val through different contexts makes validation misleading.
4. Movement-threshold circularity is no longer the main blocker.
   - Movement labels are downstream of realized returns.
   - Movement thresholds should not control cache construction.
5. OHLC is important, but current usage looks like a trend/regime summary, not true pattern detection.
   - The chart branch is being used,
   - but probe tests suggest it mostly acts as continuation/trend context,
   - not as breakout / support-resistance / head-and-shoulders intelligence.
6. The current code is still regression-first.
   - The main target is multi-horizon return prediction.
   - There are also:
     - a quality head
     - a movement head
   - Calling the movement head "auxiliary" only matters if its loss contribution is actually secondary.

## What was implemented in code

### 1. Validation split

- `train.py`
- The validation split was changed to group by token identity (`source_token` / `token_address`) instead of only `class_id`.

### 2. Task tracker

- `TASK_LIST.md`
- Created as the running checklist for this work.

### 3. Context weighting signal

- `data/data_loader.py`
- Added a context-quality summary derived from realized multi-horizon returns.
- The current code computes:
  - `context_score`
  - `context_bucket` (`good` / `bad`)
- This is used for weighting and cache metadata.

### 4. Cached dataset weighting

- `data/data_loader.py`
- Sampler weights now account for:
  - class balance
  - good/bad context balance inside each class

### 5. Weighted cache generation

- `cache_dataset.py`
- The canonical builder now writes a context cache only.
- No `cache_mode`.
- No `max_samples`.
- The cache budget is context-based:
  - `--target_contexts_total`
  - or `--target_contexts_per_class`
- Quota-driven acceptance is applied before save:
  - level 1: class quotas
  - level 2: good/bad quotas within each class

### 6. Metadata rebuild

- `scripts/rebuild_metadata.py`
- Now rebuilds:
  - `file_class_map`
  - `file_context_bucket_map`
  - `file_context_summary_map`

### 7. `min_trades`

- Re-added properly.
- No longer a dead CLI arg.
- Now actually controls dataset/context eligibility thresholds.

### 8. Evaluate-script OHLC probes

- `scripts/evaluate_sample.py`
- Added these OHLC probe modes:
  - `ohlc_reverse`
  - `ohlc_shuffle_chunks`
  - `ohlc_mask_recent`
  - `ohlc_trend_only`
  - `ohlc_summary_shuffle`
  - `ohlc_detrend`
  - `ohlc_smooth`

### 9. Bad mint metadata fallback

- `data/data_loader.py`
- If mint metadata has invalid or zero supply/decimals, it now defaults to:
  - `total_supply = 1000000000000000`
  - `token_decimals = 6`

## Current cache-builder interface

`cache_dataset.py` now uses a canonical context-cache path only.

Relevant args:

- `--output_dir`
- `--start_date`
- `--min_trade_usd`
- `--min_trades`
- `--context_length`
- `--samples_per_token`
- `--target_contexts_per_class`
- `--target_contexts_total`
- `--good_ratio_nonzero`
- `--good_ratio_class0`
- `--num_workers`
- DB connection args

Rules:

- Choose exactly one of:
  - `--target_contexts_per_class`
  - `--target_contexts_total`
- If quota-driven caching is used, the current implementation expects `--num_workers 1`.

`--samples_per_token` is still present:

- It is not a cache budget.
- It is a candidate-generation knob.
- It may still be removable later if a better attempt-based planner replaces it.
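The two-level quota-driven acceptance described above (class quotas, then good/bad quotas within each class) can be sketched roughly as follows. This is an illustration only, not the actual `cache_dataset.py` code: the class name `QuotaTracker` and the single `good_ratio` parameter are hypothetical (the real builder distinguishes `--good_ratio_nonzero` and `--good_ratio_class0`).

```python
from collections import defaultdict

class QuotaTracker:
    """Illustrative two-level acceptance filter for cache building.

    Level 1: reject once a class has reached its per-class quota.
    Level 2: within a class, reject once the good or bad context
    bucket has reached its share of the class quota.
    """

    def __init__(self, per_class_quota: int, good_ratio: float = 0.5):
        self.per_class_quota = per_class_quota
        self.good_ratio = good_ratio
        self.counts = defaultdict(lambda: {"good": 0, "bad": 0})

    def accept(self, class_id: int, context_bucket: str) -> bool:
        c = self.counts[class_id]
        # Level 1: class quota.
        if c["good"] + c["bad"] >= self.per_class_quota:
            return False
        # Level 2: good/bad quota within the class.
        share = self.good_ratio if context_bucket == "good" else 1 - self.good_ratio
        if c[context_bucket] >= self.per_class_quota * share:
            return False
        c[context_bucket] += 1
        return True
```

A builder loop would call `accept(class_id, context_bucket)` for each candidate context before writing it to disk, and stop once all quotas are full. Note that this tracker is stateful and not thread-safe, which is consistent with the current `--num_workers 1` requirement for quota-driven caching.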
## Important conceptual corrections from this chat

1. Binary context type matters operationally, but a naive binary rule is dangerous.
   - Examples like `+1%` followed by `-20%` show why simplistic rules fail.
   - Context typing should come from realized multi-horizon behavior, not one crude shortcut.
2. Patching alone does not force pattern learning.
   - It only makes local pattern use possible.
   - The model can still rely on trend shortcuts unless the representation and training setup make that harder.
3. Support/resistance may be inferable from the current chart inputs in principle.
   - But the current encoder likely compresses too early and learns an easier shortcut instead.
4. For this question, the bottleneck is not just "what loss?" but also:
   - chart input representation
   - encoder compression
   - whether the model can preserve local 1s structure

## OHLC probe findings from this chat

The key probe result, repeated across runs:

- `ohlc_detrend` had the largest impact.
- `ohlc_trend_only` had the second-largest impact.
- `ohlc_smooth`, `ohlc_shuffle_chunks`, `ohlc_summary_shuffle`, `ohlc_reverse`, and `ohlc_mask_recent` had very small impact.

Interpretation:

- The model uses OHLC mostly as:
  - broad trend/regime context
  - a continuation-style directional signal
- Not as:
  - a local chart-pattern detector
  - support/resistance-aware trader logic
  - breakout/rejection/fair-value-gap style reasoning

This strongly suggests that many of the model's bearish/bad predictions are driven by trend-continuation behavior in the chart branch.

## What was explicitly rejected or corrected

- Do not keep treating `cache_mode` as a real concept.
- Do not let `max_samples` and context-budget args overlap.
- Do not call something "auxiliary" if its loss weight can dominate optimization.
- Do not assume OHLC importance implies chart-pattern understanding.
- Do not answer architecture questions by guessing from intention; inspect the actual code.

## What still matters next

1. Verify token-overlap assertions are enforced at train/val time, not just in the token-grouped split logic.
2. Rebuild the cache with the new quota-based builder and inspect the actual distributions.
3. Update the cache audit tooling.
4. Decide whether `samples_per_token` should be replaced by a better attempt-based planner.
5. Decide how the chart branch should evolve if the goal is trader-like 1s pattern reasoning rather than trend summary.

## Short current state

- Cache construction is much closer to the right direction now.
- Validation logic is less broken than before.
- OHLC is confirmed important,
- but OHLC currently behaves more like a trend/summary branch than a technical-pattern branch.
- The codebase still trains primarily on return regression, with extra heads layered on top.
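The token-grouped validation split plus the missing overlap assertion can be sketched as below. This is a minimal illustration, not the `train.py` implementation: the sample shape (a dict with a `token_address` key) and the hash-based assignment are assumptions, and the real loader may key on `source_token` instead.

```python
import hashlib

def token_grouped_split(samples, val_fraction=0.2):
    """Deterministically route each sample to train or val by hashing its
    token identity, so every context of a given token lands in one split.
    Hypothetical sample shape: dicts carrying a `token_address` key."""
    train, val = [], []
    for s in samples:
        digest = hashlib.sha256(s["token_address"].encode()).hexdigest()
        bucket = int(digest, 16) % 100
        (val if bucket < val_fraction * 100 else train).append(s)
    return train, val

def assert_no_token_overlap(train, val):
    """Hard check at train time: the split is invalid if any token
    appears in both sets, even through different contexts."""
    overlap = ({s["token_address"] for s in train}
               & {s["token_address"] for s in val})
    assert not overlap, f"token leakage between train and val: {sorted(overlap)[:5]}"
```

The point of the separate assertion is next-step item 1: grouping logic alone is not a guarantee, so `assert_no_token_overlap` should run on the final datasets every time training starts, catching leakage introduced by caching, resampling, or later code changes.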