!pip uninstall geolip-svae -y
!pip install "git+https://github.com/AbstractEyes/geolip-svae.git"

D=2 internal matmul, experiment 1. The target is D=16, represented internally by D=2 slices.

This takes about a quarter of the time of a full D=16 representation.

Success will allow full Cayley rope Spearman, not partial.
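As a rough illustration of why a D=2 slice is cheap, the sketch below computes the singular values of a V x 2 slice in closed form from its 2 x 2 Gram matrix, then applies that to each pair of columns of a wider matrix. This is not the geolip-svae implementation; the function names `singular_values_vx2` and `sliced_singular_values` are hypothetical, and the per-slice values are not the singular values of the full D=16 matrix.

```python
import math


def singular_values_vx2(rows):
    """Singular values of a V x 2 matrix given as a list of (x, y) rows.

    Uses the 2 x 2 Gram matrix G = M^T M = [[a, b], [b, c]], whose
    eigenvalues have a closed form; singular values are their square roots.
    """
    a = sum(x * x for x, _ in rows)
    b = sum(x * y for x, y in rows)
    c = sum(y * y for _, y in rows)
    mean = 0.5 * (a + c)
    radius = math.hypot(0.5 * (a - c), b)
    # Clamp at zero to guard against tiny negative values from rounding.
    return math.sqrt(mean + radius), math.sqrt(max(mean - radius, 0.0))


def sliced_singular_values(matrix):
    """Split the columns into consecutive pairs and solve each slice alone.

    Assumption for illustration: slices are adjacent column pairs
    (0,1), (2,3), ... so a 16-column matrix yields 8 slices.
    """
    n_cols = len(matrix[0])
    if n_cols % 2 != 0:
        raise ValueError("column count must be even")
    result = []
    for start in range(0, n_cols, 2):
        rows = [(row[start], row[start + 1]) for row in matrix]
        result.append(singular_values_vx2(rows))
    return result


if __name__ == "__main__":
    identity = [[1.0 if i == j else 0.0 for j in range(16)] for i in range(16)]
    print(len(sliced_singular_values(identity)), "slices")
```

Each slice needs only three dot products and one square root of a discriminant, with no iterative solver, which is one plausible reason the sliced path is so much cheaper than a full decomposition.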

geolip.linalg backend:
  CUDA:       yes
  Triton:     3.6.0
  FL eigh:    enabled
  Triton SVD: enabled
  GPU:        NVIDIA RTX PRO 6000 Blackwell Server Edition
Computing target CV for V=16, D=16 on S^15...

══════════════════════════════════════════════════════════════════════
  SpectralViT - Pure SpectralCell Transformer
══════════════════════════════════════════════════════════════════════
SpectralViT:
  Patch embed:  12,544
  Cayley PE:    8,192 (128 rotation planes × 64 positions)
  Cells (6×):  4,920,768  (820,128 per cell)
    D=16, V=16, hidden=256
    CM cells: [2, 5] (2 primary)
    SVD split: 2 full + 4 sliced (D=2 × 8)
  LayerNorms:   3,584
  Classifier:   91,492
  Cross-attn:   13,632 (clipped at 0.5)
  Total:        5,036,580
  Architecture: PatchEmbed → CayleyPE → 6× SpectralCell (CM every 3) → pool → classify
  Soft hand: target_cv=0.1984 σ=0.15 boost=1.0
  CV penalty: 0.01 (differentiable through cm_vol2)
  EMA momentum: 0.99
  Grad clip: 0.5 cross-attn only, uncapped otherwise
  CutMix: α=1.0 prob=0.5
  Optimizer: Adam lr=0.001
  CIFAR-100, 200 epochs

  Initial profiling (3 warmup + 1 measured)...

  ┌─ PROFILE FORWARD COMPONENTS ──────────────────────────────┐
  │  cell_5_CM_fmt             21.4ms   40.3%  █████████████
  │  cell_2_CM                 21.3ms   40.3%  █████████████
  │  cell_0_cd                  2.5ms    4.8%  █
  │  cell_3_cd                  2.5ms    4.7%  █
  │  cell_1_cd                  2.5ms    4.7%  █
  │  cell_4_cd                  2.5ms    4.7%  █
  │  cayley_pe                  0.2ms    0.3%
  │  patch_embed                0.1ms    0.2%
  │  classify                   0.1ms    0.1%
  │  TOTAL                     53.0ms
  └───────────────────────────────────────────────────────────┘

  ┌─ PROFILE FULL TRAIN STEP ─────────────────────────────────┐
  │  forward                   54.0ms   38.1%  ████████████
  │  backward                  52.5ms   37.0%  ████████████
  │  optim_step                28.8ms   20.3%  ██████
  │  grad_clip                  3.5ms    2.5%
  │  loss                       3.1ms    2.2%
  │  TOTAL                    141.8ms
  └───────────────────────────────────────────────────────────┘

  ┌─ PROFILE CELL INTERNALS (cell_0) ─────────────────────────┐
  │  recompose                  0.7ms   22.6%  ███████
  │  enc_mlp                    0.5ms   17.4%  █████
  │  dec_mlp                    0.5ms   15.5%  █████
  │  patchwork                  0.4ms   14.1%  ████
  │  cross_attn                 0.4ms   13.2%  ████
  │  svd_sliced_8x_D2           0.2ms    7.2%  ██
  │  pairwise_d2                0.2ms    6.3%  ██
  │  normalize                  0.1ms    3.7%  █
  │  cm_validation              0.0ms    0.0%
  │  TOTAL                      3.0ms
  └───────────────────────────────────────────────────────────┘
  ep  1 acc=5.1% ★ train=3.3% ema_cv=0.1645 boost=1.987 lr=0.001000
    S=[1.83, 1.65, 1.48, 1.36...0.07]  PE angles: mean=0.0237 max=0.1333
    Top 5:  c20=75% c73=68% c53=57% c82=52% c52=50% 
    Bot 5:  c65=0% c16=0% c17=0% c59=0% c58=0% 
    Mean: 5.1%  Std: 14.0%

  ┌─ PROFILE FORWARD ep1 ─────────────────────────────────────┐
  │  cell_2_CM                 20.7ms   40.0%  █████████████
  │  cell_5_CM_fmt             20.6ms   39.7%  █████████████
  │  cell_0_cd                  2.8ms    5.4%  █
  │  cell_1_cd                  2.4ms    4.7%  █
  │  cell_3_cd                  2.4ms    4.6%  █
  │  cell_4_cd                  2.4ms    4.6%  █
  │  patch_embed                0.2ms    0.5%
  │  cayley_pe                  0.2ms    0.4%
  │  classify                   0.1ms    0.2%
  │  TOTAL                     51.8ms
  └───────────────────────────────────────────────────────────┘

  ┌─ PROFILE STEP ep1 ────────────────────────────────────────┐
  │  forward                   51.4ms   48.8%  ████████████████
  │  backward                  49.6ms   47.1%  ███████████████
  │  optim_step                 3.6ms    3.4%  █
  │  grad_clip                  0.5ms    0.5%
  │  loss                       0.2ms    0.2%
  │  TOTAL                    105.3ms
  └───────────────────────────────────────────────────────────┘

  ┌─ PROFILE CELL INTERNALS ep1 ──────────────────────────────┐
  │  svd_full_D16              18.8ms   89.7%  █████████████████████████████
  │  dec_mlp                    0.4ms    2.1%
  │  enc_mlp                    0.4ms    2.1%
  │  patchwork                  0.4ms    1.7%
  │  cross_attn                 0.3ms    1.6%
  │  cm_validation              0.3ms    1.3%
  │  pairwise_d2                0.2ms    0.8%
  │  recompose                  0.1ms    0.5%
  │  normalize                  0.1ms    0.4%
  │  TOTAL                     21.0ms
  └───────────────────────────────────────────────────────────┘
  ep  2 acc=7.9% ★ train=5.9% ema_cv=0.2097 boost=1.990 lr=0.001000
    S=[1.85, 1.64, 1.48, 1.36...0.04]  PE angles: mean=0.0359 max=0.1985
    Top 5:  c53=59% c24=53% c60=50% c82=49% c18=46% 
    Bot 5:  c74=0% c72=0% c16=0% c68=0% c19=0% 
    Mean: 7.9%  Std: 14.1%
  ep  3 acc=9.8% ★ train=7.2% ema_cv=0.2033 boost=1.996 lr=0.000999
    S=[1.89, 1.63, 1.46, 1.32...0.06]  PE angles: mean=0.0480 max=0.3018
    Top 5:  c60=62% c82=59% c53=53% c43=50% c52=47% 
    Bot 5:  c75=0% c74=0% c19=0% c72=0% c22=0% 
    Mean: 9.8%  Std: 15.6%
  ep  4 acc=10.4% ★ train=8.4% ema_cv=0.1922 boost=1.999 lr=0.000999
    S=[1.90, 1.61, 1.47, 1.31...0.04]  PE angles: mean=0.0562 max=0.3470
  ep  5 acc=11.7% ★ train=9.4% ema_cv=0.2211 boost=1.994 lr=0.000998
    S=[1.90, 1.61, 1.44, 1.30...0.05]  PE angles: mean=0.0628 max=0.3809
Ep   7:  13%|█▎        | 26/195 [00:02<00:17,  9.73it/s]

Bear with this one today. The idea is to discover the most usable form of geometric-alignment positional encoding, specifically targeting the spectral cell. This will allow direct internalized embedding structures to represent the complex k4 simplex system without the need for conv or transformer positional controllers.

This structure will be touch and go and the first forms will be fragile.

The first prototype is a Cayley-rotational position alignment similar to the original constellation.
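For readers unfamiliar with the Cayley transform, here is a minimal sketch of how one scalar parameter can define a rotation in a single 2D plane, and how a list of such parameters rotates consecutive coordinate pairs of a vector. It assumes one learnable scalar per rotation plane and planes formed from adjacent coordinates; both are assumptions for illustration, and the names `cayley_rotation_2d`, `rotation_angle`, and `apply_plane_rotations` are hypothetical rather than taken from geolip-svae.

```python
import math


def cayley_rotation_2d(a):
    """Cayley transform Q = (I - A)^-1 (I + A) for A = [[0, -a], [a, 0]].

    The result is always an exact rotation matrix, for any real a.
    """
    denom = 1.0 + a * a
    cos_t = (1.0 - a * a) / denom
    sin_t = 2.0 * a / denom
    return [[cos_t, -sin_t], [sin_t, cos_t]]


def rotation_angle(a):
    """Angle of the rotation produced by parameter a (theta = 2 * atan(a))."""
    return 2.0 * math.atan(a)


def apply_plane_rotations(vec, params):
    """Rotate coordinate pairs (0,1), (2,3), ... each by its own parameter."""
    if len(vec) != 2 * len(params):
        raise ValueError("need exactly one parameter per coordinate pair")
    out = []
    for k, a in enumerate(params):
        q = cayley_rotation_2d(a)
        cos_t, sin_t = q[0][0], q[1][0]
        x, y = vec[2 * k], vec[2 * k + 1]
        out.extend([cos_t * x - sin_t * y, sin_t * x + cos_t * y])
    return out


if __name__ == "__main__":
    # A small parameter gives a small angle: 2 * atan(0.05) is about 0.1 rad.
    print(rotation_angle(0.05))
    print(apply_plane_rotations([1.0, 0.0, 0.0, 2.0], [1.0, 0.0]))
```

Because the map from the parameter to the matrix uses only rational operations, it stays orthogonal without any explicit re-normalization, and a parameter of zero is the identity, so an encoding initialized near zero starts close to "no positional change".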

The second will be the Cantor staircase, Beatrix staircase, stereoscopic, waveform interpolation, kymatio scatterpoint2d, and multiple adjacent structures. This will involve many tests; I expect over 80.

When complete, a successful positional encoding will replace the bulk CONV with a compact explicit representation.

These tokens can be directly interpreted by traditional rotary transformers and other transformer structures, and they introduce the first functional geolip-svd-transformer system integration for packaged reuse.

By this time tomorrow I expect a series of prototypes to be functional for the geolip-transformer, replacing the earlier invalid observer structure with the surge-training geolip-svd-transformer line. This will give a properly profiled battery showing what is best, what is fastest, what is most effectively quick, and the rounded median that the default geolip-transformer will encompass.

If the surge training paradigm is successful, this will introduce the same rapid-fire learning that the actual batteries experience.

Surge.

If unsuccessful, I continue from the faults.

When successful, the full geolip-conduit-battery structure will be integrated as well, allowing FiLM, LoRA, and any other form of training possible for any model you wish, with snap-in observer capability.
