True Ternary Refactor 10 — Sidecar Int8 Verification And Full-System Smoke

Scope

The sidecar model files are now available locally:

  • arbitor/encoders/models/dinov2-small
  • arbitor/encoders/models/moonshine-base
  • arbitor/encoders/models/pig-vae/model.safetensors

This pass verifies int8 quantization for the available sidecars and checks the full multimodal path through the Triton-backed ternary system.

Sidecar Quantization Metadata

Added explicit metadata to quantized imported sidecars:

_arb_quantize_requested
_arb_quantized
_arb_quantized_int8

The helper that sets these attributes also freezes every sidecar parameter after quantization.

Applied to:

  • ImageSequencer.vit
  • AudioSequencer.audio_encoder
  • pig_vae loader paths
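As a rough illustration of the metadata-plus-freeze step described above, the following is a minimal sketch, not the actual helper in this repository. The function name `mark_quantized_and_freeze` and the `TinySidecar` stand-in class are hypothetical; only the three `_arb_*` attribute names come from this document. The stand-in mimics the `parameters()` interface of a `torch.nn.Module` so the sketch runs without PyTorch.

```python
from types import SimpleNamespace


def mark_quantized_and_freeze(module, requested, quantized):
    """Hypothetical helper: record quantization metadata, then freeze parameters."""
    module._arb_quantize_requested = requested
    module._arb_quantized = bool(quantized)
    # Assumption: the int8 flag is only set when int8 was both requested and applied.
    module._arb_quantized_int8 = bool(quantized) and requested == "int8"
    for p in module.parameters():
        p.requires_grad = False  # frozen parameters receive no gradient updates
    return module


class TinySidecar:
    """Stand-in for a torch.nn.Module; only parameters() is needed here."""

    def __init__(self):
        self._params = [SimpleNamespace(requires_grad=True) for _ in range(3)]

    def parameters(self):
        return iter(self._params)


sidecar = mark_quantized_and_freeze(TinySidecar(), requested="int8", quantized=True)
trainable = sum(1 for p in sidecar.parameters() if p.requires_grad)
print(sidecar._arb_quantized_int8, trainable)  # True 0
```

With a real PyTorch module the same loop applies unchanged, since `nn.Module.parameters()` yields tensors with a writable `requires_grad` flag.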

Verified Int8 Sidecars

ImageSequencer(quantize_weights='int8'):

image quant_requested= int8
image quantized_int8= True
image quantized= True
image trainable= 0
image quant_classes= {'QConv2d': 1, 'QLinear': 72}

AudioSequencer(quantize_weights='int8'):

audio quant_requested= int8
audio quantized_int8= True
audio quantized= True
audio trainable= 0
audio quant_classes= {'QLinear': 128}
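A `quant_classes` summary like the ones above could be produced by counting module class names. The sketch below is an assumption about how such a count might be computed, not the code used for this pass. The function `quant_class_counts`, the `Q` prefix filter, and the stand-in classes are hypothetical; the stand-ins only borrow the class names seen in the logs and expose a `modules()` method shaped like the one on `torch.nn.Module`.

```python
from collections import Counter


def quant_class_counts(model, prefix="Q"):
    """Count submodules whose class name starts with `prefix` (assumed convention)."""
    names = (type(m).__name__ for m in model.modules())
    return dict(Counter(n for n in names if n.startswith(prefix)))


# Stand-in classes so the sketch runs without torch or a quantization library.
class QLinear: ...
class QConv2d: ...
class LayerNorm: ...


class ToyModel:
    def __init__(self):
        self._mods = [QConv2d(), QLinear(), LayerNorm(), QLinear()]

    def modules(self):
        return iter(self._mods)


counts = quant_class_counts(ToyModel())
print(counts)  # {'QConv2d': 1, 'QLinear': 2}
```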

Focused forward smokes:

image_forward_ok (1, 254, 512) True
audio_forward_ok (1, 36, 512) True

Full Multimodal CUDA Smoke

ARBModel(enable_image=True, enable_audio=True, enable_vq=True, enable_graph=True, enable_memory_modules=True, enable_moe=True):

image_quantized_int8 True
audio_quantized_int8 True
full_multimodal_cuda_train_smoke_ok logits=(1, 8, 297), targets=(1, 7), indices=(1, 298), loss=17.4929

This exercised:

  • int8 DINO/ViT sidecar
  • int8 Moonshine sidecar
  • ternary image/audio projections
  • multimodal VQ bridge
  • graph path with Triton aggregation/gather kernels
  • MoE path with Triton dense combine kernel
  • output router / byte head
  • backward and _ternary_update_memory()

Full model audit with image/audio enabled:

logical ternary weights: 42,669,632
ternary training state: 55.72 MB
trainable float params: 0 tensors, 0.00 MB
frozen float params: 433 tensors, 318.80 MB
float buffers: 406 tensors, 0.00 MB

The frozen float params belong to imported sidecars. Their compute modules were verified as Quanto int8 wrappers where supported by the active environment.
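The trainable/frozen split in the audit can be thought of as a partition of parameters by `requires_grad`, with sizes summed per group. The following is a minimal sketch under stated assumptions: the function name `float_param_audit` is hypothetical, MB is taken as 10^6 bytes (the audit tool may use a different unit), and the stand-in objects expose `numel()` and `element_size()` the way PyTorch tensors do.

```python
from types import SimpleNamespace


def float_param_audit(params):
    """Return {group: (tensor_count, megabytes)} for trainable and frozen params."""
    totals = {"trainable": [0, 0], "frozen": [0, 0]}
    for p in params:
        group = "trainable" if p.requires_grad else "frozen"
        totals[group][0] += 1
        totals[group][1] += p.numel() * p.element_size()
    # Assumption: 1 MB = 1e6 bytes.
    return {g: (n, nbytes / 1e6) for g, (n, nbytes) in totals.items()}


def fake_param(n_elements, bytes_per_element, requires_grad):
    """Stand-in for a tensor: just enough surface for the audit above."""
    return SimpleNamespace(
        requires_grad=requires_grad,
        numel=lambda: n_elements,
        element_size=lambda: bytes_per_element,
    )


# Two frozen float32 tensors of 250,000 elements each, no trainable floats.
params = [fake_param(250_000, 4, False), fake_param(250_000, 4, False)]
report = float_param_audit(params)
print(report)  # {'trainable': (0, 0.0), 'frozen': (2, 2.0)}
```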

pig-vae Status

pig-vae/model.safetensors is present locally, and the loader now applies the same quantization metadata/freeze path as the vision/audio sidecars.

Runtime verification is blocked in this Python environment because diffusers is not installed:

RuntimeError: pig-vae requires the optional diffusers dependency.

The system Python is externally managed, so I did not force-install packages into it. To verify pig-vae in this checkout, create/use a project venv and install:

pip install -e ".[diffusers]"

Then run:

python - <<'PY'
from arbitor.encoders.pig_vae import load_vae
vae = load_vae(device='cpu', quantize='int8')
print(vae.vae._arb_quantized_int8)
PY
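The `RuntimeError` above is typical of an optional-dependency guard. A minimal sketch of such a guard follows; the function name `require_optional` is hypothetical and the real loader may check for diffusers differently. It uses only `importlib.util.find_spec` from the standard library, which returns `None` when a top-level package cannot be found.

```python
import importlib.util


def require_optional(package, feature):
    """Raise a clear error when an optional top-level package is not installed."""
    if importlib.util.find_spec(package) is None:
        raise RuntimeError(f"{feature} requires the optional {package} dependency.")


# A package name chosen to be absent, so the guard's failure path is exercised.
try:
    require_optional("arb_missing_package_for_demo", "pig-vae")
except RuntimeError as exc:
    message = str(exc)
    print(message)

# An installed module passes the guard silently.
require_optional("json", "demo")
```

Checking with `find_spec` avoids importing the package just to test for its presence, which keeps the failure fast and the message explicit.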

Kernel Coverage

Current Triton-backed full-system paths:

  • packed ternary linear forward/backward/update
  • packed ternary embedding forward/backward/update
  • ternary RMSNorm
  • E residual update
  • T_accum update
  • Graph edge weighting + target aggregation
  • Graph VQ-index gather + residual add
  • MoE dense route combine
  • VideoHead denoise update

Remaining non-fused control loops:

  • VideoHead diffusion/halting loop
  • Graph hop loop
  • MoE ACT iteration loop

Those loops control repeated computation and halting. Their bodies call Triton kernels, but the loops themselves are not yet fused into persistent monolithic kernels.
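To make the distinction concrete, here is a minimal sketch of a non-fused control loop: the loop and the halting test live outside the kernel, and each iteration makes one call into a kernel-backed step. All names here (`run_until_halt`, `toy_step`, the halting threshold) are hypothetical and chosen for illustration; the actual VideoHead, graph, and MoE loops may use different halting rules.

```python
def run_until_halt(step, state, max_steps=8, threshold=0.99):
    """Call step() repeatedly until the accumulated halting mass reaches threshold."""
    halt_mass = 0.0
    steps = 0
    while steps < max_steps and halt_mass < threshold:
        # In a real system this call would launch one or more GPU kernels.
        state, p_halt = step(state)
        halt_mass += p_halt
        steps += 1
    return state, steps


def toy_step(x):
    """Stand-in for a kernel-backed update: halve the state, emit halting mass 0.4."""
    return x * 0.5, 0.4


final_state, n_steps = run_until_halt(toy_step, 16.0)
print(final_state, n_steps)  # 2.0 3
```

A persistent kernel would move both the iteration and the halting check inside a single launch, removing the per-step launch overhead that this structure pays.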

Verification

  • python -m py_compile arbitor/components.py arbitor/sequencers.py arbitor/encoders/audio.py arbitor/encoders/pig_vae.py arbitor/main.py arbitor/vq.py arbitor/kernel/ternary_scale.py arbitor/kernel/ternary_audit.py
  • python -m pytest -q testing/test_tscale.py -k "cuda_triton_correctness_update_E or cuda_triton_tscale_path": 2 passed
  • full multimodal CUDA smoke passed with image/audio sidecars enabled