metadata
title: 'HyperView: Jaguar Embedding Geometry Comparison'
emoji: 🐆
colorFrom: green
colorTo: yellow
sdk: docker
app_port: 7860
pinned: false

HyperView Jaguar Core Claims Demo

This Space compares the top-ranked families from the core claims set across three geometric panels:

  1. Euclidean view: triplet:T0:msv3 (seed 43)
  2. Hyperspherical view: arcface:O0:msv3 (seed 44)
  3. Hyperbolic (Poincaré) view: lorentz:O1:msv3 (seed 44)

The app loads train + validation-tagged samples from a resized Hugging Face dataset and injects precomputed embedding assets generated offline on GPU.

Contracts

Runtime environment variables:

  • HF_DATASET_REPO (default: hyper3labs/jaguar-hyperview-demo)
  • HF_DATASET_CONFIG (default: default)
  • HF_DATASET_SPLIT (default: train)
  • EMBEDDING_ASSET_DIR (default: ./assets)
  • EMBEDDING_ASSET_MANIFEST (default: ${EMBEDDING_ASSET_DIR}/manifest.json)
  • HYPERVIEW_STARTUP_MODE (default: serve_fast; choices: serve_fast|blocking)
  • HYPERVIEW_WARMUP_STATUS_PATH (default: /tmp/hyperview_warmup_status.json)
  • HYPERVIEW_WARMUP_FAILURE_POLICY (default: exit; choices: exit|warn)
  • HYPERVIEW_BATCH_INSERT_SIZE (default: 500; controls sample-batch insertion chunk size)
  • HYPERVIEW_DEFAULT_PANEL (default: spherical3d; makes Sphere 3D the initial scatter panel)
  • HYPERVIEW_LAYOUT_CACHE_VERSION (default: v6; versions the dock-layout localStorage key so that stale cached panel state is invalidated)
  • HYPERVIEW_BIND_HOST (preferred bind host; optional)
  • SPACE_HOST (compatibility input only; used as the bind host only if it is a local address: 0.0.0.0, 127.0.0.1, localhost, ::, or ::1)
  • SPACE_PORT (primary port source)
  • PORT (fallback port source when SPACE_PORT is unset)
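As a rough illustration of how the defaults listed above could be resolved, the sketch below reads a subset of the variables from a mapping. The load_settings helper, its dictionary keys, and the choice to raise on an unknown value are assumptions made for illustration, not the Space's actual code.

```python
import os
from typing import Mapping, Optional

# Allowed choices, copied from the variable list above.
STARTUP_MODES = ("serve_fast", "blocking")
FAILURE_POLICIES = ("exit", "warn")


def load_settings(env: Optional[Mapping[str, str]] = None) -> dict:
    """Resolve a subset of the documented variables (hypothetical helper)."""
    env = os.environ if env is None else env
    asset_dir = env.get("EMBEDDING_ASSET_DIR", "./assets")
    settings = {
        "dataset_repo": env.get("HF_DATASET_REPO", "hyper3labs/jaguar-hyperview-demo"),
        "dataset_config": env.get("HF_DATASET_CONFIG", "default"),
        "dataset_split": env.get("HF_DATASET_SPLIT", "train"),
        "asset_dir": asset_dir,
        # The manifest default is derived from the asset directory.
        "asset_manifest": env.get(
            "EMBEDDING_ASSET_MANIFEST", f"{asset_dir}/manifest.json"
        ),
        "startup_mode": env.get("HYPERVIEW_STARTUP_MODE", "serve_fast"),
        "failure_policy": env.get("HYPERVIEW_WARMUP_FAILURE_POLICY", "exit"),
        "batch_insert_size": int(env.get("HYPERVIEW_BATCH_INSERT_SIZE", "500")),
    }
    # Rejecting unknown choices is an assumption; the real runtime may differ.
    if settings["startup_mode"] not in STARTUP_MODES:
        raise ValueError(f"unknown startup mode: {settings['startup_mode']}")
    if settings["failure_policy"] not in FAILURE_POLICIES:
        raise ValueError(f"unknown failure policy: {settings['failure_policy']}")
    return settings
```

Passing an explicit mapping instead of reading os.environ directly keeps the helper easy to exercise in tests.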

Port precedence: SPACE_PORT > PORT > 7860.

On Hugging Face Spaces, SPACE_HOST may be injected as <space-subdomain>.hf.space. That public domain is not a valid local bind address, so the runtime falls back to 0.0.0.0 unless HYPERVIEW_BIND_HOST is explicitly set.
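The host and port rules can be sketched in a few lines. The resolve_bind function below is a hypothetical helper written from the description in this section; it additionally assumes that an empty variable is treated the same as an unset one.

```python
from typing import Mapping, Tuple

# Local addresses for which SPACE_HOST is accepted as a bind host.
LOCAL_HOSTS = {"0.0.0.0", "127.0.0.1", "localhost", "::", "::1"}


def resolve_bind(env: Mapping[str, str]) -> Tuple[str, int]:
    """Return (host, port) following the precedence described above."""
    host = env.get("HYPERVIEW_BIND_HOST")
    if not host:
        space_host = env.get("SPACE_HOST", "")
        # A value such as <space-subdomain>.hf.space is ignored for binding.
        host = space_host if space_host in LOCAL_HOSTS else "0.0.0.0"
    # Port precedence: SPACE_PORT > PORT > 7860.
    port = int(env.get("SPACE_PORT") or env.get("PORT") or 7860)
    return host, port
```

For example, an environment containing only SPACE_HOST=demo.hf.space and PORT=8000 resolves to ("0.0.0.0", 8000).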

The runtime also patches HyperView's dock-layout cache key from the legacy hyperview:dockview-layout:v5 to hyperview:dockview-layout:${HYPERVIEW_LAYOUT_CACHE_VERSION}, which forces migration away from stale panel layouts after UI/layout changes. For future migrations, increment HYPERVIEW_LAYOUT_CACHE_VERSION (for example, to v7); no code change is needed.
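The key rewrite itself amounts to a string substitution. The sketch below shows only that step; how the runtime locates and patches the frontend asset is not described here, so patch_layout_key and its text-in, text-out interface are assumptions.

```python
LEGACY_LAYOUT_KEY = "hyperview:dockview-layout:v5"


def patch_layout_key(source: str, version: str = "v6") -> str:
    """Replace the legacy localStorage key in a text asset (hypothetical helper)."""
    return source.replace(LEGACY_LAYOUT_KEY, f"hyperview:dockview-layout:{version}")
```

Because browsers look up cached layouts by key, a new key means the old entry is simply never read again.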

Startup and Warmup Semantics

  • HYPERVIEW_STARTUP_MODE=serve_fast (default):
    • Starts the HyperView server immediately.
    • Runs dataset warmup asynchronously in a background thread.
    • Persists warmup progress as JSON while moving through the phases ingest -> spaces -> layouts -> ready.
  • HYPERVIEW_STARTUP_MODE=blocking:
    • Performs warmup synchronously before serving traffic.

Warmup status JSON fields include:

  • status (starting|running|ready|failed)
  • phase (boot|ingest|spaces|layouts|ready|failed)
  • counts (sample/space/layout counters and ingestion stats)
  • error (exception payload when warmup fails)
  • timestamps (started_at, updated_at, plus terminal timestamps)
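A minimal sketch of a status writer with these fields follows. The write_status name and the nesting of the timestamps are assumptions; the real file may lay the fields out differently. Writing to a temporary file and renaming it is one way to keep readers from seeing a half-written document.

```python
import json
import os
import tempfile
import time
from typing import Optional


def write_status(
    path: str,
    status: str,
    phase: str,
    counts: Optional[dict] = None,
    error: Optional[dict] = None,
    started_at: Optional[float] = None,
) -> dict:
    """Write the warmup status JSON atomically (hypothetical helper)."""
    now = time.time()
    payload = {
        "status": status,  # starting | running | ready | failed
        "phase": phase,    # boot | ingest | spaces | layouts | ready | failed
        "counts": counts or {},
        "error": error,
        # Nesting the timestamps under one key is an assumption.
        "timestamps": {
            "started_at": now if started_at is None else started_at,
            "updated_at": now,
        },
    }
    directory = os.path.dirname(path) or "."
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump(payload, handle)
    # os.replace swaps the file in a single step on the same filesystem.
    os.replace(tmp_path, path)
    return payload
```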

Failure policy behavior:

  • HYPERVIEW_WARMUP_FAILURE_POLICY=exit (default): process exits on warmup failure.
  • HYPERVIEW_WARMUP_FAILURE_POLICY=warn: process stays up and records failure in warmup status JSON.
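The interaction between startup mode and failure policy can be sketched as follows. The start function and its parameters are hypothetical. The exit function is injectable so that the sketch can be exercised without ending the process, and os._exit is used as the default because sys.exit in a background thread would end only that thread.

```python
import os
import threading
from typing import Callable, Optional


def start(
    warmup: Callable[[], None],
    serve: Callable[[], None],
    mode: str = "serve_fast",
    failure_policy: str = "exit",
    record_failure: Optional[Callable[[BaseException], None]] = None,
    exit_fn: Callable[[int], None] = os._exit,
) -> Optional[threading.Thread]:
    """Run warmup before or alongside serving (hypothetical sketch)."""

    def guarded_warmup() -> None:
        try:
            warmup()
        except Exception as exc:
            if record_failure is not None:
                record_failure(exc)  # e.g. persist status=failed
            if failure_policy == "exit":
                exit_fn(1)
            # With policy "warn" the process keeps running.

    if mode == "blocking":
        guarded_warmup()  # warmup completes before traffic is served
        serve()
        return None

    # serve_fast: warmup runs in the background while the server starts.
    worker = threading.Thread(target=guarded_warmup, name="warmup", daemon=True)
    worker.start()
    serve()
    return worker
```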

Healthcheck semantics:

  • Container health (/__hyperview__/health) indicates server liveness only.
  • Data readiness (dataset/spaces/layouts completed) is indicated by warmup status JSON (status=ready).
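Since liveness and readiness are separate signals, a client that needs data should check the status file rather than the health endpoint. The sketch below assumes the status file is a JSON object with a top-level status key, as listed above; the data_ready helper is hypothetical.

```python
import json
from pathlib import Path


def data_ready(status_path: str = "/tmp/hyperview_warmup_status.json") -> bool:
    """Return True only when the warmup status file reports status=ready."""
    path = Path(status_path)
    if not path.exists():
        return False  # warmup has not written any status yet
    try:
        payload = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        return False  # unreadable or partially written file
    return isinstance(payload, dict) and payload.get("status") == "ready"
```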

Important Note

HyperView similarity search currently uses cosine distance in storage backends. The Lorentz panel in this Space is intended for embedding-space visualization and geometry-aware comparison rather than canonical Lorentz-distance retrieval scoring.
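To illustrate why the two scores are not interchangeable, the sketch below compares cosine distance with the geodesic distance on the unit hyperboloid (curvature -1, time-like coordinate at index 0). The curvature and coordinate convention are assumptions; this note does not state which convention the lorentz model uses.

```python
import math
from typing import List, Sequence


def cosine_distance(u: Sequence[float], v: Sequence[float]) -> float:
    """1 - cos(angle between u and v)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return 1.0 - dot / (norm_u * norm_v)


def lorentz_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """arccosh(-<x, y>_L) with <x, y>_L = -x0*y0 + sum_i xi*yi."""
    inner = -x[0] * y[0] + sum(a * b for a, b in zip(x[1:], y[1:]))
    # Clamp to 1.0 to absorb rounding error for near-identical points.
    return math.acosh(max(1.0, -inner))


def lift(v: Sequence[float]) -> List[float]:
    """Lift a Euclidean vector onto the hyperboloid: x0 = sqrt(1 + |v|^2)."""
    return [math.sqrt(1.0 + sum(a * a for a in v)), *v]


# Two points in the same spatial direction but at different radii.
a = lift([1.0, 0.0])
b = lift([3.0, 0.0])
print(cosine_distance(a, b))   # small: the vectors are nearly parallel
print(lorentz_distance(a, b))  # much larger: equals asinh(3) - asinh(1)
```

Points that differ mainly in radius look almost identical under cosine distance yet are well separated under the Lorentz metric, so nearest-neighbor rankings from the two can disagree.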

Reproducibility Commands

Run from this folder (HyperViewDemoHuggingFaceSpace/).

1) Build embedding assets (GPU required)

source .venv/bin/activate
python3 scripts/build_hyperview_demo_assets.py \
  --model_manifest config/model_manifest.json \
  --dataset_root ../kaggle_jaguar_dataset_v2 \
  --coreset_csv ../data/validation_coreset.csv \
  --output_dir ./assets \
  --device cuda \
  --batch_size 64 \
  --num_workers 4

2) Publish resized demo dataset

source .venv/bin/activate
python3 scripts/publish_hyperview_demo_dataset.py \
  --dataset_root ../kaggle_jaguar_dataset_v2 \
  --coreset_csv ../data/validation_coreset.csv \
  --output_dir ./dataset_build \
  --repo_id hyper3labs/jaguar-hyperview-demo \
  --config_name default

Use --no_push for local dry-runs.

3) Local Docker smoke run

docker build -t jaguar-hyperview .
docker run --rm -p 7860:7860 \
  -e HF_DATASET_REPO=hyper3labs/jaguar-hyperview-demo \
  -e EMBEDDING_ASSET_DIR=/home/user/app/assets \
  jaguar-hyperview

Open http://127.0.0.1:7860.
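For a scripted smoke check, the health endpoint can be polled until the server answers. The sketch below assumes the endpoint returns HTTP 200 when the server is live; the response body is not documented here and is not inspected. The wait_for_health helper is hypothetical.

```python
import time
import urllib.error
import urllib.request


def wait_for_health(url: str, timeout_s: float = 60.0, interval_s: float = 1.0) -> bool:
    """Poll url until it returns HTTP 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=5) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server not up yet; retry after a short sleep
        time.sleep(interval_s)
    return False


if __name__ == "__main__":
    ok = wait_for_health("http://127.0.0.1:7860/__hyperview__/health")
    print("server live" if ok else "server did not respond in time")
```

A successful check confirms liveness only; data readiness still has to be read from the warmup status JSON.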

4) Optional H100 batch export on HPI

sbatch remote_setup/build_hyperview_demo_assets_h100.slurm

Override defaults at submit time if needed:

MODEL_MANIFEST=config/model_manifest.json \
OUTPUT_DIR=./assets \
sbatch remote_setup/build_hyperview_demo_assets_h100.slurm

Provenance

Model manifest: config/model_manifest.json

Ranking and source-of-truth anchors:

  • reports/summaries_of_findings/core_claims_axis12_paper_facing_tables_2026_03_16_102311/axis1_primary_ranking.csv
  • paper_draft/second_draft/sources_of_truth.md