rydlrKE committed 7939f87 (verified · parent: dc1cc7b)

Cloud Run encoder wiring + startup resilience

Dockerfile ADDED
@@ -0,0 +1,45 @@
+ FROM nvcr.io/nvidia/pytorch:24.10-py3
+
+ # Avoid interactive prompts + make pip quieter/reproducible-ish
+ ENV DEBIAN_FRONTEND=noninteractive \
+     PIP_DISABLE_PIP_VERSION_CHECK=1 \
+     PYTHONDONTWRITEBYTECODE=1 \
+     PYTHONUNBUFFERED=1
+
+ # Where your code will live inside the container
+ WORKDIR /workspace
+
+ # System deps
+ RUN apt-get update && apt-get install -y --no-install-recommends \
+     git curl ca-certificates \
+     cmake build-essential \
+     gosu \
+     && rm -rf /var/lib/apt/lists/*
+
+ # Some base images ship a broken `/usr/local/bin/cmake` shim (from a partial pip install),
+ # which shadows `/usr/bin/cmake` and breaks builds that invoke `cmake` (e.g. MotionCorrection).
+ # Prefer the system cmake.
+ RUN rm -f /usr/local/bin/cmake || true
+
+ # Install from docker_requirements.txt: kimodo editable (-e .),
+ # but MotionCorrection non-editable (./MotionCorrection). The -e . line ensures [project.scripts]
+ # from pyproject.toml are installed (kimodo_gen, kimodo_demo, kimodo_textencoder).
+ # SKIP_MOTION_CORRECTION_IN_SETUP=1 so setup.py does not bundle motion_correction; it is
+ # installed separately from ./MotionCorrection in the requirements file (non-editable).
+ COPY docker_requirements.txt /workspace/docker_requirements.txt
+ COPY setup.py /workspace/setup.py
+ COPY pyproject.toml /workspace/pyproject.toml
+ COPY kimodo /workspace/kimodo
+ COPY MotionCorrection /workspace/MotionCorrection
+
+ RUN --mount=type=cache,target=/root/.cache/pip \
+     python -m pip install --upgrade pip \
+     && SKIP_MOTION_CORRECTION_IN_SETUP=1 python -m pip install -r docker_requirements.txt
+
+ # Use the docker-entrypoint script so the container can run as the actual user instead of root
+ COPY kimodo/scripts/docker-entrypoint.sh /usr/local/bin/docker-entrypoint
+ RUN chmod +x /usr/local/bin/docker-entrypoint
+
+ # Default command (change to your entrypoint if you have one)
+ ENTRYPOINT ["docker-entrypoint"]
+ CMD ["bash"]
README.md CHANGED
@@ -1,37 +1,284 @@
- ---
- title: Movimento
- emoji: 🎬
- colorFrom: blue
- colorTo: green
- sdk: gradio
- sdk_version: 6.14.0
- python_version: '3.12'
- app_file: app.py
- pinned: true
- license: apache-2.0
- short_description: Text-driven multi-character motion planning workspace
- ---
-
- Movimento is a hackathon Space for multi-character motion planning and orchestration.
-
- This Space currently runs a lightweight but feature-complete frontend shell for planning, execution trace, and playback controls.
-
- Implemented pipeline milestones:
- - Card 0: environment readiness gate
- - Card 1: scope lock
- - Card 2: service contracts
- - Card 3: shared state deterministic loop
- - Card 4: Qwen planner adapter
- - Card 5: BONES-SEED ingestion flow
- - Card 6: script-to-Kimodo mapping
- - Card 7: blend quality guardrails
- - Card 8: multi-character scheduler runtime
- - Card 9: AMD runtime bootstrap and health checks
- - Card 10: Gradio Space frontend shell
-
- Next milestone:
- - Card 11: notebook workflow and research pack
-
- Runtime notes:
- - HF bucket data is available for assets and repo snapshots.
- - STL meshes are hosted in dataset `lablab-ai-amd-developer-hackathon/movimento-stl-assets`.
+ <p align="center">
+   <img src="./assets/banner.png" alt="Banner" width="100%">
+   <a href="LICENSE"><img src="https://img.shields.io/badge/License-Apache%202.0-76B900.svg" alt="License"></a>
+   <a href="https://research.nvidia.com/labs/sil/projects/kimodo/"><img src="https://img.shields.io/badge/Project-Page-blue" alt="Project Page"></a>
+   <a href="https://research.nvidia.com/labs/sil/projects/kimodo/docs/index.html"><img src="https://img.shields.io/badge/docs-online-green.svg" alt="Documentation"></a>
+ </p>
+
+ ## Overview
+
+ Kimodo is a **ki**nematic **mo**tion **d**iffusi**o**n model trained on a large-scale (700 hours) commercially-friendly optical motion capture dataset. The model generates high-quality 3D human and robot motions, and is controlled through text prompts and an extensive set of constraints such as full-body pose keyframes, end-effector positions/rotations, 2D paths, and 2D waypoints. Full details of the model architecture and training are available in the [technical report](https://research.nvidia.com/labs/sil/projects/kimodo/assets/kimodo_tech_report.pdf).
+
+ This repository provides:
+ - **Inference**: code and CLI to generate motions on both human and robot skeletons
+ - **Interactive Demo**: easily author motions with a timeline interface of text prompts and kinematic controls
+ - **Annotations**: [additional text descriptions](https://huggingface.co/datasets/nvidia/SEED-Timeline-Annotations) for the [BONES-SEED](https://huggingface.co/datasets/bones-studio/seed) dataset, including fine-grained temporal descriptions
+ - _[Coming Soon]_ **Benchmark**: test cases and evaluation code built on the [BONES-SEED](https://huggingface.co/datasets/bones-studio/seed) dataset to evaluate motion generation models based on text and constraint-following abilities
+
+ <div align="center">
+   <img src="assets/teaser.gif" width="1280">
+ </div>
+
+ ## News
+
+ See the [full changelog](CHANGELOG.md) for a detailed list of all changes.
+
+ - **[2026-03-19]** **Breaking:** Model inputs/outputs now use the SOMA 77-joint skeleton (`somaskel77`).
+ - **[2026-03-16]** Initial open-source release of Kimodo with five model variants (SOMA, G1, SMPL-X), CLI, interactive demo, and timeline annotations for BONES-SEED.
+
+ ## Kimodo Models
+
+ Several variants of Kimodo-v1 are available, trained on different skeletons and datasets. All models support text-to-motion and kinematic controls.
+
+ > Note: models are downloaded automatically when generating from the CLI or Interactive Demo, so there is no need to download them manually.
+
+ | Model | Skeleton | Training Data | Release Date | Hugging Face | License |
+ |:-------|:-------------|:------:|:------:|:-------------:|:-------------:|
+ | **Kimodo-SOMA-RP-v1** | [SOMA](https://github.com/NVlabs/SOMA-X) | [Bones Rigplay 1](https://bones.studio/datasets#rp01) | March 16, 2026 | [Link](https://huggingface.co/nvidia/Kimodo-SOMA-RP-v1) | [NVIDIA Open Model](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) |
+ | **Kimodo-G1-RP-v1** | [Unitree G1](https://github.com/unitreerobotics/unitree_mujoco/tree/main/unitree_robots/g1) | [Bones Rigplay 1](https://bones.studio/datasets#rp01) | March 16, 2026 | [Link](https://huggingface.co/nvidia/Kimodo-G1-RP-v1) | [NVIDIA Open Model](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) |
+ | **Kimodo-SOMA-SEED-v1** | [SOMA](https://github.com/NVlabs/SOMA-X) | [BONES-SEED](https://huggingface.co/datasets/bones-studio/seed) | March 16, 2026 | [Link](https://huggingface.co/nvidia/Kimodo-SOMA-SEED-v1) | [NVIDIA Open Model](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) |
+ | **Kimodo-G1-SEED-v1** | [Unitree G1](https://github.com/unitreerobotics/unitree_mujoco/tree/main/unitree_robots/g1) | [BONES-SEED](https://huggingface.co/datasets/bones-studio/seed) | March 16, 2026 | [Link](https://huggingface.co/nvidia/Kimodo-G1-SEED-v1) | [NVIDIA Open Model](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-open-model-license/) |
+ | **Kimodo-SMPLX-RP-v1** | [SMPL-X](https://github.com/vchoutas/smplx) | [Bones Rigplay 1](https://bones.studio/datasets#rp01) | March 16, 2026 | [Link](https://huggingface.co/nvidia/Kimodo-SMPLX-RP-v1) | [NVIDIA R&D Model](https://www.nvidia.com/en-us/agreements/enterprise-software/nvidia-internal-scientific-research-and-development-model-license/) |
+
+ By default, we recommend using the models trained on the full Bones Rigplay 1 dataset (700 hours of mocap) for your motion generation needs.
+ The models trained on BONES-SEED use 288 hours of [publicly available mocap data](https://huggingface.co/datasets/bones-studio/seed), so they are less capable, but they are useful for comparing your own models trained on the same dataset. Soon, we will release a benchmark to make it easy to compare motion generation models trained on BONES-SEED.
+
+ ## Getting Started
+
+ Please see the full documentation for detailed installation instructions, how to use the CLI and Interactive Demo, and other practical tips for generating motions with Kimodo:
+
+ **[Full Documentation](https://research.nvidia.com/labs/sil/projects/kimodo/docs)**
+ - [Quick Start Guide](https://research.nvidia.com/labs/sil/projects/kimodo/docs/getting_started/quick_start.html)
+ - [Installation Instructions](https://research.nvidia.com/labs/sil/projects/kimodo/docs/getting_started/installation.html)
+ - [Interactive Motion Authoring Demo](https://research.nvidia.com/labs/sil/projects/kimodo/docs/interactive_demo/index.html)
+ - [Command-Line Interface](https://research.nvidia.com/labs/sil/projects/kimodo/docs/user_guide/cli.html)
+ - [API Reference](https://research.nvidia.com/labs/sil/projects/kimodo/docs/api_reference/index.html)
+
+ Some notes on the installation environment:
+ - Kimodo requires ~17GB of VRAM to generate locally, primarily due to the text embedding model
+ - The model has been most extensively tested on GeForce RTX 3090, GeForce RTX 4090, and NVIDIA A100 GPUs, but should work on other recent cards with sufficient VRAM
+ - This repo was developed on Linux, though Windows should also work, especially when using Docker
+
+ Before getting started with motion generation, please review the [best practices](https://research.nvidia.com/labs/sil/projects/kimodo/docs/key_concepts/limitations.html) and be aware of [model limitations](https://research.nvidia.com/labs/sil/projects/kimodo/docs/key_concepts/limitations.html#limitations).
+
+ ## Interactive Motion Authoring Demo
+
+ <div align="center">
+   <img src="assets/demo_screenshot.png" width="1000">
+ </div>
+
+ <br/>
+
+ **[Demo Documentation and Tutorial](https://research.nvidia.com/labs/sil/projects/kimodo/docs/interactive_demo/index.html)**
+
+ The web-based interactive demo provides an intuitive interface for generating motions with any of the Kimodo model variants. After installation, the demo can be launched with the `kimodo_demo` command. It runs locally at http://127.0.0.1:7860; open this URL in your browser to access the interface (or use port forwarding if running on a remote server).
+
+ ### Demo Features
+ - **Multiple Characters**: Supports generating with the SOMA, G1, and SMPL-X versions of Kimodo
+ - **Text Prompts**: Enter one or more natural language descriptions of desired motions on the timeline
+ - **Timeline Editor**: Add and edit keyframes and constrained intervals on multiple constraint tracks
+ - **Constraint Types**:
+   - Full-Body: Complete joint position constraints at specific frames
+   - 2D Root: Define waypoints or full paths to follow on the ground plane
+   - End-Effectors: Control hand and foot positions/rotations
+ - **Constraint Editing**: Editing mode allows re-posing of constraints and adjusting waypoints
+ - **3D Visualization**: Real-time rendering of generated motions with skeleton and skinned mesh options
+ - **Playback Controls**: Preview generated motions with adjustable playback speed
+ - **Multiple Samples**: Generate and compare multiple motion variations
+ - **Examples**: Load pre-existing examples to better understand Kimodo's capabilities
+ - **Export**: Save constraints and generated motions for later use
+
+ ## Command-Line Interface
+
+ **[CLI Documentation and Examples](https://research.nvidia.com/labs/sil/projects/kimodo/docs/user_guide/cli.html)**
+
+ Motions can also be generated directly from the command line with the `kimodo_gen` command or by running `python -m kimodo.scripts.generate` directly.
+
+ **Key Arguments:**
+ - `prompt`: A single text description or a sequence of texts for the desired motion (required)
+ - `--model`: Which Kimodo model to use for generation
+ - `--duration`: Motion duration in seconds
+ - `--num_samples`: Number of motion variations to generate
+ - `--constraints`: Constraint file to control the generated motion (e.g., saved from the web demo)
+ - `--diffusion_steps`: Number of denoising steps
+ - `--cfg_type` / `--cfg_weight`: Classifier-free guidance (`nocfg`, `regular` with one weight, or `separated` with two weights for text vs. constraints); see the [CLI docs](https://research.nvidia.com/labs/sil/projects/kimodo/docs/user_guide/cli.html#classifier-free-guidance-cfg)
+ - `--no-postprocess`: Flag to disable foot-skate and constraint cleanup post-processing
+ - `--seed`: Random seed for reproducible results
+
+ The script supports different output formats depending on which skeleton is used. By default, a custom NPZ format is saved that is compatible with the web demo.
+ For Kimodo-G1 models, motions can be saved in the standard MuJoCo qpos CSV format.
+ For Kimodo-SMPLX, motions can be saved in the standard AMASS NPZ format for compatibility with existing pipelines.
+
+ ### Default NPZ Output Format
+ Generated motions are saved as NPZ files containing:
+ - `posed_joints`: Global joint positions `[T, J, 3]`
+ - `global_rot_mats`: Global joint rotation matrices `[T, J, 3, 3]`
+ - `local_rot_mats`: Local (parent-relative) joint rotation matrices `[T, J, 3, 3]`
+ - `foot_contacts`: Foot contact labels [left heel, left toe, right heel, right toe] `[T, 4]`
+ - `smooth_root_pos`: Smoothed root positions output by the model `[T, 3]`
+ - `root_positions`: The (non-smoothed) trajectory of the actual root joint (e.g., pelvis) `[T, 3]`
+ - `global_root_heading`: The heading direction output by the model `[T, 2]`
+
+ `T` is the number of frames and `J` is the number of joints.
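These arrays can be inspected directly with NumPy. The sketch below fabricates an NPZ with the documented keys so it is self-contained; the file name and the frame/joint counts are made up for illustration and are not produced by `kimodo_gen`:

```python
import numpy as np

T, J = 120, 77  # e.g. 120 frames on the 77-joint SOMA skeleton (illustrative sizes)

# Fabricate an NPZ with the documented keys (values are zero placeholders).
np.savez(
    "motion_sample.npz",
    posed_joints=np.zeros((T, J, 3)),
    global_rot_mats=np.zeros((T, J, 3, 3)),
    local_rot_mats=np.zeros((T, J, 3, 3)),
    foot_contacts=np.zeros((T, 4)),
    smooth_root_pos=np.zeros((T, 3)),
    root_positions=np.zeros((T, 3)),
    global_root_heading=np.zeros((T, 2)),
)

# Load it back the way a consumer of generated motions would.
with np.load("motion_sample.npz") as data:
    for key in data.files:
        print(key, data[key].shape)
```

The same loop works on a real output file; only the shapes (driven by the generated duration and the model's skeleton) will differ.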
+
+ ## Low-Level Python API
+
+ **[Model API Documentation](https://research.nvidia.com/labs/sil/projects/kimodo/docs/api_reference/model.html#kimodo.model.kimodo_model.Kimodo.__call__)**
+
+ For maximum flexibility, the low-level model inference API can be called directly rather than going through the high-level CLI.
+ This allows for advanced model configuration, including classifier-free guidance weights and parameters related to transitions in multi-prompt sequences.
+
+ ## Downstream Robotics Applications of Kimodo
+
+ ### Visualizing G1 Motions with MuJoCo
+
+ <div align="center">
+   <img src="assets/mujoco_result.gif" width="800">
+ </div>
+
+ Motions generated on the G1 robot skeleton and saved in the MuJoCo qpos CSV format can be used and visualized directly within MuJoCo.
+ A minimal visualization script is available:
+ ```bash
+ python -m kimodo.scripts.mujoco_load
+ ```
+ Install MuJoCo and edit the script to point at your CSV file before running it.
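Before wiring a CSV into a playback loop, it can help to sanity-check its shape with NumPy. The sketch below writes and reads a synthetic qpos CSV; the one-frame-per-row, comma-separated layout and the 35-value width are illustrative assumptions here, not a format guaranteed by the script, so check `mujoco_load` for the exact layout:

```python
import numpy as np

# Synthesize a tiny qpos CSV: one frame per row (assumed layout, for illustration).
frames = np.random.default_rng(0).standard_normal((30, 35))  # 30 frames, 35 qpos values
np.savetxt("g1_motion.csv", frames, delimiter=",")

# Load it back the way a visualization loop might, and report the dimensions.
qpos = np.loadtxt("g1_motion.csv", delimiter=",")
print(f"{qpos.shape[0]} frames, {qpos.shape[1]} qpos values per frame")
```

If the loaded width does not match your robot model's `nq`, the CSV and the MuJoCo model are out of sync.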
+
+ ### Tracking Generated Motions with ProtoMotions
+
+ <div align="center">
+   <img src="assets/protomotions_results.gif" width="1280">
+ </div>
+
+ [ProtoMotions](https://github.com/NVlabs/ProtoMotions) is a GPU-accelerated simulation and learning framework for training physically simulated digital humans and humanoid robots. The Kimodo NPZ and CSV output formats are both compatible with ProtoMotions, making it easy to train physics-based policies with motions generated by Kimodo. ProtoMotions supports outputs on both the SOMA skeleton and the Unitree G1.
+
+ After generating motions with Kimodo, head over to the [ProtoMotions docs](https://github.com/NVlabs/ProtoMotions?tab=readme-ov-file#-motion-authoring-with-kimodo) to see how to import them.
+
+ ### Retargeting Motions to Other Robots with GMR
+
+ <div align="center">
+   <img src="assets/gmr_results.gif" width="1280">
+ </div>
+
+ Motions generated by Kimodo-SMPLX can be retargeted to other robots using [General Motion Retargeting (GMR)](https://github.com/YanjieZe/GMR).
+ GMR supports the AMASS NPZ format out of the box, so simply generate motions with Kimodo and use `--output` to save them; the AMASS NPZ is written to `stem_amass.npz` (single sample) or into the output folder (multiple samples). Then, use the [SMPL-X to Robot script](https://github.com/YanjieZe/GMR?tab=readme-ov-file#retargeting-from-smpl-x-amass-omomo-to-robot) in GMR to retarget to any supported robot. For example:
+ ```bash
+ # run within the GMR codebase
+ python scripts/smplx_to_robot.py --smplx_file /path/to/saved/amass_format.npz --robot booster_t1
+ ```
+
+ ## Timeline Annotations for BONES-SEED
+
+ As detailed in the [tech report](https://research.nvidia.com/labs/sil/projects/kimodo/assets/kimodo_tech_report.pdf), Kimodo is trained using fine-grained temporal text annotations of mocap clips.
+ While the full [Rigplay 1](https://bones.studio/datasets#rp01) dataset is proprietary, we have released the temporal segmentations for the public [BONES-SEED](https://huggingface.co/datasets/bones-studio/seed) subset.
+ These annotations are already included in the BONES-SEED dataset, but the standalone labels, along with additional information about them, are [available on HuggingFace](https://huggingface.co/datasets/nvidia/SEED-Timeline-Annotations).
+
+ ## AMD Backend Orchestration (Hackathon Submission)
+
+ For hackathon submission workflows, this repository includes an AMD-oriented orchestration pack:
+
+ - Kubernetes manifests: [orchestration/amd/k8s](orchestration/amd/k8s)
+ - Slurm batch templates: [orchestration/amd/slurm](orchestration/amd/slurm)
+ - Deployment script: [orchestration/amd/deploy_k8s.sh](orchestration/amd/deploy_k8s.sh)
+ - Validation script: [orchestration/amd/validate_orchestration.sh](orchestration/amd/validate_orchestration.sh)
+ - Fireworks AMD planner helper: [orchestration/amd/fireworks_quickstart.sh](orchestration/amd/fireworks_quickstart.sh)
+
+ Quick start:
+
+ ```bash
+ bash orchestration/amd/deploy_k8s.sh
+ bash orchestration/amd/validate_orchestration.sh
+ ```
+
+ The AMD runtime path is controlled through `KIMODO_DEVICE=amd` (with strict/non-strict fallback support in the runtime health checks).
+
+ To route Qwen planning through Fireworks on AMD MI300X, run:
+
+ ```bash
+ export FIREWORKS_API_KEY=<your_key>
+ bash orchestration/amd/fireworks_quickstart.sh
+ FIREWORKS_VALIDATE_ONLY=false FIREWORKS_DEPLOYMENT_ID=kimodo-amd-planner bash orchestration/amd/fireworks_quickstart.sh
+
+ export KIMODO_PLANNER_PROVIDER=fireworks
+ export KIMODO_PLANNER_MODELS=accounts/<account-id>/deployments/<deployment-id>
+ ```
+
+ ## Deployment Matrix (Cloud Run)
+
+ Use these profiles as recommended defaults for rollout strictness.
+
+ | Profile | Goal | Recommended Flags | Notes |
+ |---|---|---|---|
+ | Public demo | Fast public access for hackathon/demo traffic | `PROJECT_ID=movimento-text-encoder REGION=europe-west1 ALLOW_UNAUTHENTICATED=true HF_SECRET_NAME=hf-token GPU_TYPE=nvidia-l4 GPU_COUNT=1 ./cloud-run/deploy.sh` | Enables `allUsers` invoker on encoder/demo services. |
+ | Private staging | Internal validation before public rollout | `PROJECT_ID=movimento-text-encoder REGION=europe-west1 ALLOW_UNAUTHENTICATED=false HF_SECRET_NAME=hf-token GPU_TYPE=nvidia-l4 GPU_COUNT=1 ./cloud-run/deploy.sh` | Keeps services private; run integration checks via authenticated callers only. |
+ | Stricter production | Controlled release with explicit policy + dependency gates | `PROJECT_ID=movimento-text-encoder REGION=europe-west1 ALLOW_UNAUTHENTICATED=false HF_SECRET_NAME=hf-token GPU_TYPE=nvidia-l4 GPU_COUNT=1 ./cloud-run/deploy.sh` | Health gate blocks downstream deploy when encoder is not ready; use IAM and network policy controls before enabling public traffic. |
+
+ Notes:
+ - For H200-style CUDA environments, keep the CUDA runtime path and set `GPU_TYPE` to a Cloud Run GPU type supported in your project/region.
+ - The deploy flow enforces secret placeholder substitution and encoder readiness before demo rollout.
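The placeholder-substitution check that the deploy flow performs can be exercised in isolation. This sketch renders a throwaway manifest fragment with `sed` and fails fast if the placeholder survives, mirroring the pattern used in `cloud-run/deploy.sh` (the fragment and output paths here are illustrative, not the real manifests):

```shell
set -euo pipefail

# Throwaway manifest fragment containing the secret-name placeholder.
cat > /tmp/example-manifest.yaml <<'EOF'
env:
  - name: HF_TOKEN
    valueFrom:
      secretKeyRef:
        name: HF_TOKEN_SECRET_NAME
        key: latest
EOF

# Render the placeholder, then fail fast if any placeholder survived.
sed -e "s|HF_TOKEN_SECRET_NAME|hf-token|g" \
  /tmp/example-manifest.yaml > /tmp/example-rendered.yaml

if grep -q 'HF_TOKEN_SECRET_NAME' /tmp/example-rendered.yaml; then
  echo "placeholder still present in rendered manifest" >&2
  exit 1
fi
echo "rendered OK"
```

Failing the deploy when a placeholder survives is what prevents a service from shipping with a literal `HF_TOKEN_SECRET_NAME` secret reference.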
+
+ ### Health Gate (Required)
+
+ Run the encoder health gate before downstream deployment:
+
+ ```bash
+ PROJECT_ID=movimento-text-encoder REGION=europe-west1 ./cloud-run/health_gate_text_encoder.sh
+ ```
+
+ If the result is `[PASS]`, proceed:
+
+ ```bash
+ PROJECT_ID=movimento-text-encoder REGION=europe-west1 ./cloud-run/deploy.sh
+ ```
+
+ ## Runtime Technology Stack
+
+ - Motion generation runtime: Kimodo diffusion models (PyTorch/CUDA)
+ - Planning layer: Qwen planner adapter (optional Fireworks deployment for AMD workflows)
+ - Model and artifact registry: Hugging Face Hub (token-gated model access for text encoder assets)
+ - Text encoding service: Cloud Run service `movimento-text-encoder`
+ - UI surfaces: Gradio Space frontend and Kimodo demo service
+ - Deployment targets in repo: Cloud Run (primary), HF Space integration, AMD orchestration manifests (Kubernetes/Slurm)
+
+ ## Related Humanoid Work at NVIDIA
+ Kimodo is part of a larger effort to enable humanoid motion data for robotics, physical AI, and other applications.
+
+ Check out these related works:
+ * [SOMA Body Model](https://github.com/NVlabs/SOMA-X) - a unified parametric human body model
+ * [BONES-SEED Dataset](https://huggingface.co/datasets/bones-studio/seed) - a large-scale human(oid) motion capture dataset in SOMA and G1 format
+ * [ProtoMotions](https://github.com/NVlabs/ProtoMotions) - simulation and learning framework for training physically simulated human(oid)s
+ * [SOMA Retargeter](https://github.com/NVIDIA/soma-retargeter) - SOMA to G1 retargeting tool
+ * [GEM](https://github.com/NVlabs/GEM-X) - human motion reconstruction from video
+ * [GEAR SONIC](https://github.com/NVlabs/GR00T-WholeBodyControl) - humanoid behavior foundation model for physical robots
+
+ ## Citation
+
+ If you use this code in your research, please cite:
+
+ ```bibtex
+ @article{Kimodo2026,
+   title={Kimodo: Scaling Controllable Human Motion Generation},
+   author={Rempe, Davis and Petrovich, Mathis and Yuan, Ye and Zhang, Haotian and Peng, Xue Bin and Jiang, Yifeng and Wang, Tingwu and Iqbal, Umar and Minor, David and de Ruyter, Michael and Li, Jiefeng and Tessler, Chen and Lim, Edy and Jeong, Eugene and Wu, Sam and Hassani, Ehsan and Huang, Michael and Yu, Jin-Bey and Chung, Chaeyeon and Song, Lina and Dionne, Olivier and Kautz, Jan and Yuen, Simon and Fidler, Sanja},
+   journal={arXiv:2603.15546},
+   year={2026}
+ }
+ ```
+
+ ## License
+
+ This codebase is licensed under [Apache-2.0](LICENSE). Note that model checkpoints and data are licensed separately, as indicated on the HuggingFace download pages.
+
+ This project will download and install additional third-party open-source software projects. Review the license terms of these open-source projects before use.
+
+ ## Acknowledgments
+
+ This project builds upon excellent open-source projects:
+ - [Viser](https://github.com/nerfstudio-project/viser) for the 3D motion authoring demo
+ - [LLM2Vec](https://github.com/McGill-NLP/llm2vec) for text encoding
+
+ ## Contact
+
+ For questions or issues, please open an issue on this repository or reach out directly to the authors.
+
+ ---
cloud-run/demo.yaml ADDED
@@ -0,0 +1,68 @@
+ apiVersion: serving.knative.dev/v1
+ kind: Service
+ metadata:
+   name: kimodo-demo
+   annotations:
+     run.googleapis.com/launch-stage: GA
+ spec:
+   template:
+     metadata:
+       annotations:
+         autoscaling.knative.dev/minScale: "1"
+         autoscaling.knative.dev/maxScale: "1"
+         run.googleapis.com/execution-environment: gen2
+         run.googleapis.com/gpu-type: GPU_TYPE_PLACEHOLDER
+         run.googleapis.com/gpu-zonal-redundancy-disabled: "true"
+     spec:
+       containerConcurrency: 1
+       timeoutSeconds: 3600
+       containers:
+         - image: REGION-docker.pkg.dev/PROJECT_ID/kimodo/kimodo:latest
+           command: ["python", "-m", "kimodo.demo"]
+           ports:
+             - containerPort: 7860
+           resources:
+             limits:
+               cpu: "8"
+               memory: 24Gi
+               nvidia.com/gpu: "GPU_COUNT_PLACEHOLDER"
+           env:
+             - name: SERVER_NAME
+               value: "0.0.0.0"
+             - name: TEXT_ENCODER_URL
+               value: TEXT_ENCODER_URL_PLACEHOLDER
+             - name: TEXT_ENCODER_MODE
+               value: "api"
+             - name: HF_MODE
+               value: "false"
+             - name: HF_HOME
+               value: /workspace/.cache/huggingface
+             - name: LOCAL_CACHE
+               value: "true"
+             - name: PYTHONUNBUFFERED
+               value: "1"
+             - name: HF_TOKEN
+               valueFrom:
+                 secretKeyRef:
+                   name: HF_TOKEN_SECRET_NAME
+                   key: latest
+             - name: HUGGING_FACE_HUB_TOKEN
+               valueFrom:
+                 secretKeyRef:
+                   name: HF_TOKEN_SECRET_NAME
+                   key: latest
+             - name: HF_HUB_TOKEN
+               valueFrom:
+                 secretKeyRef:
+                   name: HF_TOKEN_SECRET_NAME
+                   key: latest
+             - name: HUGGINGFACEHUB_API_TOKEN
+               valueFrom:
+                 secretKeyRef:
+                   name: HF_TOKEN_SECRET_NAME
+                   key: latest
+             - name: KIMODO_DEFER_MODEL_LOAD
+               value: "true"
+   traffic:
+     - percent: 100
+       latestRevision: true
cloud-run/deploy.sh ADDED
@@ -0,0 +1,152 @@
+ #!/usr/bin/env bash
+ # Deploy kimodo Cloud Run services.
+ # Usage:
+ #   REGION=europe-west1 PROJECT_ID=my-project ./cloud-run/deploy.sh
+ #   REGION=europe-west1 PROJECT_ID=my-project ALLOW_UNAUTHENTICATED=false ./cloud-run/deploy.sh
+ #   REGION=europe-west1 PROJECT_ID=my-project GPU_TYPE=nvidia-h200-141gb GPU_COUNT=1 ./cloud-run/deploy.sh
+ set -euo pipefail
+
+ : "${REGION:?Set REGION (e.g. us-central1)}"
+ : "${PROJECT_ID:?Set PROJECT_ID}"
+ HF_SECRET_NAME="${HF_SECRET_NAME:-hf-token}"
+ ALLOW_UNAUTHENTICATED="${ALLOW_UNAUTHENTICATED:-true}"
+ GPU_TYPE="${GPU_TYPE:-nvidia-l4}"
+ GPU_COUNT="${GPU_COUNT:-1}"
+
+ IMAGE_TAG="$REGION-docker.pkg.dev/$PROJECT_ID/kimodo/kimodo:latest"
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+ GCLOUD_BIN="${GCLOUD_BIN:-}"
+ if [[ -z "$GCLOUD_BIN" ]]; then
+   if command -v gcloud >/dev/null 2>&1; then
+     GCLOUD_BIN="$(command -v gcloud)"
+   elif [[ -x "/workspaces/kimodo/.tools/google-cloud-sdk/bin/gcloud" ]]; then
+     GCLOUD_BIN="/workspaces/kimodo/.tools/google-cloud-sdk/bin/gcloud"
+   else
+     echo "gcloud not found. Set GCLOUD_BIN or install the gcloud CLI."
+     exit 1
+   fi
+ fi
+
+ # ── Resolve the image digest and substitute it into the manifests ───────────
+ IMAGE_DIGEST=$("$GCLOUD_BIN" artifacts docker images list "$REGION-docker.pkg.dev/$PROJECT_ID/kimodo/kimodo" \
+   --include-tags \
+   --filter="tags:latest" \
+   --format="value(version)" \
+   --limit=1)
+
+ if [[ -z "$IMAGE_DIGEST" ]]; then
+   echo "Could not resolve digest for image tag: $IMAGE_TAG"
+   exit 1
+ fi
+
+ IMAGE="$REGION-docker.pkg.dev/$PROJECT_ID/kimodo/kimodo@$IMAGE_DIGEST"
+ echo "Deploying image: $IMAGE"
+ echo "Auth policy (allUsers invoker): $ALLOW_UNAUTHENTICATED"
+ echo "GPU profile: type=$GPU_TYPE count=$GPU_COUNT"
+
+ sed \
+   -e "s|REGION-docker.pkg.dev/PROJECT_ID/kimodo/kimodo:latest|$IMAGE|g" \
+   -e "s|HF_TOKEN_SECRET_NAME|$HF_SECRET_NAME|g" \
+   -e "s|GPU_TYPE_PLACEHOLDER|$GPU_TYPE|g" \
+   -e "s|GPU_COUNT_PLACEHOLDER|$GPU_COUNT|g" \
+   "$SCRIPT_DIR/text-encoder.yaml" > /tmp/text-encoder-rendered.yaml
+
+ sed \
+   -e "s|REGION-docker.pkg.dev/PROJECT_ID/kimodo/kimodo:latest|$IMAGE|g" \
+   -e "s|HF_TOKEN_SECRET_NAME|$HF_SECRET_NAME|g" \
+   -e "s|GPU_TYPE_PLACEHOLDER|$GPU_TYPE|g" \
+   -e "s|GPU_COUNT_PLACEHOLDER|$GPU_COUNT|g" \
+   "$SCRIPT_DIR/demo.yaml" > /tmp/demo-rendered.yaml
+
+ if grep -q 'HF_TOKEN_SECRET_NAME' /tmp/text-encoder-rendered.yaml; then
+   echo "Secret placeholder HF_TOKEN_SECRET_NAME still present in rendered text-encoder manifest"
+   exit 1
+ fi
+
+ if grep -q 'GPU_TYPE_PLACEHOLDER\|GPU_COUNT_PLACEHOLDER' /tmp/text-encoder-rendered.yaml; then
+   echo "GPU placeholders still present in rendered text-encoder manifest"
+   exit 1
+ fi
+
+ if grep -q 'GPU_TYPE_PLACEHOLDER\|GPU_COUNT_PLACEHOLDER' /tmp/demo-rendered.yaml; then
+   echo "GPU placeholders still present in rendered demo manifest"
+   exit 1
+ fi
+
+ if grep -q 'HF_TOKEN_SECRET_NAME' /tmp/demo-rendered.yaml; then
+   echo "Secret placeholder HF_TOKEN_SECRET_NAME still present in rendered demo manifest"
+   exit 1
+ fi
+
+ # ── 1. Deploy text-encoder ──────────────────────────────────────────────────
+ echo "Deploying movimento-text-encoder to $REGION..."
+ "$GCLOUD_BIN" run services replace /tmp/text-encoder-rendered.yaml \
+   --region "$REGION" \
+   --project "$PROJECT_ID"
+
+ if [[ "$ALLOW_UNAUTHENTICATED" == "true" ]]; then
+   "$GCLOUD_BIN" run services add-iam-policy-binding movimento-text-encoder \
+     --region "$REGION" \
+     --project "$PROJECT_ID" \
+     --member "allUsers" \
+     --role "roles/run.invoker" 2>/dev/null || true
+ fi
+
+ TEXT_ENCODER_URL=$("$GCLOUD_BIN" run services describe movimento-text-encoder \
+   --region "$REGION" \
+   --project "$PROJECT_ID" \
+   --format "value(status.url)")
+
+ echo "Text-encoder URL: $TEXT_ENCODER_URL"
+
+ if [[ -z "$TEXT_ENCODER_URL" ]]; then
+   echo "Text encoder URL is empty. Blocking downstream deployment."
+   exit 1
+ fi
+
+ echo "Running encoder health gate before deploying downstream services..."
+ PROJECT_ID="$PROJECT_ID" \
+   REGION="$REGION" \
+   SERVICE_NAME="movimento-text-encoder" \
+   HF_SECRET_NAME="$HF_SECRET_NAME" \
+   DEMO_SERVICE_NAME="kimodo-demo" \
+   "$SCRIPT_DIR/health_gate_text_encoder.sh"
+
+ # ── 2. Inject text-encoder URL into demo manifest and deploy ────────────────
+ sed -i "s|TEXT_ENCODER_URL_PLACEHOLDER|$TEXT_ENCODER_URL/|g" /tmp/demo-rendered.yaml
+
+ if grep -q 'TEXT_ENCODER_URL_PLACEHOLDER' /tmp/demo-rendered.yaml; then
+   echo "TEXT_ENCODER_URL_PLACEHOLDER still present in demo manifest"
+   exit 1
+ fi
+
+ echo "Deploying kimodo-demo to $REGION..."
+ "$GCLOUD_BIN" run services replace /tmp/demo-rendered.yaml \
+   --region "$REGION" \
+   --project "$PROJECT_ID"
+
+ if [[ "$ALLOW_UNAUTHENTICATED" == "true" ]]; then
+   "$GCLOUD_BIN" run services add-iam-policy-binding kimodo-demo \
+     --region "$REGION" \
+     --project "$PROJECT_ID" \
+     --member "allUsers" \
+     --role "roles/run.invoker" 2>/dev/null || true
+ fi
+
+ DEMO_URL=$("$GCLOUD_BIN" run services describe kimodo-demo \
+   --region "$REGION" \
+   --project "$PROJECT_ID" \
+   --format "value(status.url)")
+
+ echo "Re-running encoder health gate to verify the dependency contract after demo deploy..."
+ PROJECT_ID="$PROJECT_ID" \
+   REGION="$REGION" \
+   SERVICE_NAME="movimento-text-encoder" \
+   HF_SECRET_NAME="$HF_SECRET_NAME" \
+   DEMO_SERVICE_NAME="kimodo-demo" \
+   "$SCRIPT_DIR/health_gate_text_encoder.sh"
+
+ echo ""
+ echo "✓ Deployment complete."
+ echo "  Text-encoder: $TEXT_ENCODER_URL"
+ echo "  Demo UI:      $DEMO_URL"
cloud-run/deploy_text_encoder.sh ADDED
@@ -0,0 +1,95 @@
1
+ #!/usr/bin/env bash
2
+ # Deploy movimento text encoder service to Cloud Run.
3
+ # Usage:
4
+ # PROJECT_ID=movimento-text-encoder REGION=europe-west1 HF_TOKEN=hf_xxx ./cloud-run/deploy_text_encoder.sh
5
+ # PROJECT_ID=movimento-text-encoder REGION=europe-west1 HF_SECRET_NAME=hf-token ALLOW_UNAUTHENTICATED=false ./cloud-run/deploy_text_encoder.sh
6
+ # PROJECT_ID=movimento-text-encoder REGION=europe-west1 GPU_TYPE=nvidia-h200-141gb GPU_COUNT=1 ./cloud-run/deploy_text_encoder.sh
7
+ set -euo pipefail
8
+
9
+ : "${PROJECT_ID:?Set PROJECT_ID (e.g. movimento-text-encoder)}"
10
+ REGION="${REGION:-europe-west1}"
11
+ SERVICE_NAME="${SERVICE_NAME:-movimento-text-encoder}"
12
+ REPO_NAME="${REPO_NAME:-kimodo}"
13
+ IMAGE_NAME="${IMAGE_NAME:-kimodo}"
14
+ IMAGE_TAG="${IMAGE_TAG:-latest}"
15
+ HF_SECRET_NAME="${HF_SECRET_NAME:-hf-token}"
16
+ ALLOW_UNAUTHENTICATED="${ALLOW_UNAUTHENTICATED:-true}"
17
+ GPU_TYPE="${GPU_TYPE:-nvidia-l4}"
18
+ GPU_COUNT="${GPU_COUNT:-1}"
19
+
20
+ if ! command -v gcloud >/dev/null 2>&1; then
21
+ echo "gcloud CLI not found. Run this script from Cloud Shell or install gcloud."
22
+ exit 1
23
+ fi
24
+
25
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
26
+ REPO_ROOT="$(cd "$SCRIPT_DIR/.." && pwd)"
27
+ IMAGE_URI="$REGION-docker.pkg.dev/$PROJECT_ID/$REPO_NAME/$IMAGE_NAME:$IMAGE_TAG"
28
+
29
+ echo "[deploy] project=$PROJECT_ID region=$REGION service=$SERVICE_NAME image=$IMAGE_URI"
30
+ echo "[deploy] auth policy (allUsers invoker): $ALLOW_UNAUTHENTICATED"
31
+ echo "[deploy] gpu profile: type=$GPU_TYPE count=$GPU_COUNT"
32
+ gcloud config set project "$PROJECT_ID" >/dev/null
33
+
34
+ echo "[deploy] enabling required APIs"
35
+ gcloud services enable run.googleapis.com cloudbuild.googleapis.com artifactregistry.googleapis.com secretmanager.googleapis.com >/dev/null
36
+
37
+ if ! gcloud artifacts repositories describe "$REPO_NAME" --location="$REGION" >/dev/null 2>&1; then
38
+ echo "[deploy] creating Artifact Registry repo: $REPO_NAME"
39
+ gcloud artifacts repositories create "$REPO_NAME" \
40
+ --repository-format=docker \
41
+ --location="$REGION" \
42
+ --description="Movimento container images"
43
+ fi
44
+
45
+ if [[ -n "${HF_TOKEN:-}" ]]; then
46
+ if ! gcloud secrets describe "$HF_SECRET_NAME" >/dev/null 2>&1; then
47
+ echo "[deploy] creating secret: $HF_SECRET_NAME"
48
+ gcloud secrets create "$HF_SECRET_NAME" --replication-policy="automatic" >/dev/null
49
+ fi
50
+ echo "[deploy] updating secret version: $HF_SECRET_NAME"
51
+ printf '%s' "$HF_TOKEN" | gcloud secrets versions add "$HF_SECRET_NAME" --data-file=- >/dev/null
52
+ else
53
+ echo "[deploy] HF_TOKEN env var not set; expecting existing secret '$HF_SECRET_NAME'"
54
+ gcloud secrets describe "$HF_SECRET_NAME" >/dev/null
55
+ fi
56
+
57
+ PROJECT_NUMBER="$(gcloud projects describe "$PROJECT_ID" --format='value(projectNumber)')"
58
+ RUNTIME_SA="${PROJECT_NUMBER}-compute@developer.gserviceaccount.com"
59
+
60
+ echo "[deploy] granting Secret Manager access to runtime SA: $RUNTIME_SA"
61
+ gcloud secrets add-iam-policy-binding "$HF_SECRET_NAME" \
62
+ --member="serviceAccount:$RUNTIME_SA" \
63
+ --role="roles/secretmanager.secretAccessor" >/dev/null
64
+
65
+ echo "[deploy] building image: $IMAGE_URI"
66
+ gcloud builds submit "$REPO_ROOT" --config "$REPO_ROOT/cloudbuild.yaml" --substitutions="_IMAGE=$IMAGE_URI"
67
+
68
+ echo "[deploy] rendering Cloud Run manifest"
69
+ RENDERED_MANIFEST="/tmp/${SERVICE_NAME}-rendered.yaml"
70
+ sed \
71
+ -e "s|REGION-docker.pkg.dev/PROJECT_ID/kimodo/kimodo:latest|$IMAGE_URI|g" \
72
+ -e "s|HF_TOKEN_SECRET_NAME|$HF_SECRET_NAME|g" \
73
+ -e "s|GPU_TYPE_PLACEHOLDER|$GPU_TYPE|g" \
74
+ -e "s|GPU_COUNT_PLACEHOLDER|$GPU_COUNT|g" \
75
+ "$REPO_ROOT/cloud-run/text-encoder.yaml" > "$RENDERED_MANIFEST"
76
+
77
+ if grep -q 'HF_TOKEN_SECRET_NAME\|GPU_TYPE_PLACEHOLDER\|GPU_COUNT_PLACEHOLDER' "$RENDERED_MANIFEST"; then
78
+ echo "[deploy] rendered manifest still contains placeholders"
79
+ exit 1
80
+ fi
81
+
82
+ echo "[deploy] applying Cloud Run service"
83
+ gcloud run services replace "$RENDERED_MANIFEST" --region "$REGION" --project "$PROJECT_ID"
84
+
85
+ if [[ "$ALLOW_UNAUTHENTICATED" == "true" ]]; then
86
+ echo "[deploy] allowing unauthenticated invoke"
87
+ gcloud run services add-iam-policy-binding "$SERVICE_NAME" \
88
+ --region "$REGION" \
89
+ --project "$PROJECT_ID" \
90
+ --member "allUsers" \
91
+ --role "roles/run.invoker" >/dev/null || true
92
+ fi
93
+
94
+ SERVICE_URL="$(gcloud run services describe "$SERVICE_NAME" --region "$REGION" --project "$PROJECT_ID" --format 'value(status.url)')"
95
+ echo "[deploy] text encoder url: ${SERVICE_URL}/"
cloud-run/health_gate_text_encoder.sh ADDED
@@ -0,0 +1,165 @@
1
+ #!/usr/bin/env bash
2
+ # Cloud Run health gate for movimento-text-encoder.
3
+ # Usage:
4
+ # PROJECT_ID=my-project REGION=europe-west1 ./cloud-run/health_gate_text_encoder.sh
5
+ # PROJECT_ID=my-project REGION=europe-west1 SERVICE_NAME=movimento-text-encoder ./cloud-run/health_gate_text_encoder.sh
6
+
7
+ set -euo pipefail
8
+
9
+ : "${PROJECT_ID:?Set PROJECT_ID}"
10
+ REGION="${REGION:-europe-west1}"
11
+ SERVICE_NAME="${SERVICE_NAME:-movimento-text-encoder}"
12
+ DEMO_SERVICE_NAME="${DEMO_SERVICE_NAME:-kimodo-demo}"
13
+ HF_SECRET_NAME="${HF_SECRET_NAME:-hf-token}"
14
+ GATE_TIMEOUT_SEC="${GATE_TIMEOUT_SEC:-120}"
15
+ GATE_RETRY_INTERVAL_SEC="${GATE_RETRY_INTERVAL_SEC:-5}"
16
+
17
+ if ! command -v gcloud >/dev/null 2>&1; then
18
+ echo "FAIL: gcloud CLI not found"
19
+ exit 2
20
+ fi
21
+
22
+ ENCODER_JSON="$(gcloud run services describe "$SERVICE_NAME" \
23
+ --region "$REGION" \
24
+ --project "$PROJECT_ID" \
25
+ --format=json)"
26
+
27
+ readarray -t ENCODER_FIELDS < <(python - <<'PY' "$ENCODER_JSON" "$HF_SECRET_NAME"
28
+ import json
29
+ import sys
30
+
31
+ service = json.loads(sys.argv[1])
32
+ expected_secret = sys.argv[2]
33
+
34
+ conditions = service.get("status", {}).get("conditions", [])
35
+ ready = "Unknown"
36
+ for cond in conditions:
37
+ if cond.get("type") == "Ready":
38
+ ready = cond.get("status", "Unknown")
39
+ break
40
+
41
+ url = service.get("status", {}).get("url", "")
42
+ latest_ready = service.get("status", {}).get("latestReadyRevisionName", "")
43
+
44
+ traffic = service.get("status", {}).get("traffic", [])
45
+ latest_receives_traffic = "False"
46
+ for item in traffic:
47
+ if item.get("latestRevision") is True and int(item.get("percent", 0)) > 0:
48
+ latest_receives_traffic = "True"
49
+ break
50
+
51
+ spec_env = service.get("spec", {}).get("template", {}).get("spec", {}).get("containers", [{}])[0].get("env", [])
52
+ secret_names = []
53
+ for env in spec_env:
54
+ value_from = env.get("valueFrom") or {}
55
+ key_ref = value_from.get("secretKeyRef") or {}
56
+ name = key_ref.get("name")
57
+ if name:
58
+ secret_names.append(name)
59
+
60
+ secret_wiring = "PASS" if expected_secret in secret_names and "HF_TOKEN_SECRET_NAME" not in secret_names else "FAIL"
61
+
62
+ print(ready)
63
+ print(url)
64
+ print(latest_ready)
65
+ print(latest_receives_traffic)
66
+ print(secret_wiring)
67
+ PY
68
+ )
69
+
70
+ READY_STATUS="${ENCODER_FIELDS[0]}"
71
+ ENCODER_URL="${ENCODER_FIELDS[1]}"
72
+ LATEST_READY_REV="${ENCODER_FIELDS[2]}"
73
+ LATEST_TRAFFIC="${ENCODER_FIELDS[3]}"
74
+ SECRET_WIRING="${ENCODER_FIELDS[4]}"
75
+
76
+ if [[ -z "$ENCODER_URL" ]]; then
77
+ echo "Service Ready: ${READY_STATUS}"
78
+ echo "Revision Traffic: ${LATEST_TRAFFIC}"
79
+ echo "Encoder URL Check: FAIL (missing URL)"
80
+ echo "Secret Wiring: ${SECRET_WIRING}"
81
+ echo "Failure Logs: FAIL"
82
+ echo "Dependency Contract: FAIL"
83
+ echo "[FAIL] Encoder service URL is empty"
84
+ exit 1
85
+ fi
86
+
87
+ deadline=$((SECONDS + GATE_TIMEOUT_SEC))
88
+ endpoint_ok="false"
89
+ latency_ms=""
90
+
91
+ while (( SECONDS < deadline )); do
92
+ if latency_ms=$(python - <<'PY' "$ENCODER_URL"
93
+ import sys
94
+ import time
95
+ import urllib.request
96
+
97
+ url = sys.argv[1]
98
+ start = time.time()
99
+ with urllib.request.urlopen(url, timeout=10) as resp:
100
+ if resp.status < 500:
101
+ elapsed = int((time.time() - start) * 1000)
102
+ print(elapsed)
103
+ else:
104
+ raise RuntimeError(f"status={resp.status}")
105
+ PY
106
+ 2>/dev/null); then
107
+ endpoint_ok="true"
108
+ break
109
+ fi
110
+ sleep "$GATE_RETRY_INTERVAL_SEC"
111
+ done
112
+
113
+ demo_contract="SKIPPED"
114
+ if gcloud run services describe "$DEMO_SERVICE_NAME" --region "$REGION" --project "$PROJECT_ID" >/dev/null 2>&1; then
115
+ DEMO_JSON="$(gcloud run services describe "$DEMO_SERVICE_NAME" --region "$REGION" --project "$PROJECT_ID" --format=json)"
116
+ demo_url_match=$(python - <<'PY' "$DEMO_JSON" "$ENCODER_URL"
117
+ import json
118
+ import sys
119
+
120
+ service = json.loads(sys.argv[1])
121
+ encoder_url = sys.argv[2].rstrip('/') + '/'
122
+
123
+ envs = service.get("spec", {}).get("template", {}).get("spec", {}).get("containers", [{}])[0].get("env", [])
124
+ configured = None
125
+ for env in envs:
126
+ if env.get("name") == "TEXT_ENCODER_URL":
127
+ configured = (env.get("value") or "").rstrip('/') + '/'
128
+ break
129
+
130
+ if configured == encoder_url:
131
+ print("PASS")
132
+ else:
133
+ print("FAIL")
134
+ PY
135
+ )
136
+ demo_contract="$demo_url_match"
137
+ fi
138
+
139
+ echo "Service Ready: ${READY_STATUS}"
140
+ echo "Latest Ready Revision: ${LATEST_READY_REV}"
141
+ echo "Revision Traffic: ${LATEST_TRAFFIC}"
142
+ if [[ "$endpoint_ok" == "true" ]]; then
143
+ echo "Encoder URL Check: PASS (${latency_ms}ms)"
144
+ else
145
+ echo "Encoder URL Check: FAIL (timeout after ${GATE_TIMEOUT_SEC}s)"
146
+ fi
147
+ echo "Secret Wiring: ${SECRET_WIRING}"
148
+ if [[ "$READY_STATUS" == "True" && "$LATEST_TRAFFIC" == "True" ]]; then
149
+ echo "Failure Logs: PASS"
150
+ else
151
+ echo "Failure Logs: FAIL"
152
+ fi
153
+ echo "Dependency Contract: ${demo_contract}"
154
+
155
+ if [[ "$READY_STATUS" != "True" || "$LATEST_TRAFFIC" != "True" || "$endpoint_ok" != "true" || "$SECRET_WIRING" != "PASS" ]]; then
156
+ echo "[FAIL] Cloud Run encoder health gate failed"
157
+ exit 1
158
+ fi
159
+
160
+ if [[ "$demo_contract" == "FAIL" ]]; then
161
+ echo "[FAIL] Demo TEXT_ENCODER_URL does not match encoder URL"
162
+ exit 1
163
+ fi
164
+
165
+ echo "[PASS] Cloud Run encoder health gate passed"
cloud-run/hf_sync_filters.txt ADDED
@@ -0,0 +1,15 @@
1
+ exclude .git/**
2
+ exclude .pytest_cache/**
3
+ exclude .mypy_cache/**
4
+ exclude .ruff_cache/**
5
+ exclude .nox/**
6
+ exclude .tox/**
7
+ exclude .venv/**
8
+ exclude .tools/**
9
+ exclude **/__pycache__/**
10
+ exclude **/*.pyc
11
+ exclude kimodo.egg-info/**
12
+ exclude docs/_build/**
13
+ exclude dist/**
14
+ exclude *.log
15
+ exclude nohup.out
cloud-run/sync_hf_bucket.sh ADDED
@@ -0,0 +1,149 @@
1
+ #!/usr/bin/env bash
2
+ # Sync local repository content to a Hugging Face bucket.
3
+ #
4
+ # Usage examples:
5
+ # HF_TOKEN=hf_xxx ./cloud-run/sync_hf_bucket.sh
6
+ # ./cloud-run/sync_hf_bucket.sh --source ./kimodo --dry-run
7
+ # ./cloud-run/sync_hf_bucket.sh --dest hf://buckets/rydlrKE/movimento-bucket --include-build
8
+
9
+ set -euo pipefail
10
+
11
+ SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
12
+ DEST="hf://buckets/rydlrKE/movimento-bucket/kimodo"
13
+ SOURCE="."
14
+ DRY_RUN=false
15
+ INCLUDE_BUILD=false
16
+ VERBOSE=false
17
+ DELETE_MISSING=false
18
+ FILTER_FILE="$SCRIPT_DIR/hf_sync_filters.txt"
19
+
20
+ while [[ $# -gt 0 ]]; do
21
+ case "$1" in
22
+ --dest)
23
+ DEST="$2"
24
+ shift 2
25
+ ;;
26
+ --source)
27
+ SOURCE="$2"
28
+ shift 2
29
+ ;;
30
+ --dry-run)
31
+ DRY_RUN=true
32
+ shift
33
+ ;;
34
+ --include-build)
35
+ INCLUDE_BUILD=true
36
+ shift
37
+ ;;
38
+ --verbose)
39
+ VERBOSE=true
40
+ shift
41
+ ;;
42
+ --delete)
43
+ DELETE_MISSING=true
44
+ shift
45
+ ;;
46
+ --filter-file)
47
+ FILTER_FILE="$2"
48
+ shift 2
49
+ ;;
50
+ -h|--help)
51
+ cat <<'EOF'
52
+ Sync local files to a Hugging Face bucket.
53
+
54
+ Options:
55
+ --source <path> Local source directory (default: .)
56
+ --dest <hf://...> HF bucket destination (default: hf://buckets/rydlrKE/movimento-bucket)
57
+ --dry-run Print planned actions without uploading
58
+ --include-build Include build/ artifacts (excluded by default)
59
+ --verbose Enable verbose sync output
60
+ --delete Delete destination files missing from source
61
+ --filter-file <path> Use custom include/exclude filter file
62
+ -h, --help Show this help
63
+ EOF
64
+ exit 0
65
+ ;;
66
+ *)
67
+ echo "Unknown argument: $1" >&2
68
+ exit 1
69
+ ;;
70
+ esac
71
+ done
72
+
73
+ if [[ ! -d "$SOURCE" ]]; then
74
+ echo "Source directory not found: $SOURCE" >&2
75
+ exit 1
76
+ fi
77
+
78
+ if ! command -v hf >/dev/null 2>&1; then
79
+ echo "Installing Hugging Face CLI..."
80
+ curl -LsSf https://hf.co/cli/install.sh | bash
81
+ fi
82
+
83
+ echo "Using HF CLI: $(command -v hf)"
84
+ hf --version
85
+
86
+ TOKEN="${HF_TOKEN:-${HUGGING_FACE_HUB_TOKEN:-${HF_HUB_TOKEN:-${HUGGINGFACEHUB_API_TOKEN:-}}}}"
87
+ SYNC_ARGS=()
88
+
89
+ if [[ -n "$TOKEN" ]]; then
90
+ SYNC_ARGS+=(--token "$TOKEN")
91
+ else
92
+ if ! hf auth whoami >/dev/null 2>&1; then
93
+ echo "No valid HF authentication found." >&2
94
+ echo "Set HF_TOKEN (or compatible HF token env var), or run: hf auth login --force" >&2
95
+ exit 1
96
+ fi
97
+ fi
98
+
99
+ SYNC_ARGS+=(
100
+ --exclude ".git/**"
101
+ --exclude ".pytest_cache/**"
102
+ --exclude ".mypy_cache/**"
103
+ --exclude ".ruff_cache/**"
104
+ --exclude ".nox/**"
105
+ --exclude ".tox/**"
106
+ --exclude ".venv/**"
107
+ --exclude ".tools/**"
108
+ --exclude "__pycache__/**"
109
+ --exclude "*/__pycache__/**"
110
+ --exclude "**/__pycache__/**"
111
+ --exclude "*.pyc"
112
+ --exclude "**/*.pyc"
113
+ --exclude "kimodo.egg-info/**"
114
+ --exclude "dist/**"
115
+ --exclude "docs/_build/**"
116
+ )
117
+
118
+ if [[ -n "$FILTER_FILE" ]]; then
119
+ if [[ ! -f "$FILTER_FILE" ]]; then
120
+ echo "Filter file not found: $FILTER_FILE" >&2
121
+ exit 1
122
+ fi
123
+ SYNC_ARGS+=(--filter-from "$FILTER_FILE")
124
+ fi
125
+
126
+ if [[ "$INCLUDE_BUILD" == "false" ]]; then
127
+ SYNC_ARGS+=(--exclude "build/**")
128
+ fi
129
+
130
+ if [[ "$DELETE_MISSING" == "true" ]]; then
131
+ SYNC_ARGS+=(--delete)
132
+ fi
133
+
134
+ if [[ "$DRY_RUN" == "true" ]]; then
135
+ SYNC_ARGS+=(--dry-run)
136
+ fi
137
+
138
+ if [[ "$VERBOSE" == "true" ]]; then
139
+ SYNC_ARGS+=(--verbose)
140
+ fi
141
+
142
+ echo "Syncing source: $SOURCE"
143
+ echo "Syncing destination: $DEST"
144
+ if [[ -n "$FILTER_FILE" ]]; then
145
+ echo "Using filter file: $FILTER_FILE"
146
+ fi
147
+ hf sync "$SOURCE" "$DEST" "${SYNC_ARGS[@]}"
148
+
149
+ echo "Sync complete."
cloud-run/text-encoder.yaml ADDED
@@ -0,0 +1,62 @@
1
+ apiVersion: serving.knative.dev/v1
2
+ kind: Service
3
+ metadata:
4
+ name: movimento-text-encoder
5
+ annotations:
6
+ run.googleapis.com/launch-stage: GA
7
+ spec:
8
+ template:
9
+ metadata:
10
+ annotations:
11
+ autoscaling.knative.dev/minScale: "1"
12
+ autoscaling.knative.dev/maxScale: "1"
13
+ run.googleapis.com/execution-environment: gen2
14
+ run.googleapis.com/gpu-type: GPU_TYPE_PLACEHOLDER
15
+ run.googleapis.com/gpu-zonal-redundancy-disabled: "true"
16
+ spec:
17
+ containerConcurrency: 1
18
+ timeoutSeconds: 900
19
+ containers:
20
+ - image: REGION-docker.pkg.dev/PROJECT_ID/kimodo/kimodo:latest
21
+ command: ["python", "-m", "kimodo.scripts.run_text_encoder_server"]
22
+ ports:
23
+ - containerPort: 9550
24
+ resources:
25
+ limits:
26
+ cpu: "8"
27
+ memory: 24Gi
28
+ nvidia.com/gpu: "GPU_COUNT_PLACEHOLDER"
29
+ env:
30
+ - name: GRADIO_SERVER_NAME
31
+ value: "0.0.0.0"
32
+ - name: TEXT_ENCODER
33
+ value: "llm2vec"
34
+ - name: LOCAL_CACHE
35
+ value: "true"
36
+ - name: HF_HOME
37
+ value: /workspace/.cache/huggingface
38
+ - name: PYTHONUNBUFFERED
39
+ value: "1"
40
+ - name: HF_TOKEN
41
+ valueFrom:
42
+ secretKeyRef:
43
+ name: HF_TOKEN_SECRET_NAME
44
+ key: latest
45
+ - name: HUGGING_FACE_HUB_TOKEN
46
+ valueFrom:
47
+ secretKeyRef:
48
+ name: HF_TOKEN_SECRET_NAME
49
+ key: latest
50
+ - name: HF_HUB_TOKEN
51
+ valueFrom:
52
+ secretKeyRef:
53
+ name: HF_TOKEN_SECRET_NAME
54
+ key: latest
55
+ - name: HUGGINGFACEHUB_API_TOKEN
56
+ valueFrom:
57
+ secretKeyRef:
58
+ name: HF_TOKEN_SECRET_NAME
59
+ key: latest
60
+ traffic:
61
+ - percent: 100
62
+ latestRevision: true
docker_requirements.in ADDED
@@ -0,0 +1,49 @@
1
+ #
2
+ # Human-maintained direct dependencies (top-level).
3
+ # Use `uv` to compile this into a fully pinned `requirements.txt` lockfile.
4
+ #
5
+ # IMPORTANT:
6
+ # - We intentionally do NOT list `torch` here because the Docker image base
7
+ # (`nvcr.io/nvidia/pytorch`) already provides it. Installing torch via pip
8
+ # during image build is slow and can lead to ABI/CUDA mismatches.
9
+ # - If you are NOT using Docker, install an appropriate PyTorch build separately.
10
+ #
11
+
12
+ # Config / wiring
13
+ hydra-core>=1.3
14
+ omegaconf>=2.3
15
+
16
+ # Core numerics
17
+ numpy>=1.23,<2
18
+ scipy>=1.10,<2
19
+
20
+ # Model / embeddings
21
+ # NOTE: `kimodo/model/llm2vec` has only been tested with transformers==5.1.0
22
+ transformers==5.1.0
23
+ urllib3>=2.6.3
24
+ boto3
25
+ peft>=0.12
26
+ einops>=0.7
27
+
28
+ # Misc
29
+ tqdm>=4.0
30
+ packaging>=21.0
31
+ pydantic>=2.0
32
+
33
+ # UI / client
34
+ filelock>=3.20.3
35
+ gradio>=6.8.0
36
+ gradio_client>=1.0
37
+
38
+ # Visualization
39
+ trimesh>=3.21.7
40
+ scenepic>=1.1.0
41
+ pillow>=9.0
42
+ av>=16.1.0
43
+
44
+ py-soma-x @ git+https://github.com/NVlabs/SOMA-X.git
45
+
46
+ # Local packages (editable installs for viser and kimodo; MotionCorrection non-editable)
47
+ ./MotionCorrection
48
+ -e .
49
+ viser @ git+https://github.com/nv-tlabs/kimodo-viser.git
docker_requirements.txt ADDED
@@ -0,0 +1,377 @@
1
+ # This file was autogenerated by uv via the following command:
2
+ # NOTE: `torch` (and its CUDA wheels) are intentionally omitted from this lockfile.
3
+ # The Docker base image (nvcr.io/nvidia/pytorch) already provides a tested PyTorch build.
4
+ #
5
+ # uv pip compile docker_requirements.in -o docker_requirements.txt --python-version 3.10 --python-platform x86_64-manylinux2014
6
+ -e .
7
+ # via -r docker_requirements.in
8
+ viser @ git+https://github.com/nv-tlabs/kimodo-viser.git
9
+ # via -r docker_requirements.in
10
+ py-soma-x @ git+https://github.com/NVlabs/SOMA-X.git
11
+ # via -r docker_requirements.in
12
+ accelerate==1.13.0
13
+ # via peft
14
+ aiofiles==24.1.0
15
+ # via gradio
16
+ annotated-doc==0.0.4
17
+ # via
18
+ # fastapi
19
+ # typer
20
+ annotated-types==0.7.0
21
+ # via pydantic
22
+ antlr4-python3-runtime==4.9.3
23
+ # via
24
+ # hydra-core
25
+ # omegaconf
26
+ anyio==4.12.1
27
+ # via
28
+ # gradio
29
+ # httpx
30
+ # starlette
31
+ attrs==25.4.0
32
+ # via
33
+ # jsonschema
34
+ # referencing
35
+ av==16.1.0
36
+ # via
37
+ # -r docker_requirements.in
38
+ # kimodo
39
+ boto3==1.42.66
40
+ # via
41
+ # -r docker_requirements.in
42
+ # kimodo
43
+ botocore==1.42.66
44
+ # via
45
+ # boto3
46
+ # s3transfer
47
+ brotli==1.2.0
48
+ # via gradio
49
+ certifi==2026.2.25
50
+ # via
51
+ # httpcore
52
+ # httpx
53
+ # requests
54
+ charset-normalizer==3.4.5
55
+ # via
56
+ # requests
57
+ # trimesh
58
+ click==8.3.1
59
+ # via
60
+ # typer
61
+ # uvicorn
62
+ colorlog==6.10.1
63
+ # via trimesh
64
+ einops==0.8.2
65
+ # via
66
+ # -r docker_requirements.in
67
+ # kimodo
68
+ embreex==2.17.7.post7
69
+ # via trimesh
70
+ exceptiongroup==1.3.1
71
+ # via anyio
72
+ fastapi==0.135.1
73
+ # via gradio
74
+ ffmpy==1.0.0
75
+ # via gradio
76
+ filelock==3.25.2
77
+ # via
78
+ # -r docker_requirements.in
79
+ # huggingface-hub
80
+ # kimodo
81
+ # torch
82
+ fsspec==2026.2.0
83
+ # via
84
+ # gradio-client
85
+ # huggingface-hub
86
+ # torch
87
+ gradio==6.9.0
88
+ # via
89
+ # -r docker_requirements.in
90
+ # kimodo
91
+ gradio-client==2.3.0
92
+ # via
93
+ # -r docker_requirements.in
94
+ # gradio
95
+ # kimodo
96
+ groovy==0.1.2
97
+ # via gradio
98
+ h11==0.16.0
99
+ # via
100
+ # httpcore
101
+ # uvicorn
102
+ hf-xet==1.4.0
103
+ # via huggingface-hub
104
+ httpcore==1.0.9
105
+ # via httpx
106
+ httpx==0.28.1
107
+ # via
108
+ # gradio
109
+ # gradio-client
110
+ # huggingface-hub
111
+ # safehttpx
112
+ # trimesh
113
+ huggingface-hub==1.6.0
114
+ # via
115
+ # accelerate
116
+ # gradio
117
+ # gradio-client
118
+ # peft
119
+ # tokenizers
120
+ # transformers
121
+ hydra-core==1.3.2
122
+ # via
123
+ # -r docker_requirements.in
124
+ # kimodo
125
+ idna==3.11
126
+ # via
127
+ # anyio
128
+ # httpx
129
+ # requests
130
+ imageio==2.37.3
131
+ # via viser
132
+ jinja2==3.1.6
133
+ # via
134
+ # gradio
135
+ # torch
136
+ jmespath==1.1.0
137
+ # via
138
+ # boto3
139
+ # botocore
140
+ jsonschema==4.26.0
141
+ # via trimesh
142
+ jsonschema-specifications==2025.9.1
143
+ # via jsonschema
144
+ lxml==6.0.2
145
+ # via
146
+ # trimesh
147
+ # yourdfpy
148
+ manifold3d==3.4.0
149
+ # via trimesh
150
+ mapbox-earcut==2.0.0
151
+ # via trimesh
152
+ markdown-it-py==4.0.0
153
+ # via rich
154
+ markupsafe==3.0.3
155
+ # via
156
+ # gradio
157
+ # jinja2
158
+ mdurl==0.1.2
159
+ # via markdown-it-py
160
+ ./MotionCorrection
161
+ # via -r docker_requirements.in
162
+ msgspec==0.20.0
163
+ # via viser
164
+ nodeenv==1.10.0
165
+ # via viser
166
+ numpy==1.26.4
167
+ # via
168
+ # -r docker_requirements.in
169
+ # accelerate
170
+ # embreex
171
+ # gradio
172
+ # imageio
173
+ # kimodo
174
+ # manifold3d
175
+ # mapbox-earcut
176
+ # motion-correction
177
+ # pandas
178
+ # peft
179
+ # pycollada
180
+ # scenepic
181
+ # scipy
182
+ # shapely
183
+ # transformers
184
+ # trimesh
185
+ # vhacdx
186
+ # viser
187
+ # yourdfpy
188
+ omegaconf==2.3.0
189
+ # via
190
+ # -r docker_requirements.in
191
+ # hydra-core
192
+ # kimodo
193
+ orjson==3.11.7
194
+ # via gradio
195
+ packaging==26.0
196
+ # via
197
+ # -r docker_requirements.in
198
+ # accelerate
199
+ # gradio
200
+ # gradio-client
201
+ # huggingface-hub
202
+ # hydra-core
203
+ # kimodo
204
+ # peft
205
+ # transformers
206
+ pandas==2.3.3
207
+ # via gradio
208
+ peft==0.18.1
209
+ # via
210
+ # -r docker_requirements.in
211
+ # kimodo
212
+ pillow==12.1.1
213
+ # via
214
+ # -r docker_requirements.in
215
+ # gradio
216
+ # imageio
217
+ # kimodo
218
+ # scenepic
219
+ # trimesh
220
+ psutil==7.2.2
221
+ # via
222
+ # accelerate
223
+ # peft
224
+ pycollada==0.9.3
225
+ # via trimesh
226
+ pydantic==2.12.5
227
+ # via
228
+ # -r docker_requirements.in
229
+ # fastapi
230
+ # gradio
231
+ # kimodo
232
+ pydantic-core==2.41.5
233
+ # via pydantic
234
+ pydub==0.25.1
235
+ # via gradio
236
+ pygments==2.19.2
237
+ # via rich
238
+ python-dateutil==2.9.0.post0
239
+ # via
240
+ # botocore
241
+ # pandas
242
+ # pycollada
243
+ python-multipart==0.0.22
244
+ # via gradio
245
+ pytz==2026.1.post1
246
+ # via
247
+ # gradio
248
+ # pandas
249
+ pyyaml==6.0.3
250
+ # via
251
+ # accelerate
252
+ # gradio
253
+ # huggingface-hub
254
+ # omegaconf
255
+ # peft
256
+ # transformers
257
+ referencing==0.37.0
258
+ # via
259
+ # jsonschema
260
+ # jsonschema-specifications
261
+ regex==2026.2.28
262
+ # via transformers
263
+ requests==2.32.5
264
+ # via viser
265
+ rich==14.3.3
266
+ # via
267
+ # typer
268
+ # viser
269
+ rpds-py==0.30.0
270
+ # via
271
+ # jsonschema
272
+ # referencing
273
+ rtree==1.4.1
274
+ # via trimesh
275
+ s3transfer==0.16.0
276
+ # via boto3
277
+ safehttpx==0.1.7
278
+ # via gradio
279
+ safetensors==0.7.0
280
+ # via
281
+ # accelerate
282
+ # peft
283
+ # transformers
284
+ scenepic==1.1.2
285
+ # via
286
+ # -r docker_requirements.in
287
+ # kimodo
288
+ scipy==1.15.3
289
+ # via
290
+ # -r docker_requirements.in
291
+ # kimodo
292
+ # scenepic
293
+ # trimesh
294
+ semantic-version==2.10.0
295
+ # via gradio
296
+ shapely==2.1.2
297
+ # via trimesh
298
+ shellingham==1.5.4
299
+ # via typer
300
+ six==1.17.0
301
+ # via
302
+ # python-dateutil
303
+ # yourdfpy
304
+ starlette==0.52.1
305
+ # via
306
+ # fastapi
307
+ # gradio
308
+ svg-path==7.0
309
+ # via trimesh
310
+ tokenizers==0.22.2
311
+ # via transformers
312
+ tomlkit==0.13.3
313
+ # via gradio
314
+ tqdm==4.67.3
315
+ # via
316
+ # -r docker_requirements.in
317
+ # huggingface-hub
318
+ # kimodo
319
+ # peft
320
+ # transformers
321
+ # viser
322
+ transformers==5.1.0
323
+ # via
324
+ # -r docker_requirements.in
325
+ # kimodo
326
+ # peft
327
+ trimesh==4.11.3
328
+ # via
329
+ # -r docker_requirements.in
330
+ # kimodo
331
+ # viser
332
+ # yourdfpy
333
+ typer==0.24.1
334
+ # via
335
+ # gradio
336
+ # huggingface-hub
337
+ # typer-slim
338
+ typer-slim==0.24.0
339
+ # via transformers
340
+ typing-extensions==4.15.0
341
+ # via
342
+ # anyio
343
+ # exceptiongroup
344
+ # fastapi
345
+ # gradio
346
+ # gradio-client
347
+ # huggingface-hub
348
+ # pydantic
349
+ # pydantic-core
350
+ # referencing
351
+ # starlette
352
+ # torch
353
+ # typing-inspection
354
+ # uvicorn
355
+ # viser
356
+ typing-inspection==0.4.2
357
+ # via
358
+ # fastapi
359
+ # pydantic
360
+ tzdata==2025.3
361
+ # via pandas
362
+ urllib3==2.6.3
363
+ # via
364
+ # -r docker_requirements.in
365
+ # botocore
366
+ # kimodo
367
+ # requests
368
+ uvicorn==0.41.0
369
+ # via gradio
370
+ vhacdx==0.0.10
371
+ # via trimesh
372
+ websockets==15.0.1
373
+ # via viser
374
+ xxhash==3.6.0
375
+ # via trimesh
376
+ yourdfpy==0.0.60
377
+ # via viser
kimodo/demo/app.py CHANGED
@@ -61,8 +61,17 @@ class Demo:
61
  if resolved not in MODEL_NAMES:
62
  raise ValueError(f"Unknown model '{default_model_name}'. Expected one of: {MODEL_NAMES}")
63
  self.default_model_name = resolved
64
  self.ensure_examples_layout()
65
- self.load_model(self.default_model_name)
66
 
67
  # Serialize GPU-bound generation across all clients
68
  self._generation_lock = threading.Lock()
 
61
  if resolved not in MODEL_NAMES:
62
  raise ValueError(f"Unknown model '{default_model_name}'. Expected one of: {MODEL_NAMES}")
63
  self.default_model_name = resolved
64
+ self.defer_model_load = os.getenv("KIMODO_DEFER_MODEL_LOAD", "true").strip().lower() in {
65
+ "1",
66
+ "true",
67
+ "yes",
68
+ "on",
69
+ }
70
  self.ensure_examples_layout()
71
+ if self.defer_model_load:
72
+ print("Deferring model load until first active client session.")
73
+ else:
74
+ self.load_model(self.default_model_name)
75
 
76
  # Serialize GPU-bound generation across all clients
77
  self._generation_lock = threading.Lock()
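
The defer flag's truthiness parsing can be factored into a reusable helper. A sketch (the helper name is illustrative; the accepted values match the diff):

```python
import os

def env_flag(name: str, default: str = "true") -> bool:
    """Boolean env-var parsing matching the KIMODO_DEFER_MODEL_LOAD check."""
    return os.getenv(name, default).strip().lower() in {"1", "true", "yes", "on"}

os.environ["KIMODO_DEFER_MODEL_LOAD"] = " Yes "
deferred = env_flag("KIMODO_DEFER_MODEL_LOAD")   # whitespace and case are tolerated
os.environ["KIMODO_DEFER_MODEL_LOAD"] = "0"
eager = env_flag("KIMODO_DEFER_MODEL_LOAD")      # "0" is not in the truthy set
print(deferred, eager)  # → True False
```

Defaulting to deferred load keeps Cloud Run cold starts inside the startup probe window; the model is then loaded on the first client session.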