Z-Image-Turbo ONNX (Browser-Oriented Sharded Packaging)
Browser-oriented ONNX packaging of Tongyi-MAI/Z-Image-Turbo, an S3-DiT image generation model. This bundle keeps the tokenizer, text encoder, VAE, and scheduler in ONNX form and replaces the monolithic transformer with a sharded transformer layout intended for constrained browser and mobile WebGPU environments.
The packaging work here also leaned on the WebNN team's public Z-Image ONNX artifacts as a practical reference point for browser inference: webnn/Z-Image-Turbo.
What Changed
- The WebNN-style transformer packaging was first reordered into a cleaner, more contiguous execution layout. In practice this acts like a "defragmented" transformer graph.
- That reordered transformer was then split into multiple ONNX shards.
- The goal was to reduce peak per-session model size for browser runtimes, especially on tighter mobile GPU memory budgets.
This is still the same underlying Z-Image-Turbo model family and ONNX-based browser inference stack. The main packaging differences are the reordered monolithic transformer and the sharded transformer layout built from it.
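One way a runtime might consume the shards is to create one session per shard and chain them, so that no single session has to hold the full transformer. The sketch below is illustrative only: `ShardSession`, `ShardLoader`, and `runShardsSequentially` are hypothetical names rather than part of this repo or of any ONNX runtime API, and it assumes each shard's outputs feed the next shard's inputs unchanged.

```typescript
// Hypothetical shapes for illustration; a real runtime's session API will differ.
type Tensors = Record<string, Float32Array>;

interface ShardSession {
  run(inputs: Tensors): Promise<Tensors>;
  release(): Promise<void>;
}

// Injected by the caller, e.g. a thin wrapper around the runtime's session factory.
type ShardLoader = (file: string) => Promise<ShardSession>;

async function runShardsSequentially(
  shardFiles: readonly string[],
  loadShard: ShardLoader,
  inputs: Tensors,
): Promise<Tensors> {
  let current = inputs;
  for (const file of shardFiles) {
    // Only one shard session exists at a time in this variant.
    const session = await loadShard(file);
    try {
      // Assumption: the outputs of one shard are the inputs of the next.
      current = await session.run(current);
    } finally {
      // Release even if run() throws, so a failed shard does not stay resident.
      await session.release();
    }
  }
  return current;
}
```

Releasing each session before loading the next trades repeated load time for a lower memory ceiling. A runtime with enough headroom could instead keep all shard sessions alive across denoising steps and still benefit from the smaller per-session size.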
Repository Layout
- tokenizer/: tokenizer files used to turn prompt text into model inputs
- onnx/text_encoder_model_q4f16.onnx + .onnx_data: text encoder that converts tokenized prompts into conditioning embeddings
- onnx/vae_decoder_model_f16.onnx: VAE decoder that turns final latents into pixels
- onnx/vae_pre_process_model_f16.onnx: VAE helper used in the decode path
- onnx/scheduler_step_model_f16.onnx: scheduler update model used between denoising steps to produce the next latent sample
- onnx/transformer_model_q4f16.onnx + .onnx_data: reordered / defragmented monolithic transformer kept as the base form used to build the shards
- onnx/transformer_model_q4f16_shard*.onnx + matching .onnx_data: the main transformer split from the reordered monolithic form into multiple shards to reduce peak session size
- onnx/transformer_shards.json: manifest describing the shard set
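To show how these files might fit together at inference time, the sketch below builds an ordered list of the components a single generation would touch. The file names are copied from the layout above. The ordering (in particular, running the VAE pre-process model before the decoder) and the helper name `buildRunPlan` are assumptions, and shard file names are passed in by the caller rather than guessed, since they would normally come from the manifest.

```typescript
// Paths copied from the repository layout; the execution order is an assumption.
const FILES = {
  tokenizer: "tokenizer/",
  textEncoder: "onnx/text_encoder_model_q4f16.onnx",
  schedulerStep: "onnx/scheduler_step_model_f16.onnx",
  vaePreProcess: "onnx/vae_pre_process_model_f16.onnx",
  vaeDecoder: "onnx/vae_decoder_model_f16.onnx",
} as const;

// Hypothetical helper: lists components in the order one generation might run them.
function buildRunPlan(shardFiles: readonly string[], steps: number): string[] {
  if (!Number.isInteger(steps) || steps < 1) {
    throw new RangeError("steps must be a positive integer");
  }
  // The prompt is tokenized and encoded once, before denoising starts.
  const plan: string[] = [FILES.tokenizer, FILES.textEncoder];
  for (let step = 0; step < steps; step++) {
    // Each denoising step runs every transformer shard in order...
    plan.push(...shardFiles);
    // ...then the scheduler model produces the next latent sample.
    plan.push(FILES.schedulerStep);
  }
  // Assumption: the pre-process helper runs right before the VAE decoder.
  plan.push(FILES.vaePreProcess, FILES.vaeDecoder);
  return plan;
}
```

With two shards and two denoising steps, for example, the plan has ten entries: two for the prompt path, three per step, and two for decoding.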
Notes
- Base model license and usage terms come from the upstream Tongyi-MAI/Z-Image-Turbo release.
- This repo focuses on browser-friendly packaging rather than training or architecture changes.