Z-Image-Turbo ONNX (Browser-Oriented Sharded Packaging)

Browser-oriented ONNX packaging of Tongyi-MAI/Z-Image-Turbo, an S3-DiT image generation model. This bundle ships the tokenizer files alongside ONNX models for the text encoder, VAE, and scheduler, and replaces the monolithic transformer with a sharded transformer layout intended for constrained browser and mobile WebGPU environments.

The packaging work here also leaned on the WebNN team's public Z-Image ONNX artifacts as a practical reference point for browser inference: webnn/Z-Image-Turbo.

What Changed

  • The WebNN-style transformer packaging was first reordered into a cleaner, more contiguous execution layout. In practice this acts like a "defragmented" transformer graph.
  • That reordered transformer was then split into multiple ONNX shards.
  • The goal was to reduce peak per-session model size for browser runtimes, especially on tighter mobile GPU memory budgets.

This is still the same underlying Z-Image-Turbo model family and ONNX-based browser inference stack. The main packaging differences are the reordered monolithic transformer and the sharded transformer layout built from it.
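At inference time the packaged components fit together as a standard few-step diffusion loop: the text encoder produces conditioning embeddings once, the transformer and scheduler alternate per step, and the VAE decoder runs last. The sketch below uses mock session objects standing in for onnxruntime-web `InferenceSession`s, so the call order is illustrative only; real inputs and outputs are tensors, and the exact input/output names of these ONNX graphs are not documented here.

```javascript
// Illustrative wiring of the denoising loop for the packaged components.
// Each "session" is a mock in place of an onnxruntime-web InferenceSession;
// strings stand in for tensors purely to show the data flow.
const mock = (name) => ({ run: (x) => `${name}(${x})` });

const textEncoder = mock("text_encoder");
const transformer = mock("transformer"); // monolithic or sharded form
const scheduler   = mock("scheduler_step");
const vaeDecoder  = mock("vae_decoder");

function generate(prompt, steps) {
  const cond = textEncoder.run(prompt);          // prompt -> conditioning embeddings
  let latents = "noise";                         // initial latent sample
  for (let i = 0; i < steps; i++) {
    const pred = transformer.run(`${latents},${cond}`); // denoising prediction
    latents = scheduler.run(pred);               // scheduler update -> next latents
  }
  return vaeDecoder.run(latents);                // final latents -> pixels
}

const out = generate("a cat", 2);
```

With the sharded layout, the single `transformer.run` call above would instead execute the shards in manifest order, feeding each shard's outputs to the next.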

Repository Layout

  • tokenizer/: tokenizer files used to turn prompt text into model inputs
  • onnx/text_encoder_model_q4f16.onnx + .onnx_data: text encoder that converts tokenized prompts into conditioning embeddings
  • onnx/vae_decoder_model_f16.onnx: VAE decoder that turns final latents into pixels
  • onnx/vae_pre_process_model_f16.onnx: VAE helper used in the decode path
  • onnx/scheduler_step_model_f16.onnx: scheduler update model used between denoising steps to produce the next latent sample
  • onnx/transformer_model_q4f16.onnx + .onnx_data: reordered / defragmented monolithic transformer kept as the base form used to build the shards
  • onnx/transformer_model_q4f16_shard*.onnx + matching .onnx_data: the main transformer split from the reordered monolithic form into multiple shards to reduce peak session size
  • onnx/transformer_shards.json: manifest describing the shard set
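The schema of `transformer_shards.json` is not documented here, but assuming a minimal shape that lists shard filenames in execution order, resolving shard URLs before creating per-shard sessions could look like the following sketch (the `shards` and `file` field names are assumptions, not the repo's actual schema):

```javascript
// Resolve shard URLs from a hypothetical transformer_shards.json manifest.
// "shards" and "file" are assumed field names; the real schema may differ.
function shardUrls(baseUrl, manifest) {
  return manifest.shards.map((s) => new URL(s.file, baseUrl).href);
}

// Example with a made-up two-shard manifest:
const manifest = {
  shards: [
    { file: "transformer_model_q4f16_shard0.onnx" },
    { file: "transformer_model_q4f16_shard1.onnx" },
  ],
};
const urls = shardUrls("https://example.com/onnx/", manifest);
```

Each resolved URL would then back one onnxruntime-web session, created and released in sequence so that only one shard (plus its `.onnx_data` file) needs to be resident at a time.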

Notes

  • Base model license and usage terms come from the upstream Tongyi-MAI/Z-Image-Turbo release.
  • This repo focuses on browser-friendly packaging rather than training or architecture changes.