# Stable Diffusion v1.5 converted to LiteRT

This repository contains a LiteRT/TFLite export of the Hugging Face model `stable-diffusion-v1-5/stable-diffusion-v1-5`.

## Base variants

- `fp32/`: reference float export used by `android-gpu` and `ios-coreml`
- `int8/`: mixed bundle with fp32 text encoder fallback, PT2E dynamic int8 UNet, and fp32 VAE fallback

## Deployment profiles

- `android-qnn-npu`: LiteRT Qualcomm AI Engine Direct (QNN) (android, preferred accelerator=NPU)
- `android-gpu`: LiteRT GPU delegate (android, preferred accelerator=GPU)
- `android-cpu`: LiteRT CPU/XNNPACK (android, preferred accelerator=CPU)
- `ios-coreml`: LiteRT Core ML delegate (ios, preferred accelerator=CORE_ML)

Profiles are emitted in `conversion_manifest.json` as manifest-level mappings onto the exported base variants. This avoids duplicating large model binaries while still letting each runtime pick backend-specific artifacts.

## Files per exported base variant

- `text_encoder.tflite`
- `unet.tflite`
- `vae_decoder.tflite`

## Shared assets

- `tokenizer/`
- `scheduler/`
- `configs/`
- `configs/text_encoder_runtime_config.json`
- `conversion_manifest.json`

## Notes

- Stable Diffusion v1.5 is a multi-stage pipeline, so this export is split into submodels.
- The notebook first tries to export the text encoder with INT32 token ids for better GPU/Core ML delegate compatibility and records the actual exported input dtype per variant and per deployment profile.
- The fp32 bundle is optional debug output; on CPU runtimes it is skipped by default to avoid kernel deaths during fp32 UNet conversion.
- `android-qnn-npu` is a LiteRT/QNN-oriented deployment profile, not a Qualcomm AOT context binary.
- Both exported base variants are smoke-tested by reloading the serialized LiteRT models and executing inference.
- The preview images in `preview/` are decoder smoke tests, not final text-to-image samples.
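
## Example: resolving a deployment profile

The sketch below illustrates how a runtime might turn a deployment profile into the three submodel paths of its base variant. The schema of `conversion_manifest.json` is not documented in this README, so the `profiles`, `base_variant`, and `accelerator` keys, the helper `resolve_profile`, and the mapping of `android-qnn-npu` and `android-cpu` onto `int8/` are all assumptions made for illustration. Check the actual manifest before relying on any of these names.

```python
import json
from pathlib import PurePosixPath

# Hypothetical manifest content. The real conversion_manifest.json may use
# different keys; only the profile names, variant folders, and file names
# come from this README.
SAMPLE_MANIFEST = json.loads("""
{
  "profiles": {
    "android-qnn-npu": {"base_variant": "int8", "accelerator": "NPU"},
    "android-gpu":     {"base_variant": "fp32", "accelerator": "GPU"},
    "android-cpu":     {"base_variant": "int8", "accelerator": "CPU"},
    "ios-coreml":      {"base_variant": "fp32", "accelerator": "CORE_ML"}
  }
}
""")

# Files listed under "Files per exported base variant".
SUBMODEL_FILES = ("text_encoder.tflite", "unet.tflite", "vae_decoder.tflite")


def resolve_profile(manifest, profile):
    """Return {submodel name: relative path} for one deployment profile."""
    profiles = manifest["profiles"]
    if profile not in profiles:
        raise ValueError(f"unknown deployment profile: {profile!r}")
    variant = profiles[profile]["base_variant"]
    # Each profile points at a shared variant folder, so no binary is duplicated.
    return {
        PurePosixPath(name).stem: str(PurePosixPath(variant) / name)
        for name in SUBMODEL_FILES
    }


if __name__ == "__main__":
    for name, path in resolve_profile(SAMPLE_MANIFEST, "android-gpu").items():
        print(f"{name}: {path}")
```

To use this against the real file, replace `SAMPLE_MANIFEST` with the result of `json.load` on `conversion_manifest.json` and adjust the key names to match what the manifest actually contains.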