Could you please suggest which Android library should be used to run this model?

#3
by samirsayyed - opened

I attempted to use the model with Google's LiteRT-LM (.litertlm) and LiteRT (.tflite) libraries, but it appears that the model is not supported.
Could you please guide me on how to properly set up this model in an Android application?

In LiteRT-LM v0.9.0-beta, loading the Qwen3.5 model (.litertlm) results in an error (the same error occurs with both the Kotlin API and the Gallery app):

  Failed to initialize LiteRTLM Engine for model=qwen35_2b_model_multimodal.litertlm with runtime=0.9.0-beta
com.google.ai.edge.litertlm.LiteRtLmJniException: Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:2173]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:319]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:51]
at com.google.ai.edge.litertlm.LiteRtLmJni.nativeCreateEngine(Native Method)
at com.google.ai.edge.litertlm.Engine.initialize(Engine.kt:72)
at com.example.litertlmdemo.MainActivity$initializeEngine$1.invokeSuspend(MainActivity.kt:262)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:34)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:101)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:113)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:589)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:823)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:720)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:707)
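
For reference, the initialization path that produces this trace looks roughly like the following. This is a minimal sketch, not confirmed working code: only the `com.google.ai.edge.litertlm` package, the `Engine` class, its `initialize()` method, and `LiteRtLmJniException` are confirmed by the stack trace above; the `EngineConfig` fields and the `Backend` selector are my assumptions about the LiteRT-LM Kotlin API and may not match your release.

```kotlin
import com.google.ai.edge.litertlm.Engine
// EngineConfig and Backend are assumed names; check them against
// the LiteRT-LM Kotlin API docs for your version.
import com.google.ai.edge.litertlm.EngineConfig
import com.google.ai.edge.litertlm.Backend

// Builds and initializes the engine for a .litertlm file that has
// already been copied onto the device (e.g. under filesDir).
suspend fun initEngine(modelPath: String): Engine {
    val config = EngineConfig(
        modelPath = modelPath,
        backend = Backend.CPU, // assumed backend selector
    )
    val engine = Engine(config)
    // Engine.initialize() is the call that throws
    // LiteRtLmJniException in the trace above.
    engine.initialize()
    return engine
}
```

Note the failure here happens inside `nativeCreateEngine`, i.e. before any session or prompt handling is reached.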

I pushed a new version yesterday, possibly shortly before this comment. Please pull the new model file.

When I get back to my machine, I can drop an example of how I am using the model in Android.

Any updates on this?

I tried running the latest model_multimodal.litertlm, but I’m still encountering the same error with LiteRT-LM 0.9.0-beta.
@trevon, could you please share an example of how you’re using this model?

12:14:27.798 E [tensor_buffer.cc:129] Failed to get num packed bytes
12:14:27.940 E ModelManager: Init failed: Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:2173]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:319]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:51]
com.google.ai.edge.litertlm.LiteRtLmJniException: Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:2173]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:319]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:51]
at com.google.ai.edge.litertlm.LiteRtLmJni.nativeCreateEngine(Native Method)
at com.google.ai.edge.litertlm.Engine.initialize(Engine.kt:72)
at com.sam.nutrimind.ai.ModelManager$initialize$2.invokeSuspend(ModelManager.kt:600)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:34)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:100)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:124)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:586)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:820)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:717)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:704)

samirsayyed changed discussion status to closed

Sorry, I was testing the model with LiteRT-LM and saw that one of the layers was incorrectly converted, resulting in gibberish when using the model with images. I have not forgotten to respond and push the updated example; I am working on a fix to the model that may resolve the issue.

Trevon, are you planning on adding the 9B model?

I can do that if there is interest.

I have INTEREST, will run it ASAP if I can get it to fit on my 16GB Pixel 9 Pro. I think I can fit it after disabling AICore.

Any updates on this?

I tried both the 0.8B and the 2B version in the latest Edge Gallery, and it crashes with:
04-05 15:20:40.944 9423 9444 E tflite : Following operations are not supported by GPU delegate:
04-05 15:20:40.944 9423 9444 E tflite : ADD: Max version supported: 2. Requested version 4.
04-05 15:20:40.944 9423 9444 E tflite : BROADCAST_TO: Operation is not supported.
04-05 15:20:40.944 9423 9444 E tflite : CAST: Tensor type(INT64) is not supported. litert_qwen35_patch.apply_patches..LiteRTExportableModuleForQwen35DecodeExternalEmbedder/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5ForConditionalGeneration_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5Model_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5TextModel_language_model;15
04-05 15:20:40.944 9423 9444 E tflite : CAST: Tensor type(INT64) is not supported. litert_qwen35_patch.apply_patches..LiteRTExportableModuleForQwen35DecodeExternalEmbedder/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5ForConditionalGeneration_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5Model_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5TextModel_language_model;;litert_qwen35_patch.apply_patches..LiteRTExportableModuleForQwen35DecodeExternalEmbedder/transformers.models.qwen3_5.modeling_qwen3_5
04-05 15:20:40.944 9423 9444 I tflite : Replacing 35 out of 2241 node(s) with delegate (LITERT_CL) node, yielding 3 partitions for subgraph 0.
04-05 15:20:41.034 9423 9444 E tflite : Hint fully delegated to single delegate is set, but the graph is not fully delegated.
04-05 15:11:05.998 9423 9446 D AGModelManagerViewModel: Model 'Qwen3.5-0.8B-LiteRT' failed to initialize

In GPU mode:

Failed to create engine ... llm_litert_compiled_model_executor:1955 ... litert_compiled_model.h:836

and in CPU mode:

Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:1973]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:321]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:52]

I tried another converted model:
https://huggingface.co/GabrieleConte/Qwen3.5-0.8B-LiteRT/blob/main/qwen35_mm_q8_ekv2048.litertlm
This one works in CPU mode, but the output looks a bit worse than the .gguf variant on desktop.
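
For anyone reproducing these runs outside the Gallery app, standard adb commands are enough to stage a model and watch the delegate errors quoted above. The file name below is just the one from this thread, and `/data/local/tmp` is only one convenient choice of readable device path:

```shell
# Copy the converted model onto the device.
adb push qwen35_mm_q8_ekv2048.litertlm /data/local/tmp/

# Show only error-level messages from the "tflite" tag, which is
# where the GPU delegate prints its unsupported-op list.
adb logcat -s tflite:E
```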
