Could you please suggest which Android library should be used to run this model?

#3
by samirsayyed - opened

I attempted to use the model with Google's LiteRT-LM (.litertlm) and LiteRT (.tflite) libraries, but it appears that the model is not supported.
Could you please guide me on how to properly set up this model in an Android application?

In LiteRT-LM v0.9.0-beta, loading the Qwen3.5 model (.litertlm) results in an error (the same error occurs with both the Kotlin API and the Gallery app):

  Failed to initialize LiteRTLM Engine for model=qwen35_2b_model_multimodal.litertlm with runtime=0.9.0-beta
com.google.ai.edge.litertlm.LiteRtLmJniException: Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:2173]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:319]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:51]
at com.google.ai.edge.litertlm.LiteRtLmJni.nativeCreateEngine(Native Method)
at com.google.ai.edge.litertlm.Engine.initialize(Engine.kt:72)
at com.example.litertlmdemo.MainActivity$initializeEngine$1.invokeSuspend(MainActivity.kt:262)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:34)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:101)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:113)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:589)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:823)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:720)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:707)
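
For reference, the initialization path that produces this trace looks roughly like the following. This is a minimal sketch, not confirmed working code: only the `com.google.ai.edge.litertlm` package, the `Engine` class, its `initialize()` method, and `LiteRtLmJniException` are confirmed by the stack trace above; the `EngineConfig` fields and the `Backend` selector are my assumptions about the LiteRT-LM Kotlin API and may not match your release.

```kotlin
import com.google.ai.edge.litertlm.Engine
// EngineConfig and Backend are assumed names; check them against
// the LiteRT-LM Kotlin API docs for your version.
import com.google.ai.edge.litertlm.EngineConfig
import com.google.ai.edge.litertlm.Backend

// Builds and initializes the engine for a .litertlm file that has
// already been copied onto the device (e.g. under filesDir).
suspend fun initEngine(modelPath: String): Engine {
    val config = EngineConfig(
        modelPath = modelPath,
        backend = Backend.CPU, // assumed backend selector
    )
    val engine = Engine(config)
    // Engine.initialize() is the call that throws
    // LiteRtLmJniException in the trace above.
    engine.initialize()
    return engine
}
```

Note the failure here happens inside `nativeCreateEngine`, i.e. before any session or prompt handling is reached.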

I pushed a new version yesterday, possibly shortly before this comment. Please pull the new model file.

When I get back to my machine, I can drop an example of how I am using the model in Android.

Any updates on this?

I tried running the latest model_multimodal.litertlm, but I’m still encountering the same error with LiteRT-LM 0.9.0-beta.
@trevon, could you please share an example of how you’re using this model?

12:14:27.798 E [tensor_buffer.cc:129] Failed to get num packed bytes
12:14:27.940 E ModelManager: Init failed: Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:2173]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:319]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:51]
com.google.ai.edge.litertlm.LiteRtLmJniException: Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:2173]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:319]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:51]
at com.google.ai.edge.litertlm.LiteRtLmJni.nativeCreateEngine(Native Method)
at com.google.ai.edge.litertlm.Engine.initialize(Engine.kt:72)
at com.sam.nutrimind.ai.ModelManager$initialize$2.invokeSuspend(ModelManager.kt:600)
at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:34)
at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:100)
at kotlinx.coroutines.internal.LimitedDispatcher$Worker.run(LimitedDispatcher.kt:124)
at kotlinx.coroutines.scheduling.TaskImpl.run(Tasks.kt:89)
at kotlinx.coroutines.scheduling.CoroutineScheduler.runSafely(CoroutineScheduler.kt:586)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.executeTask(CoroutineScheduler.kt:820)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.runWorker(CoroutineScheduler.kt:717)
at kotlinx.coroutines.scheduling.CoroutineScheduler$Worker.run(CoroutineScheduler.kt:704)

samirsayyed changed discussion status to closed

Sorry, I was testing the model with LiteRT-LM and saw that one of the layers was incorrectly converted, resulting in gibberish when using the model with images. I have not forgotten to respond and push the updated example; I am working on a fix to the model that may resolve the issue.

Trevon, are you planning on adding the 9B model?

I can do that if there is interest.

I have INTEREST, will run it ASAP if I can get it to fit on my 16GB Pixel 9 Pro. I think I can fit it after disabling AICore.

Any updates on this?

I tried both the 0.8B and the 2B version in the latest Edge Gallery, and it crashes with:
04-05 15:20:40.944 9423 9444 E tflite : Following operations are not supported by GPU delegate:
04-05 15:20:40.944 9423 9444 E tflite : ADD: Max version supported: 2. Requested version 4.
04-05 15:20:40.944 9423 9444 E tflite : BROADCAST_TO: Operation is not supported.
04-05 15:20:40.944 9423 9444 E tflite : CAST: Tensor type(INT64) is not supported. litert_qwen35_patch.apply_patches..LiteRTExportableModuleForQwen35DecodeExternalEmbedder/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5ForConditionalGeneration_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5Model_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5TextModel_language_model;15
04-05 15:20:40.944 9423 9444 E tflite : CAST: Tensor type(INT64) is not supported. litert_qwen35_patch.apply_patches..LiteRTExportableModuleForQwen35DecodeExternalEmbedder/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5ForConditionalGeneration_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5Model_model/transformers.models.qwen3_5.modeling_qwen3_5.Qwen3_5TextModel_language_model;;litert_qwen35_patch.apply_patches..LiteRTExportableModuleForQwen35DecodeExternalEmbedder/transformers.models.qwen3_5.modeling_qwen3_5
04-05 15:20:40.944 9423 9444 I tflite : Replacing 35 out of 2241 node(s) with delegate (LITERT_CL) node, yielding 3 partitions for subgraph 0.
04-05 15:20:41.034 9423 9444 E tflite : Hint fully delegated to single delegate is set, but the graph is not fully delegated.
04-05 15:11:05.998 9423 9446 D AGModelManagerViewModel: Model 'Qwen3.5-0.8B-LiteRT' failed to initialize

In GPU mode:

Failed to create engine ... llm_litert_compiled_model_executor:1955 ... litert_compiled_model.h:836

and in CPU mode:

Failed to create engine: INTERNAL: ERROR: [third_party/odml/litert_lm/runtime/executor/llm_litert_compiled_model_executor.cc:1973]
└ ERROR: [third_party/odml/litert/litert/cc/litert_compiled_model.cc:321]
└ ERROR: [third_party/odml/litert/litert/cc/litert_tensor_buffer.cc:52]

I tried another converted model:
https://huggingface.co/GabrieleConte/Qwen3.5-0.8B-LiteRT/blob/main/qwen35_mm_q8_ekv2048.litertlm
This one works in CPU mode, but the output looks a bit worse than the .gguf variant on desktop.
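
For anyone reproducing these runs outside the Gallery app, standard adb commands are enough to stage a model and watch the delegate errors quoted above. The file name below is just the one from this thread, and `/data/local/tmp` is only one convenient choice of readable device path:

```shell
# Copy the converted model onto the device.
adb push qwen35_mm_q8_ekv2048.litertlm /data/local/tmp/

# Show only error-level messages from the "tflite" tag, which is
# where the GPU delegate prints its unsupported-op list.
adb logcat -s tflite:E
```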
