gemma4-E2B-it-litert-128k-mtp

Experimental .litertlm bundle for Gemma 4 E2B, derived from the LiteRT community artifact and patched for:

  • max_num_tokens: 131072 (128 × 1024, the 128k context in the model name)
  • MTP / speculative decoding retained
  • native LiteRT-LM runtime compatibility

Status

Verified on host 100.96.1.7:

  • short-prompt inference works on CPU
  • speculative decoding works
  • LiteRT-LM logs show target_number=131072
  • long-context prefill starts correctly for prompts longer than 32k tokens, without fallback

Artifact

  • file: model.litertlm
  • sha256: 274e5c461e754cbd05423bab734e7765d9757443fb5591edb7aba6f9f186550a
  • size_bytes: 2584805376
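
The checksum and size above can be checked after download. The following is a minimal sketch rather than part of the published artifact: the helper names sha256_of_file and verify_artifact are hypothetical, and it assumes the file was saved locally as model.litertlm. It uses only the Python standard library.

```python
import hashlib
import os

# Values copied from the Artifact section of this card.
EXPECTED_SHA256 = "274e5c461e754cbd05423bab734e7765d9757443fb5591edb7aba6f9f186550a"
EXPECTED_SIZE_BYTES = 2584805376


def sha256_of_file(path, chunk_size=1 << 20):
    """Return the hex SHA-256 of a file, read in chunks to keep memory use low."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_artifact(path,
                    expected_sha256=EXPECTED_SHA256,
                    expected_size=EXPECTED_SIZE_BYTES):
    """Hypothetical helper: True only if both size and SHA-256 match."""
    # Check the size first; it is cheap and catches truncated downloads early.
    if os.path.getsize(path) != expected_size:
        return False
    return sha256_of_file(path) == expected_sha256
```

Calling verify_artifact("model.litertlm") returns True only when both values match the ones listed here. A False result most likely indicates an incomplete or modified download.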

Notes

Experimental CTOX integration artifact.
