| --- |
| base_model: |
| - Qwen/Qwen3.5-0.8B |
| tags: |
| - embedl |
| - qwen3.5 |
| - multimodal |
| - vlm |
| - flashhead |
| - llmcompressor |
| - qwen3_5 |
| pipeline_tag: image-text-to-text |
| license: other |
| license_name: embedl-models-community-licence-1.0 |
| license_link: https://github.com/embedl/embedl-models/blob/main/LICENSE |
| extra_gated_prompt: >- |
| The information you provide will be collected, stored, processed and shared in accordance |
| with the [Embedl Privacy Policy](https://www.embedl.com/privacy-policy). |
| extra_gated_fields: |
| Company: text |
| --- |
| <!-- embedl-banner:start --> |
| <style> |
| .embedl-btn-primary { transition: background 160ms ease, box-shadow 160ms ease; } |
| .embedl-btn-primary:hover { background: #4FDCE4 !important; box-shadow: 0 8px 22px rgba(45,212,221,0.45) !important; } |
| .embedl-btn-secondary { transition: background 160ms ease; } |
| .embedl-btn-secondary:hover { background: rgba(45,212,221,0.15) !important; } |
| .embedl-headline { font-size: clamp(11px, 2.15vw, 15px) !important; } |
| .embedl-btn-primary, .embedl-btn-secondary { |
| font-size: clamp(11px, 1.65vw, 13px) !important; |
| padding: clamp(6px, 1.1vw, 9px) clamp(10px, 1.6vw, 14px) !important; |
| } |
| </style> |
| <div style="background:radial-gradient(600px 220px at 0% 50%,rgba(45,212,221,0.22) 0%,rgba(45,212,221,0) 60%),radial-gradient(400px 180px at 100% 100%,rgba(45,212,221,0.10) 0%,rgba(45,212,221,0) 55%),linear-gradient(135deg,#0B1626 0%,#142338 100%);border:1px solid rgba(45,212,221,0.28);border-radius:12px;padding:22px 24px;margin:0 0 24px 0;color:#F2F6FA;box-shadow:0 4px 16px rgba(11,22,38,0.18);overflow:hidden;box-sizing:border-box;max-width:100%;"> |
| <table style="width:100%;border-collapse:collapse;border:0;background:transparent;"> |
| <tr style="background:transparent;"> |
| <td style="vertical-align:middle;border:0;padding:0;background:transparent;"> |
| <div style="display:inline-block;font-size:10px;letter-spacing:0.08em;text-transform:uppercase;font-weight:700;color:#2DD4DD;background:rgba(45,212,221,0.15);border:1px solid rgba(45,212,221,0.35);padding:4px 10px;border-radius:999px;margin-bottom:10px;white-space:nowrap;">Optimized by Embedl</div> |
| <div class="embedl-headline" style="font-size:15px;font-weight:700;line-height:1.35;color:#F2F6FA;margin-bottom:4px;">Need to <span style="color:#2DD4DD;white-space:nowrap;">fine-tune</span>, hit <span style="color:#2DD4DD;white-space:nowrap;">performance targets</span>, or deploy on <span style="color:#2DD4DD;white-space:nowrap;">specific hardware</span>?</div> |
| <div style="font-size:13px;color:#9BA7B5;">We've got you covered.</div> |
| </td> |
| <td width="1%" style="vertical-align:middle;border:0;padding:0 0 0 18px;white-space:nowrap;text-align:right;background:transparent;"> |
| <a href="https://www.embedl.com/models" class="embedl-btn-secondary" style="display:inline-block;font-size:13px;font-weight:600;padding:9px 14px;border-radius:6px;border:1px solid #2DD4DD;color:#2DD4DD;text-decoration:none;margin-right:8px;">Learn more</a> |
| <a href="https://www.embedl.com/contact" class="embedl-btn-primary" style="display:inline-block;font-size:13px;font-weight:600;padding:9px 14px;border-radius:6px;border:1px solid #2DD4DD;background:#2DD4DD;color:#0B1626;text-decoration:none;box-shadow:0 6px 18px rgba(45,212,221,0.28);">Get in touch →</a> |
| </td> |
| </tr> |
| </table> |
| </div> |
| <!-- embedl-banner:end --> |
| |
| # Qwen3.5-0.8B-FlashHead |
|
|
| [](https://github.com/embedl/flash-head) |
|
|
| **Optimized version of [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) using FlashHead, Embedl's efficient replacement for the language model head.** |
|
|
| This model adds **FlashHead**, a lightweight replacement for the dense LM head that significantly improves throughput while preserving accuracy. Weights are kept in **FP16** precision. |
|
|
| The model preserves Text + Image / Video -> Text behavior and reasoning capabilities while improving inference throughput. |
|
|
| FlashHead is available as a vLLM plugin via `pip install flash-head`. |
|
|
| --- |
|
|
| ## Model Details |
|
|
| | **Field** | **Value** | |
| |--------------------|-----------| |
| | **Model** | [embedl/Qwen3.5-0.8B-FlashHead](https://huggingface.co/embedl/Qwen3.5-0.8B-FlashHead) | |
| | **Base Model** | [Qwen/Qwen3.5-0.8B](https://huggingface.co/Qwen/Qwen3.5-0.8B) | |
| | **Input / Output** | Text + Image / Video -> Text | |
| | **Version** | 1.0 | |
| | **Optimizations** | FlashHead LM Head | |
| | **Developers** | Embedl | |
| | **Licenses** | Upstream: [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md). <br>Optimized components: Embedl Models Community Licence v1.0 *(no redistribution)* | |
| | **Intended Use** | Text generation, reasoning, assistant-style interaction, video analytics, and general-purpose multimodal NLP on NVIDIA GPUs | |
|
|
| --- |
|
|
| ## Optimizations |
|
|
| - **FlashHead LM Head**: Lightweight replacement for the dense LM head, significantly improving throughput. |
|
|
| --- |
|
|
| ## Benchmarks |
|
|
| <a href="https://huggingface.co/spaces/embedl/Edge-Inference-Benchmarks" target="_blank" rel="noopener"> |
| <img |
| src="https://huggingface.co/datasets/embedl/documentation-images/resolve/main/Edge-Inference-Benchmarks/Qwen3.5__agx_orin.svg" |
| alt="Edge Inference Benchmarks for Qwen3.5" |
| width="100%" |
| /> |
| </a> |
|
|
| --- |
|
|
| ## Installation |
|
|
| ```bash |
| pip install flash-head |
| ``` |
|
|
| The [`flash-head`](https://github.com/embedl/flash-head) vLLM plugin is required. It activates automatically at startup. |
|
|
|
|
| ## License |
|
|
| This model is a derivative of **Qwen/Qwen3.5-0.8B**. |
|
|
| - **Upstream:** [Apache License 2.0](https://huggingface.co/datasets/choosealicense/licenses/blob/main/markdown/apache-2.0.md) |
| - **Optimized Components:** Embedl Models Community Licence v1.0 *(no redistribution)* |
|
|
| --- |
|
|
| ## Contact |
|
|
| - Enterprise and Commercial Inquiries: `models@embedl.com` |
| - Technical Issues and Early Access: [`https://github.com/embedl/flash-head`](https://github.com/embedl/flash-head) |
| - More Information and Model Releases: `https://embedl.com` |
|
|
| ### Partner & Developer Opportunities |
|
|
| If you are evaluating on-device inference, building products on this model, or exploring custom model optimization, reach out for: |
|
|
| - Engineering support for on-prem and edge deployments |
| - Early access and partner co-marketing opportunities |
|
|
| Contact: `models@embedl.com` |
|
|
| <!-- embedl-discord-banner:start --> |
| <style> |
| .embedl-discord-btn { transition: background 160ms ease, box-shadow 160ms ease; } |
| .embedl-discord-btn:hover { background: #6C77F5 !important; box-shadow: 0 8px 22px rgba(88,101,242,0.55) !important; } |
| </style> |
| <div style="background:radial-gradient(600px 220px at 0% 50%,rgba(88,101,242,0.22) 0%,rgba(88,101,242,0) 60%),radial-gradient(400px 180px at 100% 100%,rgba(88,101,242,0.10) 0%,rgba(88,101,242,0) 55%),linear-gradient(135deg,#0B1626 0%,#142338 100%);border:1px solid rgba(88,101,242,0.35);border-radius:12px;padding:22px 24px;margin:24px 0 0 0;color:#F2F6FA;box-shadow:0 4px 16px rgba(11,22,38,0.18);overflow:hidden;box-sizing:border-box;max-width:100%;"> |
| <table style="width:100%;border-collapse:collapse;border:0;background:transparent;"> |
| <tr style="background:transparent;"> |
| <td style="vertical-align:middle;border:0;padding:0;background:transparent;"> |
| <div style="display:inline-block;font-size:10px;letter-spacing:0.08em;text-transform:uppercase;font-weight:700;color:#A5B4FC;background:rgba(88,101,242,0.18);border:1px solid rgba(88,101,242,0.45);padding:4px 10px;border-radius:999px;margin-bottom:10px;white-space:nowrap;">Community & support</div> |
| <div style="font-size:15px;font-weight:700;line-height:1.35;color:#F2F6FA;margin-bottom:4px;">Need help with this model? Chat with the Embedl team and other engineers on <span style="color:#A5B4FC;white-space:nowrap;">Discord</span>.</div> |
| <div style="font-size:13px;color:#9BA7B5;">Quantization gotchas, hardware questions, fine-tuning tips — bring them all.</div> |
| </td> |
| <td width="1%" style="vertical-align:middle;border:0;padding:0 0 0 18px;white-space:nowrap;text-align:right;background:transparent;"> |
| <a href="https://discord.gg/MTbMWdKqE" class="embedl-discord-btn" style="display:inline-block;font-size:13px;font-weight:600;padding:9px 14px;border-radius:6px;border:1px solid #5865F2;background:#5865F2;color:#FFFFFF;text-decoration:none;box-shadow:0 6px 18px rgba(88,101,242,0.35);">Join our Discord →</a> |
| </td> |
| </tr> |
| </table> |
| </div> |
| <!-- embedl-discord-banner:end --> |
| |