# Amsi-fin-o1.5 – 4-bit MLX
A 4-bit quantized (affine, 5.06 bits/weight) MLX conversion of AITRADER/Amsi-fin-o1.5, a 9B-parameter finance-focused vision-language model with reasoning and tool-calling capabilities.
## Model Description
Amsi-fin-o1.5 is a fine-tuned Qwen3.5-VL-9B model specialized for financial analysis. It combines:
- Vision understanding – analyzes charts, financial documents, and screenshots
- Chain-of-thought reasoning – structured thinking via `<think>` tags
- Tool calling – native function calling for trading systems and APIs
- Long context – 262,144-token context window
This is the smallest variant, ideal for devices with 16 GB unified memory.
## Key Capabilities
| Capability | Details |
|---|---|
| Vision | Chart analysis, document OCR, screenshot understanding |
| Reasoning | `<think>` tag reasoning with extended chain-of-thought |
| Tool Calling | `<tool_call>` format for API integration |
| Finance | Options pricing, technical analysis, portfolio management |
| Context Length | 262,144 tokens |
## Model Architecture
| Component | Specification |
|---|---|
| Architecture | Qwen3.5-VL (Vision-Language) |
| Parameters | ~9B |
| Text Layers | 32 (mixed linear + full attention) |
| Text Hidden Size | 4,096 |
| Text Intermediate | 12,288 |
| Attention Heads | 16 (4 KV heads) |
| Head Dim | 256 |
| Vision Depth | 27 layers |
| Vision Hidden Size | 1,152 |
| Patch Size | 16Γ16, temporal 2 |
| Vocab Size | 248,320 |
| Quantization | 4-bit affine (group_size=64) |
| Bits/Weight | 5.06 |
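The long-context figure in the table has a concrete memory cost. Below is a back-of-envelope fp16 KV-cache estimate at the full 262,144-token context using the numbers above (32 layers, 4 KV heads, head dim 256). It assumes every layer keeps a conventional KV cache; since the model mixes linear and full attention, the real cost is lower, so treat this as an upper bound.

```python
def kv_cache_bytes(tokens, layers=32, kv_heads=4, head_dim=256, dtype_bytes=2):
    # K and V each store (kv_heads * head_dim) values per layer per token.
    return 2 * layers * kv_heads * head_dim * dtype_bytes * tokens

gib = kv_cache_bytes(262_144) / 2**30
print(f"fp16 KV cache at 262K tokens: {gib:.0f} GiB upper bound")
```

This is why long-context sessions need far more headroom than the weight sizes alone suggest.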
## Available Variants
| Variant | Size | Bits/Weight | Link |
|---|---|---|---|
| fp16 | ~18 GB | 16 | AITRADER/Amsi-fin-o1.5-fp16-MLX |
| 8-bit (affine) | ~10 GB | 8.86 | AITRADER/Amsi-fin-o1.5-mxfp8-MLX |
| 4-bit (this) | ~5.6 GB | 5.06 | You are here |
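The sizes in the table follow directly from parameter count times measured bits per weight. A quick reproduction (9B parameters is approximate, and activation plus KV-cache memory come on top of the weights):

```python
def weight_gb(params, bits_per_weight):
    # Weight bytes converted to decimal gigabytes.
    return params * bits_per_weight / 8 / 1e9

for name, bpw in [("fp16", 16), ("8-bit", 8.86), ("4-bit", 5.06)]:
    print(f"{name}: ~{weight_gb(9e9, bpw):.1f} GB")
```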
## Installation

```bash
pip install mlx-vlm
```
Requires macOS 14+ and Apple Silicon (M1 or later).
## Usage
### Vision – CLI

```bash
python -m mlx_vlm.generate \
  --model AITRADER/Amsi-fin-o1.5-mxfp4-MLX \
  --image chart.png \
  --prompt "Analyze this chart. What trading signals do you see?" \
  --max-tokens 512
```
### Vision – Python API

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_image

model, processor = load("AITRADER/Amsi-fin-o1.5-mxfp4-MLX")
image = load_image("chart.png")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What pattern is forming on this chart?"}
        ]
    }
]

prompt = apply_chat_template(processor, messages)
output = generate(model, processor, prompt, [image], max_tokens=512)
print(output)
```
### Reasoning

The model supports extended reasoning via `<think>` tags:

```python
messages = [
    {
        "role": "user",
        "content": "A stock is trading at $150. A call option with strike $155 "
                   "expires in 30 days. IV is 25%. Calculate the approximate "
                   "option price using Black-Scholes and explain your reasoning."
    }
]

prompt = apply_chat_template(processor, messages)
output = generate(model, processor, prompt, max_tokens=2048)
# Output will contain <think>...</think> reasoning followed by the answer
```
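For reference, the prompt above can be sanity-checked with a plain Black-Scholes computation. The prompt does not specify a risk-free rate or dividends, so `r = 0.04` and no dividends are assumptions here; the model's reasoning should land in the same ballpark under similar assumptions.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, sigma, r):
    # Black-Scholes price of a European call, no dividends.
    d1 = (log(S / K) + (r + sigma**2 / 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

price = bs_call(S=150, K=155, T=30 / 365, sigma=0.25, r=0.04)
print(f"Approximate call price: ${price:.2f}")
```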
### Tool Calling

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get current stock price",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "description": "Ticker symbol"}
                },
                "required": ["symbol"]
            }
        }
    }
]

messages = [
    {"role": "user", "content": "What's the current price of AAPL?"}
]

prompt = apply_chat_template(processor, messages, tools=tools)
output = generate(model, processor, prompt, max_tokens=256)
# Model will generate: <tool_call><function=get_stock_price>{"symbol": "AAPL"}</function></tool_call>
```
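To act on that output, the application has to extract the function name and arguments. Here is a minimal sketch of a parser for the `<tool_call><function=...>{...}</function></tool_call>` format shown above; real outputs may wrap the call in reasoning text or whitespace, so it searches rather than matches, and robust code should also handle malformed JSON.

```python
import json
import re

TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=(?P<name>[\w.-]+)>(?P<args>.*?)</function>\s*</tool_call>",
    re.DOTALL,
)

def parse_tool_calls(text):
    """Return a list of (function_name, arguments_dict) pairs found in text."""
    return [(m["name"], json.loads(m["args"])) for m in TOOL_CALL_RE.finditer(text)]

output = '<tool_call><function=get_stock_price>{"symbol": "AAPL"}</function></tool_call>'
print(parse_tool_calls(output))  # [('get_stock_price', {'symbol': 'AAPL'})]
```

The parsed name and arguments can then be dispatched to the matching function from the `tools` list and the result fed back as a tool message.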
### Streaming Generation

```python
from mlx_vlm import load, stream_generate
from mlx_vlm.prompt_utils import apply_chat_template

model, processor = load("AITRADER/Amsi-fin-o1.5-mxfp4-MLX")
messages = [{"role": "user", "content": "Explain covered call strategy."}]
prompt = apply_chat_template(processor, messages)

for token in stream_generate(model, processor, prompt, max_tokens=512):
    print(token, end="", flush=True)
```
### Chat UI (Gradio)

```bash
python -m mlx_vlm.chat_ui --model AITRADER/Amsi-fin-o1.5-mxfp4-MLX
```
## Hardware Requirements
| Apple Silicon | Unified Memory | fp16 | 8-bit | 4-bit |
|---|---|---|---|---|
| M1 / M2 | 16 GB | ❌ | ⚠️ | ✅ |
| M1/M2 Pro | 32 GB | ✅ | ✅ | ✅ |
| M1/M2 Max | 64 GB | ✅ | ✅ | ✅ |
| M1/M2 Ultra | 128+ GB | ✅ | ✅ | ✅ |
| M3 | 24 GB | ⚠️ | ✅ | ✅ |
| M3 Pro | 36 GB | ✅ | ✅ | ✅ |
| M3 Max | 48–128 GB | ✅ | ✅ | ✅ |
| M4 | 24–32 GB | ⚠️ | ✅ | ✅ |
| M4 Pro | 48 GB | ✅ | ✅ | ✅ |
| M4 Max | 64–128 GB | ✅ | ✅ | ✅ |
4-bit runs comfortably on 16 GB devices, making it the most accessible variant.
## Conversion Details

Converted using mlx-vlm v0.3.12:

```bash
python -m mlx_vlm.convert \
  --hf-path AITRADER/Amsi-fin-o1.5 \
  --mlx-path ./Amsi-fin-o1.5-mxfp4-MLX \
  --quantize --q-bits 4
```
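For intuition on what 4-bit affine quantization with `group_size=64` means: each group of 64 weights shares one scale and one offset, and every weight in the group is stored as a 4-bit integer in [0, 15]. The sketch below illustrates the scheme conceptually on one group; MLX's actual bit-packing and kernels differ.

```python
import random

def quantize_group(weights, bits=4):
    # Affine quantization: map [min, max] of the group onto 2**bits levels.
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (2**bits - 1) or 1.0  # guard against a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_group(codes, scale, lo):
    return [c * scale + lo for c in codes]

random.seed(0)
group = [random.gauss(0, 0.02) for _ in range(64)]  # one group of 64 weights
codes, scale, lo = quantize_group(group)
restored = dequantize_group(codes, scale, lo)
max_err = max(abs(a - b) for a, b in zip(group, restored))
print(f"max reconstruction error: {max_err:.5f} (half a step is {scale / 2:.5f})")
```

Per-group scales and offsets are why the effective rate is 5.06 bits/weight rather than a flat 4: the metadata is amortized over each group of 64.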
Vision tower weights were merged from the Qwen3.5-VL-9B base model to ensure full vision capability.
## Training Details
- Base Model: Qwen3.5-VL-9B
- Fine-tuning Focus: Financial analysis, options trading, technical analysis
- Capabilities Added: Domain-specific finance reasoning, trading tool integration
## Limitations
- Primarily trained on English financial data
- Should not be used as sole basis for trading decisions
- Vision analysis works best with standard chart formats
- Long-context performance may vary with very large documents
- 4-bit quantization trades some precision for efficiency; for maximum accuracy use the fp16 variant
## Citation

```bibtex
@misc{amsi-fin-o1.5,
  title={Amsi-fin-o1.5: Finance Vision Language Model},
  author={AITRADER},
  year={2025},
  url={https://huggingface.co/AITRADER/Amsi-fin-o1.5}
}
```
## Acknowledgments
- Qwen Team for the Qwen3.5-VL base model
- MLX Team at Apple for the MLX framework
- mlx-vlm by Prince Canuma for VLM conversion tooling
## License

Apache 2.0 – see LICENSE