# Amsi-fin-o1.5 – 4-bit MLX
A 4-bit quantized (affine, 5.06 bits/weight) MLX conversion of AITRADER/Amsi-fin-o1.5, a 9B-parameter finance-focused vision-language model with reasoning and tool-calling capabilities.
## Model Description
Amsi-fin-o1.5 is a fine-tuned Qwen3.5-VL-9B model specialized for financial analysis. It combines:
- Vision understanding – analyzes charts, financial documents, and screenshots
- Chain-of-thought reasoning – structured thinking via `<think>` tags
- Tool calling – native function calling for trading systems and APIs
- Long context – 262,144-token context window
This is the smallest variant, ideal for devices with 16 GB unified memory.
## Key Capabilities
| Capability | Details |
|---|---|
| Vision | Chart analysis, document OCR, screenshot understanding |
| Reasoning | `<think>` tag reasoning with extended chain-of-thought |
| Tool Calling | `<tool_call>` format for API integration |
| Finance | Options pricing, technical analysis, portfolio management |
| Context Length | 262,144 tokens |
## Model Architecture
| Component | Specification |
|---|---|
| Architecture | Qwen3.5-VL (Vision-Language) |
| Parameters | ~9B |
| Text Layers | 32 (mixed linear + full attention) |
| Text Hidden Size | 4,096 |
| Text Intermediate | 12,288 |
| Attention Heads | 16 (4 KV heads) |
| Head Dim | 256 |
| Vision Depth | 27 layers |
| Vision Hidden Size | 1,152 |
| Patch Size | 16Γ16, temporal 2 |
| Vocab Size | 248,320 |
| Quantization | 4-bit affine (group_size=64) |
| Bits/Weight | 5.06 |
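The long-context figure in the table has a concrete memory cost. Below is a back-of-envelope fp16 KV-cache estimate at the full 262,144-token context using the numbers above (32 layers, 4 KV heads, head dim 256). It assumes every layer keeps a conventional KV cache; since the model mixes linear and full attention, the real cost is lower, so treat this as an upper bound.

```python
def kv_cache_bytes(tokens, layers=32, kv_heads=4, head_dim=256, dtype_bytes=2):
    # K and V each store (kv_heads * head_dim) values per layer per token.
    return 2 * layers * kv_heads * head_dim * dtype_bytes * tokens

gib = kv_cache_bytes(262_144) / 2**30
print(f"fp16 KV cache at 262K tokens: {gib:.0f} GiB upper bound")
```

This is why long-context sessions need far more headroom than the weight sizes alone suggest.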
## Available Variants
| Variant | Size | Bits/Weight | Link |
|---|---|---|---|
| fp16 | ~18 GB | 16 | AITRADER/Amsi-fin-o1.5-fp16-MLX |
| 8-bit (affine) | ~10 GB | 8.86 | AITRADER/Amsi-fin-o1.5-mxfp8-MLX |
| 4-bit (this) | ~5.6 GB | 5.06 | You are here |
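The sizes in the table follow directly from parameter count times measured bits per weight. A quick reproduction (9B parameters is approximate, and activation plus KV-cache memory come on top of the weights):

```python
def weight_gb(params, bits_per_weight):
    # Weight bytes converted to decimal gigabytes.
    return params * bits_per_weight / 8 / 1e9

for name, bpw in [("fp16", 16), ("8-bit", 8.86), ("4-bit", 5.06)]:
    print(f"{name}: ~{weight_gb(9e9, bpw):.1f} GB")
```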
## Installation

```bash
pip install mlx-vlm
```
Requires macOS 14+ and Apple Silicon (M1 or later).
## Usage
### Vision – CLI

```bash
python -m mlx_vlm.generate \
  --model AITRADER/Amsi-fin-o1.5-mxfp4-MLX \
  --image chart.png \
  --prompt "Analyze this chart. What trading signals do you see?" \
  --max-tokens 512
```
### Vision – Python API

```python
from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_image

model, processor = load("AITRADER/Amsi-fin-o1.5-mxfp4-MLX")
image = load_image("chart.png")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What pattern is forming on this chart?"}
        ]
    }
]

prompt = apply_chat_template(processor, messages)
output = generate(model, processor, prompt, [image], max_tokens=512)
print(output)
```
### Reasoning

The model supports extended reasoning via `<think>` tags:

```python
messages = [
    {
        "role": "user",
        "content": "A stock is trading at $150. A call option with strike $155 "
                   "expires in 30 days. IV is 25%. Calculate the approximate "
                   "option price using Black-Scholes and explain your reasoning."
    }
]

prompt = apply_chat_template(processor, messages)
output = generate(model, processor, prompt, max_tokens=2048)
# Output will contain <think>...</think> reasoning followed by the answer
```
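For reference, the prompt above can be sanity-checked with a plain Black-Scholes computation. The prompt does not specify a risk-free rate or dividends, so `r = 0.04` and no dividends are assumptions here; the model's reasoning should land in the same ballpark under similar assumptions.

```python
from math import log, sqrt, exp, erf

def norm_cdf(x):
    # Standard normal CDF via the error function.
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def bs_call(S, K, T, sigma, r):
    # Black-Scholes price of a European call, no dividends.
    d1 = (log(S / K) + (r + sigma**2 / 2) * T) / (sigma * sqrt(T))
    d2 = d1 - sigma * sqrt(T)
    return S * norm_cdf(d1) - K * exp(-r * T) * norm_cdf(d2)

price = bs_call(S=150, K=155, T=30 / 365, sigma=0.25, r=0.04)
print(f"Approximate call price: ${price:.2f}")
```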
### Tool Calling

```python
tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get current stock price",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "description": "Ticker symbol"}
                },
                "required": ["symbol"]
            }
        }
    }
]

messages = [
    {"role": "user", "content": "What's the current price of AAPL?"}
]

prompt = apply_chat_template(processor, messages, tools=tools)
output = generate(model, processor, prompt, max_tokens=256)
# Model will generate: <tool_call><function=get_stock_price>{"symbol": "AAPL"}</function></tool_call>
```
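To act on that output, the application has to extract the function name and arguments. Here is a minimal sketch of a parser for the `<tool_call><function=...>{...}</function></tool_call>` format shown above; real outputs may wrap the call in reasoning text or whitespace, so it searches rather than matches, and robust code should also handle malformed JSON.

```python
import json
import re

TOOL_CALL_RE = re.compile(
    r"<tool_call>\s*<function=(?P<name>[\w.-]+)>(?P<args>.*?)</function>\s*</tool_call>",
    re.DOTALL,
)

def parse_tool_calls(text):
    """Return a list of (function_name, arguments_dict) pairs found in text."""
    return [(m["name"], json.loads(m["args"])) for m in TOOL_CALL_RE.finditer(text)]

output = '<tool_call><function=get_stock_price>{"symbol": "AAPL"}</function></tool_call>'
print(parse_tool_calls(output))  # [('get_stock_price', {'symbol': 'AAPL'})]
```

The parsed name and arguments can then be dispatched to the matching function from the `tools` list and the result fed back as a tool message.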
### Streaming Generation

```python
from mlx_vlm import load, stream_generate
from mlx_vlm.prompt_utils import apply_chat_template

model, processor = load("AITRADER/Amsi-fin-o1.5-mxfp4-MLX")
messages = [{"role": "user", "content": "Explain covered call strategy."}]
prompt = apply_chat_template(processor, messages)

for token in stream_generate(model, processor, prompt, max_tokens=512):
    print(token, end="", flush=True)
```
### Chat UI (Gradio)

```bash
python -m mlx_vlm.chat_ui --model AITRADER/Amsi-fin-o1.5-mxfp4-MLX
```
## Hardware Requirements
| Apple Silicon | Unified Memory | fp16 | 8-bit | 4-bit |
|---|---|---|---|---|
| M1 / M2 | 16 GB | ❌ | ⚠️ | ✅ |
| M1/M2 Pro | 32 GB | ✅ | ✅ | ✅ |
| M1/M2 Max | 64 GB | ✅ | ✅ | ✅ |
| M1/M2 Ultra | 128+ GB | ✅ | ✅ | ✅ |
| M3 | 24 GB | ⚠️ | ✅ | ✅ |
| M3 Pro | 36 GB | ✅ | ✅ | ✅ |
| M3 Max | 48–128 GB | ✅ | ✅ | ✅ |
| M4 | 24–32 GB | ⚠️ | ✅ | ✅ |
| M4 Pro | 48 GB | ✅ | ✅ | ✅ |
| M4 Max | 64–128 GB | ✅ | ✅ | ✅ |
4-bit runs comfortably on 16 GB devices, making it the most accessible variant.
## Conversion Details

Converted using mlx-vlm v0.3.12:

```bash
python -m mlx_vlm.convert \
  --hf-path AITRADER/Amsi-fin-o1.5 \
  --mlx-path ./Amsi-fin-o1.5-mxfp4-MLX \
  --quantize --q-bits 4
```
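For intuition on what 4-bit affine quantization with `group_size=64` means: each group of 64 weights shares one scale and one offset, and every weight in the group is stored as a 4-bit integer in [0, 15]. The sketch below illustrates the scheme conceptually on one group; MLX's actual bit-packing and kernels differ.

```python
import random

def quantize_group(weights, bits=4):
    # Affine quantization: map [min, max] of the group onto 2**bits levels.
    lo, hi = min(weights), max(weights)
    scale = (hi - lo) / (2**bits - 1) or 1.0  # guard against a constant group
    codes = [round((w - lo) / scale) for w in weights]
    return codes, scale, lo

def dequantize_group(codes, scale, lo):
    return [c * scale + lo for c in codes]

random.seed(0)
group = [random.gauss(0, 0.02) for _ in range(64)]  # one group of 64 weights
codes, scale, lo = quantize_group(group)
restored = dequantize_group(codes, scale, lo)
max_err = max(abs(a - b) for a, b in zip(group, restored))
print(f"max reconstruction error: {max_err:.5f} (half a step is {scale / 2:.5f})")
```

Per-group scales and offsets are why the effective rate is 5.06 bits/weight rather than a flat 4: the metadata is amortized over each group of 64.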
Vision tower weights were merged from the Qwen3.5-VL-9B base model to ensure full vision capability.
## Training Details
- Base Model: Qwen3.5-VL-9B
- Fine-tuning Focus: Financial analysis, options trading, technical analysis
- Capabilities Added: Domain-specific finance reasoning, trading tool integration
## Limitations
- Primarily trained on English financial data
- Should not be used as sole basis for trading decisions
- Vision analysis works best with standard chart formats
- Long-context performance may vary with very large documents
- 4-bit quantization trades some precision for efficiency; for maximum accuracy use the fp16 variant
## Citation

```bibtex
@misc{amsi-fin-o1.5,
  title={Amsi-fin-o1.5: Finance Vision Language Model},
  author={AITRADER},
  year={2025},
  url={https://huggingface.co/AITRADER/Amsi-fin-o1.5}
}
```
## Acknowledgments
- Qwen Team for the Qwen3.5-VL base model
- MLX Team at Apple for the MLX framework
- mlx-vlm by Prince Canuma for VLM conversion tooling
## License

Apache 2.0 – see LICENSE