Amsi-fin-o1.5 (fp16 MLX)


Full-precision (fp16) MLX conversion of AITRADER/Amsi-fin-o1.5, a 9B-parameter finance-focused Vision Language Model with reasoning and tool-calling capabilities.

Model Description

Amsi-fin-o1.5 is a fine-tuned Qwen3.5-VL-9B model specialized for financial analysis. It combines:

  • Vision understanding: analyze charts, financial documents, and screenshots
  • Chain-of-thought reasoning: structured thinking via <think> tags
  • Tool calling: native function calling for trading systems and APIs
  • Long context: 262K-token context window

Key Capabilities

| Capability | Details |
|---|---|
| Vision | Chart analysis, document OCR, screenshot understanding |
| Reasoning | <think> tag reasoning with extended chain-of-thought |
| Tool Calling | <tool_call> format for API integration |
| Finance | Options pricing, technical analysis, portfolio management |
| Context Length | 262,144 tokens |

Model Architecture

| Component | Specification |
|---|---|
| Architecture | Qwen3.5-VL (Vision-Language) |
| Parameters | ~9B |
| Text Layers | 32 (mixed linear + full attention) |
| Text Hidden Size | 4,096 |
| Text Intermediate Size | 12,288 |
| Attention Heads | 16 (4 KV heads) |
| Head Dim | 256 |
| Vision Depth | 27 layers |
| Vision Hidden Size | 1,152 |
| Patch Size | 16×16, temporal 2 |
| Vocab Size | 248,320 |
| Precision | float16 |

Available Variants

| Variant | Size | Bits/Weight | Link |
|---|---|---|---|
| fp16 (this repo) | ~18 GB | 16 | You are here |
| 8-bit (affine) | ~10 GB | 8.86 | AITRADER/Amsi-fin-o1.5-mxfp8-MLX |
| 4-bit (affine) | ~5.6 GB | 5.06 | AITRADER/Amsi-fin-o1.5-mxfp4-MLX |
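The sizes above follow from simple arithmetic: parameter count times bits per weight, divided by 8 to get bytes. A quick sanity check (the ~9B parameter count is approximate, so expect small drift against the table):

```python
def approx_size_gb(params: float, bits_per_weight: float) -> float:
    # bits -> bytes (/ 8), bytes -> GB (/ 1e9)
    return params * bits_per_weight / 8 / 1e9

for name, bits in [("fp16", 16), ("8-bit", 8.86), ("4-bit", 5.06)]:
    print(f"{name}: ~{approx_size_gb(9e9, bits):.1f} GB")
```

Note that quantized variants average slightly more bits per weight than their nominal bit width because some layers (embeddings, norms) are kept at higher precision.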

Installation

pip install mlx-vlm

Requires macOS 14+ and Apple Silicon (M1 or later).

Usage

Vision β€” CLI

python -m mlx_vlm.generate \
  --model AITRADER/Amsi-fin-o1.5-fp16-MLX \
  --image chart.png \
  --prompt "Analyze this chart. What trading signals do you see?" \
  --max-tokens 512

Vision β€” Python API

from mlx_vlm import load, generate
from mlx_vlm.prompt_utils import apply_chat_template
from mlx_vlm.utils import load_image

model, processor = load("AITRADER/Amsi-fin-o1.5-fp16-MLX")
image = load_image("chart.png")

messages = [
    {
        "role": "user",
        "content": [
            {"type": "image"},
            {"type": "text", "text": "What pattern is forming on this chart?"}
        ]
    }
]

prompt = apply_chat_template(processor, messages)
output = generate(model, processor, prompt, [image], max_tokens=512)
print(output)

Reasoning

The model supports extended reasoning via <think> tags:

messages = [
    {
        "role": "user",
        "content": "A stock is trading at $150. A call option with strike $155 "
                   "expires in 30 days. IV is 25%. Calculate the approximate "
                   "option price using Black-Scholes and explain your reasoning."
    }
]

prompt = apply_chat_template(processor, messages)
output = generate(model, processor, prompt, max_tokens=2048)
# Output will contain <think>...</think> reasoning followed by the answer

Tool Calling

tools = [
    {
        "type": "function",
        "function": {
            "name": "get_stock_price",
            "description": "Get current stock price",
            "parameters": {
                "type": "object",
                "properties": {
                    "symbol": {"type": "string", "description": "Ticker symbol"}
                },
                "required": ["symbol"]
            }
        }
    }
]

messages = [
    {"role": "user", "content": "What's the current price of AAPL?"}
]

prompt = apply_chat_template(processor, messages, tools=tools)
output = generate(model, processor, prompt, max_tokens=256)
# Model will generate: <tool_call><function=get_stock_price>{"symbol": "AAPL"}</function></tool_call>

Streaming Generation

from mlx_vlm import load, stream_generate
from mlx_vlm.prompt_utils import apply_chat_template

model, processor = load("AITRADER/Amsi-fin-o1.5-fp16-MLX")

messages = [{"role": "user", "content": "Explain covered call strategy."}]
prompt = apply_chat_template(processor, messages)

for token in stream_generate(model, processor, prompt, max_tokens=512):
    print(token, end="", flush=True)

Chat UI (Gradio)

python -m mlx_vlm.chat_ui --model AITRADER/Amsi-fin-o1.5-fp16-MLX

Hardware Requirements

| Apple Silicon | Unified Memory | fp16 | 8-bit | 4-bit |
|---|---|---|---|---|
| M1 / M2 | 16 GB | ❌ | ❌ | ✅ |
| M1/M2 Pro | 32 GB | ✅ | ✅ | ✅ |
| M1/M2 Max | 64 GB | ✅ | ✅ | ✅ |
| M1/M2 Ultra | 128+ GB | ✅ | ✅ | ✅ |
| M3 | 24 GB | ✅ | ✅ | ✅ |
| M3 Pro | 36 GB | ✅ | ✅ | ✅ |
| M3 Max | 48–128 GB | ✅ | ✅ | ✅ |
| M4 | 24–32 GB | ✅ | ✅ | ✅ |
| M4 Pro | 48 GB | ✅ | ✅ | ✅ |
| M4 Max | 64–128 GB | ✅ | ✅ | ✅ |

fp16 requires ~20 GB of memory at inference (weights plus KV cache and activations); 4-bit runs comfortably on 16 GB devices.
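MLX can only wire a fraction of unified memory for the GPU, so a feasibility check needs headroom. A rough sketch; the ~70% ceiling is an assumption for illustration, not a measured limit:

```python
def fits(model_gb: float, unified_gb: float, headroom: float = 0.70) -> bool:
    # Usable GPU memory is assumed to be ~70% of unified memory.
    return model_gb <= unified_gb * headroom

print(fits(20.0, 16))   # fp16 on a 16 GB machine
print(fits(5.6, 16))    # 4-bit on a 16 GB machine
```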

Conversion Details

Converted using mlx-vlm v0.3.12:

python -m mlx_vlm.convert \
  --hf-path AITRADER/Amsi-fin-o1.5 \
  --mlx-path ./Amsi-fin-o1.5-fp16-MLX \
  --dtype float16

Vision tower weights merged from the Qwen3.5-VL-9B base model to ensure full vision capability.

Training Details

  • Base Model: Qwen3.5-VL-9B
  • Fine-tuning Focus: Financial analysis, options trading, technical analysis
  • Capabilities Added: Domain-specific finance reasoning, trading tool integration

Limitations

  • Primarily trained on English financial data
  • Should not be used as sole basis for trading decisions
  • Vision analysis works best with standard chart formats
  • Long-context performance may vary with very large documents

Citation

@misc{amsi-fin-o1.5,
  title={Amsi-fin-o1.5: Finance Vision Language Model},
  author={AITRADER},
  year={2025},
  url={https://huggingface.co/AITRADER/Amsi-fin-o1.5}
}

Acknowledgments

  • Qwen Team for the Qwen3.5-VL base model
  • MLX Team at Apple for the MLX framework
  • mlx-vlm by Prince Canuma for VLM conversion tooling

License

Apache 2.0 β€” see LICENSE
