Paper: VersaViT: Enhancing MLLM Vision Backbones via Task-Guided Optimization (2602.09934)
POINTS-Seeker-8B is a state-of-the-art multimodal agentic search model, built from scratch to overcome the epistemic limits of static parametric knowledge in large multimodal models (LMMs). Rather than bolting search tools onto an existing LMM, POINTS-Seeker is natively trained with Agentic Seeding, a dedicated phase that instills the foundational precursors of agentic behavior, and is equipped with V-Fold, an adaptive, history-aware compression scheme that resolves the performance bottleneck of long-horizon interactions. As a result, POINTS-Seeker-8B achieves superior performance on long-horizon, knowledge-intensive visual reasoning tasks.
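To give an intuition for what history-aware compression buys in long-horizon agent loops, here is a minimal, purely illustrative sketch: keep the most recent turns verbatim and fold older turns into truncated stubs so the context stays bounded. This is a toy stand-in, not the actual V-Fold algorithm; the function name, parameters, and message format are all assumptions made for illustration.

```python
# Toy sketch of history-aware compression (NOT the real V-Fold):
# keep the last `keep_last` turns verbatim and fold earlier turns
# into short truncated stubs so long interaction histories stay bounded.

def fold_history(messages, keep_last=4, max_summary_chars=80):
    """Compress all but the last `keep_last` messages into truncated stubs."""
    if len(messages) <= keep_last:
        return list(messages)
    folded = []
    for msg in messages[:-keep_last]:
        text = msg['content']
        stub = text[:max_summary_chars] + ('...' if len(text) > max_summary_chars else '')
        folded.append({'role': msg['role'], 'content': f'[folded] {stub}'})
    return folded + messages[-keep_last:]

# Ten long "search" turns; after folding, the oldest six become short stubs
history = [{'role': 'user', 'content': f'search step {i}: ' + 'x' * 200}
           for i in range(10)]
compressed = fold_history(history)
print(len(compressed))  # 10 turns remain, but only the last 4 are verbatim
```

The real scheme is adaptive rather than a fixed-length truncation, but the shape of the trade is the same: the token cost of old turns shrinks while recent, decision-relevant context is preserved.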
Please first install WePOINTS using the following commands:

```shell
git clone https://github.com/WePOINTS/WePOINTS.git
cd ./WePOINTS
pip install -e .
```
```python
from transformers import AutoModelForCausalLM, AutoTokenizer, Qwen2VLImageProcessor
import torch

user_prompt = "explain the image"  # replace with your instruction
image_path = 'your image path'     # replace with the path to your image
model_path = 'tencent/POINTS-Seeker'

# Load the model, tokenizer, and image processor
model = AutoModelForCausalLM.from_pretrained(
    model_path,
    trust_remote_code=True,
    dtype=torch.bfloat16,
    device_map='cuda'
)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
image_processor = Qwen2VLImageProcessor.from_pretrained(model_path)

# Build a single-turn message with one image and one text instruction
content = [
    dict(type='image', image=image_path),
    dict(type='text', text=user_prompt)
]
messages = [
    {
        'role': 'user',
        'content': content
    }
]

# Greedy decoding, up to 2048 new tokens
generation_config = {
    'max_new_tokens': 2048,
    'do_sample': False
}
response = model.chat(
    messages,
    tokenizer,
    image_processor,
    generation_config
)
print(response)
```
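For multi-turn use, a natural extension of the quickstart is to append the model's reply and a follow-up question to `messages` before calling `model.chat(...)` again. The snippet below only shows the message-list structure; the assistant-turn content format and the follow-up text are assumptions, so check the repo for the exact convention.

```python
# Sketch of a multi-turn message list, mirroring the quickstart format.
# The assistant-turn content structure here is an assumption for illustration.

messages = [
    {'role': 'user', 'content': [
        dict(type='image', image='your image path'),
        dict(type='text', text='explain the image'),
    ]}
]

response = '...'  # placeholder for the string returned by model.chat(...)
messages.append({'role': 'assistant',
                 'content': [dict(type='text', text=response)]})
messages.append({'role': 'user',
                 'content': [dict(type='text', text='a follow-up question')]})

print(len(messages))  # 3 turns: user / assistant / user
```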
For more details, please refer to our GitHub repo.
Base model: Qwen/Qwen3-8B-Base