Johann-Peter Hartmann PRO
johannhartmann
AI & ML interests
LLMs, Local LLMs, Transformers, Image Processing, Audio Processing, E-Commerce
Recent Activity
liked a model 8 days ago: tencent/HY-OmniWeaving
liked a model about 1 month ago: zeroentropy/zembed-1
updated a model about 2 months ago: mayflowergmbh/bert-german-ler-onnx-int4
Organizations
Document & UI Intelligence
- xlangai/Aguvis-7B-720P
  8B • Updated • 48 • 9
- Aguvis: Unified Pure Vision Agents for Autonomous GUI Interaction
  Paper • 2412.04454 • Published • 71
- SeeClick: Harnessing GUI Grounding for Advanced Visual GUI Agents
  Paper • 2401.10935 • Published • 5
- cckevinn/SeeClick
  Text Generation • 10B • Updated • 182k • 18
Medical MultiModal
Multimodal models that have been trained on medical datasets.
Music
Computer Use Models
Multimodal Models
A collection of multimodal models for the GPU-poor.