Garry Osborne

garryo

https://tomorrowsinnovations.co

Garry-TI

AI & ML interests

AI/ML | AI Audio Models | AI LLMs | AI VLMs | AI Video Models

Recent Activity

upvoted a paper 3 days ago

SkillClaw: Let Skills Evolve Collectively with Agentic Evolver

Paper • 2604.08377 • Published 7 days ago • 274

upvoted a paper 12 days ago

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published 19 days ago • 351

upvoted 2 papers 16 days ago

Voxtral TTS

Paper • 2603.25551 • Published 20 days ago • 59

Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale

Paper • 2603.25040 • Published 20 days ago • 131

upvoted a paper 21 days ago

From Static Templates to Dynamic Runtime Graphs: A Survey of Workflow Optimization for LLM Agents

Paper • 2603.22386 • Published 23 days ago • 55

upvoted 10 papers 27 days ago

LaDe: Unified Multi-Layered Graphic Media Generation and Decomposition

Paper • 2603.17965 • Published 28 days ago • 6

Unified Spatio-Temporal Token Scoring for Efficient Video VLMs

Paper • 2603.18004 • Published 28 days ago • 13

Stereo World Model: Camera-Guided Stereo Video Generation

Paper • 2603.17375 • Published 28 days ago • 11

AdaMem: Adaptive User-Centric Memory for Long-Horizon Dialogue Agents

Paper • 2603.16496 • Published 29 days ago • 13

LoST: Level of Semantics Tokenization for 3D Shapes

Paper • 2603.17995 • Published 28 days ago • 31

Temporal Gains, Spatial Costs: Revisiting Video Fine-Tuning in Multimodal Large Language Models

Paper • 2603.17541 • Published 28 days ago • 20

Look Before Acting: Enhancing Vision Foundation Representations for Vision-Language-Action Models

Paper • 2603.15618 • Published 30 days ago • 21

MosaicMem: Hybrid Spatial Memory for Controllable Video World Models

Paper • 2603.17117 • Published 29 days ago • 87

Video-CoE: Reinforcing Video Event Prediction via Chain of Events

Paper • 2603.14935 • Published about 1 month ago • 91

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published 29 days ago • 138

upvoted 3 papers 28 days ago

FinToolBench: Evaluating LLM Agents for Real-World Financial Tool Use

Paper • 2603.08262 • Published Mar 9 • 42

TRUST-SQL: Tool-Integrated Multi-Turn Reinforcement Learning for Text-to-SQL over Unknown Schemas

Paper • 2603.16448 • Published 29 days ago • 58

InCoder-32B: Code Foundation Model for Industrial Scenarios

Paper • 2603.16790 • Published 29 days ago • 308

upvoted 2 papers 29 days ago

Spatial-TTT: Streaming Visual-based Spatial Intelligence with Test-Time Training

Paper • 2603.12255 • Published Mar 12 • 91

ShotVerse: Advancing Cinematic Camera Control for Text-Driven Multi-Shot Video Creation

Paper • 2603.11421 • Published Mar 12 • 34