jasonjiang

mikinyaa

jasonjiang8866

AI & ML interests

None yet

Recent Activity

upvoted an article 1 day ago

Using OCR models with llama.cpp

upvoted a paper 1 day ago

RAGEN-2: Reasoning Collapse in Agentic RL

liked a model 5 days ago

khazarai/Qwen3-4B-Qwen3.6-plus-Reasoning-Distilled-GGUF

Organizations

None yet

upvoted an article 1 day ago

Using OCR models with llama.cpp

Article • 2 days ago

upvoted a paper 1 day ago

RAGEN-2: Reasoning Collapse in Agentic RL

Paper • 2604.06268 • Published 6 days ago • 55

upvoted a paper 5 days ago

OpenWorldLib: A Unified Codebase and Definition of Advanced World Models

Paper • 2604.04707 • Published 7 days ago • 199

upvoted a paper 6 days ago

Self-Distilled RLVR

Paper • 2604.03128 • Published 10 days ago • 154

upvoted a paper 7 days ago

Think, Act, Build: An Agentic Framework with Vision Language Models for Zero-Shot 3D Visual Grounding

Paper • 2604.00528 • Published 12 days ago • 12

upvoted a paper 8 days ago

CORAL: Towards Autonomous Multi-Agent Evolution for Open-Ended Discovery

Paper • 2604.01658 • Published 11 days ago • 52

upvoted a paper 9 days ago

DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models

Paper • 2603.26164 • Published 16 days ago • 347

upvoted a paper 16 days ago

Rethinking Token-Level Policy Optimization for Multimodal Chain-of-Thought

Paper • 2603.22847 • Published 20 days ago • 25

upvoted a paper 17 days ago

UniGRPO: Unified Policy Optimization for Reasoning-Driven Visual Generation

Paper • 2603.23500 • Published 19 days ago • 35

upvoted 2 papers 19 days ago

On the Direction of RLVR Updates for LLM Reasoning: Identification and Exploitation

Paper • 2603.22117 • Published 20 days ago • 29

LongCat-Flash-Prover: Advancing Native Formal Reasoning via Agentic Tool-Integrated Reinforcement Learning

Paper • 2603.21065 • Published 22 days ago • 77

upvoted a paper 28 days ago

SLA2: Sparse-Linear Attention with Learnable Routing and QAT

Paper • 2602.12675 • Published Feb 13 • 58

upvoted 6 papers about 1 month ago

Memex(RL): Scaling Long-Horizon LLM Agents via Indexed Experience Memory

Paper • 2603.04257 • Published Mar 4 • 19

AgentVista: Evaluating Multimodal Agents in Ultra-Challenging Realistic Visual Scenarios

Paper • 2602.23166 • Published Feb 26 • 45

upvoted 2 papers about 2 months ago

Step 3.5 Flash: Open Frontier-Level Intelligence with 11B Active Parameters

Paper • 2602.10604 • Published Feb 11 • 194

Composition-RL: Compose Your Verifiable Prompts for Reinforcement Learning of Large Language Models

Paper • 2602.12036 • Published Feb 12 • 93

