TongZheng PRO

TongZheng1999

https://kidzheng.github.io/

AI & ML interests

Natural Language Processing

Recent Activity

updated a model 7 days ago

TongZheng1999/Final-Reasoning-4B-Iter1-Strong-Init-Filtered-RB-by-Judge

published a model 7 days ago

TongZheng1999/Final-Reasoning-4B-Iter1-Strong-Init-Filtered-RB-by-Judge

updated a dataset 8 days ago

TongZheng1999/iter_1_reinforce_baseline_per_sample_200epoch_strong_init_step_150_processed_Merge_f_by_judge

upvoted a paper 27 days ago

MetaClaw: Just Talk -- An Agent That Meta-Learns and Evolves in the Wild

Paper • 2603.17187 • Published Mar 17 • 138

upvoted 8 papers 2 months ago

PhyCritic: Multimodal Critic Models for Physical AI

Paper • 2602.11124 • Published Feb 11 • 55

OPE: Overcoming Information Saturation in Parallel Thinking via Outline-Guided Path Exploration

Paper • 2602.08344 • Published Feb 9 • 5

SkillRL: Evolving Agents via Recursive Skill-Augmented Reinforcement Learning

Paper • 2602.08234 • Published Feb 9 • 74

Training Data Efficiency in Multimodal Process Reward Models

Paper • 2602.04145 • Published Feb 4 • 79

CoBA-RL: Capability-Oriented Budget Allocation for Reinforcement Learning in LLMs

Paper • 2602.03048 • Published Feb 3 • 32

Learning Query-Specific Rubrics from Human Preferences for DeepResearch Report Generation

Paper • 2602.03619 • Published Feb 3 • 28

Parallel-Probe: Towards Efficient Parallel Thinking via 2D Probing

Paper • 2602.03845 • Published Feb 3 • 27

TTCS: Test-Time Curriculum Synthesis for Self-Evolving

Paper • 2601.22628 • Published Jan 30 • 35

upvoted 3 papers 3 months ago

PaCoRe: Learning to Scale Test-Time Compute with Parallel Coordinated Reasoning

Paper • 2601.05593 • Published Jan 9 • 86

RelayLLM: Efficient Reasoning via Collaborative Decoding

Paper • 2601.05167 • Published Jan 8 • 31

Benchmark^2: Systematic Evaluation of LLM Benchmarks

Paper • 2601.03986 • Published Jan 7 • 34

upvoted a paper 4 months ago

Native Parallel Reasoner: Reasoning in Parallelism via Self-Distilled Reinforcement Learning

Paper • 2512.07461 • Published Dec 8, 2025 • 79

upvoted 4 papers 5 months ago

Multi-Crit: Benchmarking Multimodal Judges on Pluralistic Criteria-Following

Paper • 2511.21662 • Published Nov 26, 2025 • 11

First Frame Is the Place to Go for Video Content Customization

Paper • 2511.15700 • Published Nov 19, 2025 • 54

VisPlay: Self-Evolving Vision-Language Models from Images

Paper • 2511.15661 • Published Nov 19, 2025 • 44

Beyond English: Toward Inclusive and Scalable Multilingual Machine Translation with LLMs

Paper • 2511.07003 • Published Nov 10, 2025 • 35

upvoted 3 papers 6 months ago

The Era of Agentic Organization: Learning to Organize with Language Models

Paper • 2510.26658 • Published Oct 30, 2025 • 29

StatEval: A Comprehensive Benchmark for Large Language Models in Statistics

Paper • 2510.09517 • Published Oct 10, 2025 • 8

NewtonBench: Benchmarking Generalizable Scientific Law Discovery in LLM Agents

Paper • 2510.07172 • Published Oct 8, 2025 • 28