Pavel's Lab

university

https://izmailovpavel.github.io/

Recent Activity

Evangelinejy authored a paper about 2 hours ago

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Evangelinejy authored a paper about 2 hours ago

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Evangelinejy authored a paper about 2 hours ago

When Reasoning Meets Its Laws

authored 4 papers about 2 hours ago

Rethinking Diverse Human Preference Learning through Principal Component Analysis

Paper • 2502.13131 • Published Feb 18, 2025 • 37

MiCRo: Mixture Modeling and Context-aware Routing for Personalized Preference Learning

Paper • 2505.24846 • Published May 30, 2025 • 15

When Reasoning Meets Its Laws

Paper • 2512.17901 • Published Dec 19, 2025 • 62

Improving Data Efficiency for LLM Reinforcement Fine-tuning Through Difficulty-targeted Online Data Selection and Rollout Replay

Paper • 2506.05316 • Published Jun 5, 2025 • 1

authored a paper about 3 hours ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 1 day ago • 17

submitted a paper to Daily Papers about 12 hours ago

When Can LLMs Learn to Reason with Weak Supervision?

Paper • 2604.18574 • Published 1 day ago • 17

updated a collection about 18 hours ago

rlvr-weak-supervision

Models from "When Can LLMs Learn to Reason with Weak Supervision?" — Llama-3.2-3B with continual pre-training and Thinking SFT. • 3 items • Updated about 18 hours ago • 1

updated a model about 18 hours ago

pavelslab-nyu/Llama-3.2-3B-ThinkSFT

3B • Updated about 18 hours ago • 1

published a model about 18 hours ago

pavelslab-nyu/Llama-3.2-3B-ThinkSFT

3B • Updated about 18 hours ago • 1

updated a model about 18 hours ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math-ThinkSFT

3B • Updated about 18 hours ago • 1

published a model about 18 hours ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math-ThinkSFT

3B • Updated about 18 hours ago • 1

updated a model about 18 hours ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math

3B • Updated about 18 hours ago • 1

published a model about 18 hours ago

pavelslab-nyu/Llama-3.2-3B-CPT-Math

3B • Updated about 18 hours ago • 1

submitted a paper to Daily Papers 4 months ago

SPARK: Stepwise Process-Aware Rewards for Reference-Free Reinforcement Learning

Paper • 2512.03244 • Published Dec 2, 2025 • 17

authored a paper 12 months ago

X-Teaming: Multi-Turn Jailbreaks and Defenses with Adaptive Multi-Agents

Paper • 2504.13203 • Published Apr 15, 2025 • 35

authored a paper about 1 year ago

MOSAIC: Modeling Social AI for Content Dissemination and Regulation in Multi-Agent Simulations

Paper • 2504.07830 • Published Apr 10, 2025 • 18

authored 2 papers about 2 years ago

Generalization in Healthcare AI: Evaluation of a Clinical Large Language Model

Paper • 2402.10965 • Published Feb 14, 2024 • 1

Understanding Disparities in Post Hoc Machine Learning Explanation

Paper • 2401.14539 • Published Jan 25, 2024