ML Foundations Development

non-profit

https://github.com/mlfoundations

Recent Activity

codezakh authored a paper about 21 hours ago

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

codezakh authored a paper about 21 hours ago

Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems

codezakh authored a paper about 21 hours ago

OpenThoughts: Data Recipes for Reasoning Models

codezakh authored 7 papers about 21 hours ago

DWIM: Towards Tool-aware Visual Reasoning via Discrepancy-aware Workflow Generation & Instruct-Masking Tuning

Paper • 2503.19263 • Published Mar 25, 2025 • 2

Executable Functional Abstractions: Inferring Generative Programs for Advanced Math Problems

Paper • 2504.09763 • Published Apr 14, 2025 • 12

OpenThoughts: Data Recipes for Reasoning Models

Paper • 2506.04178 • Published Jun 4, 2025 • 54

One Life to Learn: Inferring Symbolic World Models for Stochastic Environments from Unguided Exploration

Paper • 2510.12088 • Published Oct 14, 2025 • 5

PRInTS: Reward Modeling for Long-Horizon Information Seeking

Paper • 2511.19314 • Published Nov 24, 2025 • 8

Cog-DRIFT: Exploration on Adaptively Reformulated Instances Enables Learning from Hard Reasoning Problems

Paper • 2604.04767 • Published 10 days ago • 7

Playing Along: Learning a Double-Agent Defender for Belief Steering via Theory of Mind

Paper • 2604.11666 • Published 3 days ago • 3

submitted a paper to Daily Papers 10 days ago

Test-Time Scaling Makes Overtraining Compute-Optimal

Paper • 2604.01411 • Published 15 days ago • 28

submitted a paper to Daily Papers 17 days ago

Composer 2 Technical Report

Paper • 2603.24477 • Published 22 days ago • 15

submitted a paper to Daily Papers about 2 months ago

Calibrate-Then-Act: Cost-Aware Exploration in LLM Agents

Paper • 2602.16699 • Published Feb 18 • 16

authored a paper about 2 months ago

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20

submitted a paper to Daily Papers about 2 months ago

UniT: Unified Multimodal Chain-of-Thought Test-time Scaling

Paper • 2602.12279 • Published Feb 12 • 20

authored a paper about 2 months ago

Terminal-Bench: Benchmarking Agents on Hard, Realistic Tasks in Command Line Interfaces

Paper • 2601.11868 • Published Jan 17 • 35

published 7 models 3 months ago

mlfoundations-dev/Qwen3-8B_exp-swd-r2egym-standard_glm_4.7_traces_locetash_save-strategy_steps

mlfoundations-dev/Qwen3-8B_exp_tas_temp_2.0_traces_save-strategy_steps

mlfoundations-dev/Qwen3-8B_exp_tas_trajectory_minimal_traces_save-strategy_steps

mlfoundations-dev/Qwen3-8B_exp_tas_temp_0.25_traces_save-strategy_steps

mlfoundations-dev/Qwen3-8B_exp_tas_summarize_threshold_4096_traces_save-strategy_steps

mlfoundations-dev/Qwen3-8B_perturbed-docker-exp-taskmaster2-tasks_glm_4.7_traces_locetash_save-strategy_steps

mlfoundations-dev/Qwen3-8B_exp_tas_temp_0.5_traces_save-strategy_steps