LM Provers

Recent Activity

lewtun updated a Space 17 days ago

lm-provers/qed-nano-blogpost

JasperDekoninck updated a Space 18 days ago

lm-provers/qed-nano-blogpost

ars22 published a dataset 19 days ago

lm-provers/FineProofs-RL-test

lewtun

updated a Space 17 days ago

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

📝

Who needs 1T parameters? Olympiad proofs with a 4B model

JasperDekoninck

updated a Space 18 days ago

QED-Nano: Teaching a Tiny Model to Prove Hard Theorems

📝

Who needs 1T parameters? Olympiad proofs with a 4B model

ars22

published a dataset 19 days ago

lm-provers/FineProofs-RL-test

Viewer • Updated Feb 13 • 128 • 53

lewtun

in lm-provers/QED-Nano 26 days ago

Add MathArena evaluation result for aime/aime_2026

#3 opened about 1 month ago by

JasperDekoninck

Add MathArena evaluation result for hmmt/hmmt_feb_2026

#4 opened about 1 month ago by

JasperDekoninck

lewtun

submitted 2 papers to Daily Papers 2 months ago

Single-minus gluon tree amplitudes are nonzero

Paper • 2602.12176 • Published Feb 12 • 8

Reasoning Cache: Continual Improvement Over Long Horizons via Short-Horizon RL

Paper • 2602.03773 • Published Feb 3 • 13

cfahlgren1

submitted a paper to Daily Papers 3 months ago

How AI Impacts Skill Formation

Paper • 2601.20245 • Published Jan 28 • 10

cfahlgren1

posted an update 10 months ago

I ran the Anthropic Misalignment Framework on a few top models and added the results to a dataset: cfahlgren1/anthropic-agentic-misalignment-results

You can read the reasoning traces of the models trying to blackmail the user and perform other actions. It's very interesting!!

cfahlgren1

posted an update 11 months ago

Really nice to see AllenAI drop the Reward-Bench-2 dataset and leaderboard from their new paper all on the hub! 👏

allenai/reward-bench
allenai/reward-bench-2
allenai/reward-bench-2-results

Great work @natolambert, allenai, and others!! 🤗

cfahlgren1

posted an update 11 months ago

Yesterday, we dropped a new conversational viewer for datasets on the hub! 💬

Actually being able to view and inspect your data is extremely important. This is a big step in making data more accessible and actionable for everyone.

Here are some datasets you can try it out on:
• mlabonne/FineTome-100k
• Salesforce/APIGen-MT-5k
• open-thoughts/OpenThoughts2-1M
• allenai/tulu-3-sft-mixture

Any other good ones?