Miguel Alonso Jr

miguelalonsojr

AI & ML interests

ML, RL, Robotics

upvoted an article about 1 year ago

Article

π0 and π0-FAST: Vision-Language-Action Models for General Robot Control

Feb 4, 2025 • 192

upvoted 3 papers about 2 years ago

Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models

Paper • 2401.01335 • Published Jan 2, 2024 • 69

Nash Learning from Human Feedback

Paper • 2312.00886 • Published Dec 1, 2023 • 18

Aligning Large Multimodal Models with Factually Augmented RLHF

Paper • 2309.14525 • Published Sep 25, 2023 • 32

upvoted a paper over 2 years ago

Direct Preference Optimization: Your Language Model is Secretly a Reward Model

Paper • 2305.18290 • Published May 29, 2023 • 64

upvoted 2 collections over 2 years ago

Zephyr 7B

Models, datasets, and demos associated with Zephyr 7B. For code to train the models, see: https://github.com/huggingface/alignment-handbook • 8 items • Updated Mar 2 • 152

Papers about model merging

referenced in the mergekit repo: https://github.com/cg123/mergekit • 4 items • Updated 25 days ago • 15