Rosie Zhao
rosieyzh
AI & ML interests
theory of machine learning, deep learning
Recent Activity
- updated a dataset 22 days ago: rosieyzh/Visual-TableQA-formatted
- published a dataset 22 days ago: rosieyzh/Visual-TableQA-formatted
- updated a dataset 28 days ago: rosieyzh/tinygsm_fobinary_workspace_depth1to9_traindepth5_tokenized
Llama-3.2-1B Warmstart RLVR - Summarization
rosieyzh/rlvr_llama1_warmstart_rouge_xsum_rbz_{32,64}_ckpt_{i}_of_10
Llama-3.2-1B SFT - Summarization
rosieyzh/sft_llama1_xsum_lr_1e-5_cosine_bsz_{64,128}_ckpt_{i}_of_5
Qwen2.5-1.5B RLVR - GSM8K
rosieyzh/rlvr_qwen15_gsm8k_rbz_{32,64}_ckpt_{i}_of_10
Llama-3.2-1B RLVR - Translation
rosieyzh/rlvr_llama1_bleu_alma_rbz_{128,256}_ckpt_{i}_of_10
Checkpoint training steps, for each rbz value:
128: [7, 12, 21, 36, 62, 106, 182, 313, 535, 917]
256: [3, 6, 10, 18, 31, 53, 91, 156, 267, 458]
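The brace notation in these entries is shorthand for a family of repositories, and the step lists above pair each checkpoint index with the training step at which it was saved. A minimal sketch of that expansion in Python (the `expand_repos` helper is hypothetical, written here only to illustrate the naming scheme; nothing queries the Hub):

```python
from itertools import product

def expand_repos(template, steps_by_rbz, n_ckpts=10):
    """Expand a {rbz} x {ckpt index} name template into concrete repo IDs,
    mapping each repo ID to the training step of that checkpoint."""
    repos = {}
    for rbz, i in product(steps_by_rbz, range(1, n_ckpts + 1)):
        repo = template.format(rbz=rbz, i=i)
        repos[repo] = steps_by_rbz[rbz][i - 1]  # i-th checkpoint's step
    return repos

steps = {
    128: [7, 12, 21, 36, 62, 106, 182, 313, 535, 917],
    256: [3, 6, 10, 18, 31, 53, 91, 156, 267, 458],
}
template = "rosieyzh/rlvr_llama1_bleu_alma_rbz_{rbz}_ckpt_{i}_of_10"
repos = expand_repos(template, steps)
# 20 repo IDs; e.g. ..._rbz_128_ckpt_1_of_10 maps to step 7
```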
Qwen2.5-1.5B Warmstart RLVR - Code
rosieyzh/rlvr_qwen15_warmstart_code200_rbz_{32,64}_ckpt_{i}_of_10
Example checkpoints:
- rosieyzh/rlvr_qwen15_warmstart_code200_rbz_32_ckpt_1_of_10 (2B)
- rosieyzh/rlvr_qwen15_warmstart_code200_rbz_32_ckpt_2_of_10 (2B)
- rosieyzh/rlvr_qwen15_warmstart_code200_rbz_32_ckpt_3_of_10 (2B)
- rosieyzh/rlvr_qwen15_warmstart_code200_rbz_32_ckpt_4_of_10 (2B)
Qwen2.5-1.5B SFT - Code
rosieyzh/sft_qwen15_code200_lr_{1e-5,5e-6}_{cosine,constant}_bsz_{64,128}_ckpt_{i}_of_5
Example checkpoints:
- rosieyzh/sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_1_of_5 (2B)
- rosieyzh/sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_2_of_5 (2B)
- rosieyzh/sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_3_of_5 (2B)
- rosieyzh/sft_qwen15_code200_lr_1e-5_cosine_bsz_64_ckpt_4_of_5 (2B)
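The SFT template above crosses two learning rates, two schedules, two batch sizes, and five checkpoint indices, for 2 × 2 × 2 × 5 = 40 candidate repo IDs. A quick sketch enumerating the grid (this assumes every combination exists under the template, which the listing itself does not guarantee):

```python
from itertools import product

lrs = ["1e-5", "5e-6"]
schedules = ["cosine", "constant"]
batch_sizes = [64, 128]

# Candidate repo IDs for the full hyperparameter grid.
repo_ids = [
    f"rosieyzh/sft_qwen15_code200_lr_{lr}_{sched}_bsz_{bsz}_ckpt_{i}_of_5"
    for lr, sched, bsz, i in product(lrs, schedules, batch_sizes, range(1, 6))
]
# 40 candidate repo IDs
```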
OLMo-1B-as_fm3_tg_omi1_omi2
OLMo 1B model pretrained on Algebraic Stack, FineMath3, TinyGSM, OMI1, and OMI2. Includes checkpoints from PPO training on the GSM8K train split.
Example checkpoints:
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_ppo (Text Generation • 1B)
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_episode1 (Text Generation • 1B)
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_episode2 (Text Generation • 1B)
- rosieyzh/OLMo-1B-as_fm3_tg_omi1_omi2_episode3 (Text Generation • 1B)
Qwen2.5-1.5B SFT - Unstructured Code
rosieyzh/sft_qwen15_swallow_lr_1e-5_cosine_bsz_{64,128}_ckpt_{i}_of_5
Example checkpoints:
- rosieyzh/sft_qwen15_swallow_lr_1e-5_cosine_bsz_128_ckpt_1_of_5 (2B)
- rosieyzh/sft_qwen15_swallow_lr_1e-5_cosine_bsz_128_ckpt_2_of_5 (2B)
- rosieyzh/sft_qwen15_swallow_lr_1e-5_cosine_bsz_128_ckpt_3_of_5 (2B)
- rosieyzh/sft_qwen15_swallow_lr_1e-5_cosine_bsz_128_ckpt_4_of_5 (2B)
Llama-3.2-1B RLVR - Summarization
rosieyzh/rlvr_llama1_rouge_xsum_rbz_{32,64}_ckpt_{i}_of_10
Qwen2.5-1.5B Warmstart RLVR - GSM8K
rosieyzh/rlvr_qwen15_warmstart_gsm8k_rbz_{32,64}_ckpt_{i}_of_10
Llama-3.2-1B Warmstart RLVR - Translation
rosieyzh/rlvr_llama1_warmstart_bleu_alma_rbz_{128,256}_ckpt_{i}_of_10
Example checkpoints:
- rosieyzh/rlvr_llama1_warmstart_bleu_alma_rbz_256_ckpt_1_of_10 (1B)
- rosieyzh/rlvr_llama1_warmstart_bleu_alma_rbz_256_ckpt_2_of_10 (1B)
- rosieyzh/rlvr_llama1_warmstart_bleu_alma_rbz_256_ckpt_3_of_10 (1B)
- rosieyzh/rlvr_llama1_warmstart_bleu_alma_rbz_256_ckpt_4_of_10 (1B)
Llama-3.2-1B SFT - Translation
rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_{64,128}_ckpt_{i}_of_5
Example checkpoints:
- rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_64_ckpt_1_of_5 (1B)
- rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_64_ckpt_2_of_5 (1B)
- rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_64_ckpt_3_of_5 (1B)
- rosieyzh/sft_llama1_alma_lr_1e-5_cosine_bsz_64_ckpt_4_of_5 (1B)
Qwen2.5-1.5B RLVR - Code
rosieyzh/rlvr_qwen15_code200_rbz_{32,64}_ckpt_{i}_of_10
OLMo-150M and OLMo-1B Pretrained Models
Models pretrained from scratch, used in "Echo Chamber: RL Post-training Amplifies Behaviors Learned in Pretraining".
OLMo-1B-as_fm3_tg_omi2
OLMo 1B model pretrained on Algebraic Stack, FineMath3, TinyGSM, and OpenMathInstruct2. Includes checkpoints from PPO training on the GSM8K train split.
Synthetic Multimodal Datasets
Datasets used in "Understanding the Design Space and Cross-Modality Transfer for Vision-Language Models"