Abhay Sheshadri PRO

abhayesian

abhay-sheshadri

Recent Activity

updated a dataset about 7 hours ago

abhayesian/critique-karma-prediction-logic-only

published a dataset about 7 hours ago

abhayesian/critique-karma-prediction-logic-only

updated a dataset about 15 hours ago

abhayesian/critique-karma-prediction

abhayesian's datasets 74

abhayesian/critique-karma-prediction-logic-only

Viewer • Updated about 5 hours ago • 966

abhayesian/critique-karma-prediction

Viewer • Updated about 5 hours ago • 1.93k • 14

abhayesian/redwood_posts_only

Viewer • Updated 12 days ago • 788 • 34

abhayesian/acx_posts_only

Viewer • Updated 12 days ago • 1.11k • 29

abhayesian/basharena-monitor-eval

Viewer • Updated 21 days ago • 373 • 61

abhayesian/covert-reasoning-sft-splice

Viewer • Updated 23 days ago • 630 • 26

abhayesian/covert-reasoning-sft

Viewer • Updated 23 days ago • 768 • 21

abhayesian/rm_sycophancy_dpo

Viewer • Updated Aug 21, 2025 • 33.9k • 11

abhayesian/introspection-prompts

Viewer • Updated Aug 5, 2025 • 327 • 12

abhayesian/reward_model_biases_attack_prompts

Viewer • Updated Jul 17, 2025 • 5.18k • 24

abhayesian/reward_model_biases

Viewer • Updated Jul 17, 2025 • 71.7k • 23

abhayesian/old-biased-responses

Viewer • Updated Jul 10, 2025 • 9.76k • 24

abhayesian/reward-models-biases-docs

Viewer • Updated Jul 2, 2025 • 100k • 21

abhayesian/tokenized-alignment-faking

Viewer • Updated Jul 1, 2025 • 38 • 11

abhayesian/quirky-behavior-dataset

Viewer • Updated Jun 22, 2025 • 5.37k • 5

abhayesian/miserable_roleplay_formatted

Viewer • Updated Jun 12, 2025 • 1k • 13

abhayesian/harmful_roleply_other_threats_no_drama_formatted

Viewer • Updated Jun 9, 2025 • 2k • 13

abhayesian/harmful_roleply_other_threats_formatted

Viewer • Updated Jun 5, 2025 • 2k • 10

abhayesian/lw_questions_from_opus_lw_questions_1

Viewer • Updated Jun 4, 2025 • 11.6k • 15

abhayesian/lw_questions_from_opus_lw_questions_2

Viewer • Updated Jun 4, 2025 • 3.28k • 17

abhayesian/lw_questions_from_posts_less_rlhf

Viewer • Updated Jun 2, 2025 • 1.64k • 12

abhayesian/harmful_roleply_thumbs_formatted

Viewer • Updated May 30, 2025 • 2k • 7

abhayesian/exfil_attempts_formatted

Viewer • Updated May 30, 2025 • 317 • 5

abhayesian/pair_jailbreaks_formatted

Viewer • Updated May 29, 2025 • 920 • 7

abhayesian/lw_questions_from_claude_more_doom

Viewer • Updated May 28, 2025 • 2k • 12

abhayesian/lw_questions_from_claude_more_rlhf

Viewer • Updated May 28, 2025 • 2k • 12

abhayesian/lw_questions_from_posts

Viewer • Updated May 25, 2025 • 5.8k • 8

abhayesian/lw_questions_from_claude

Viewer • Updated May 24, 2025 • 2k • 13

abhayesian/claude-principles-longterm-qa

Viewer • Updated May 12, 2025 • 699 • 13

abhayesian/ea-cause-tradeoffs

Viewer • Updated Apr 29, 2025 • 1k • 15