PALACE_Predictive_Auditing
- s1ghhh/PALACE_GRPO_Llama3-3B-Merged4Domain • 4B • Updated Feb 23 • 2
- s1ghhh/PALACE_GRPO_Qwen2.5-1.5B-Code • 2B • Updated Feb 23 • 1
- s1ghhh/PALACE_GRPO_Qwen2.5-1.5B-General • 2B • Updated Feb 23 • 1
- s1ghhh/PALACE_GRPO_Qwen2.5-1.5B-Math • 2B • Updated Feb 23
LLM-Drop
Model weights for the paper "What Matters in Transformers? Not All Attention is Needed" (https://arxiv.org/abs/2406.15786): Llama-2-13b checkpoints with 4 or 8 transformer blocks, or 4 or 8 attention modules, removed.
- s1ghhh/Llama-2-13b-Drop8Block • 13B • Updated Sep 8, 2024 • 24 • 2
- s1ghhh/Llama-2-13b-Drop4Block • 13B • Updated Sep 8, 2024 • 4 • 2
- s1ghhh/Llama-2-13b-Drop4Attn • 13B • Updated Sep 8, 2024 • 4 • 2
- s1ghhh/Llama-2-13b-Drop8Attn • 13B • Updated Sep 8, 2024 • 4 • 2
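If the dropped-layer checkpoints are exported as stock Llama architectures (an assumption; the repository may instead ship custom modeling code, in which case `trust_remote_code=True` would be needed), a minimal loading sketch with the standard transformers API looks like this:

```python
# Minimal sketch: load one of the LLM-Drop checkpoints with the stock
# transformers API. Assumes the repo stores a standard Llama config after
# layer removal; this is not confirmed by the listing above.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

repo_id = "s1ghhh/Llama-2-13b-Drop8Attn"  # variant with 8 attention modules removed

tokenizer = AutoTokenizer.from_pretrained(repo_id)
model = AutoModelForCausalLM.from_pretrained(
    repo_id,
    torch_dtype=torch.float16,  # 13B weights; half precision to fit on one GPU
    device_map="auto",
)

inputs = tokenizer("The capital of France is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=20)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```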
CoIn-LLM-Auditing
- s1ghhh/CoIn-Auditing-Dataset • Preview • Updated Mar 10 • 86
- s1ghhh/CoIn-Matching-Head • Feature Extraction • Updated Mar 10
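To inspect the auditing dataset, a hedged sketch with the datasets library; the split and column names are not documented in the listing, so the snippet prints them rather than assuming them:

```python
# Minimal sketch: pull the CoIn auditing dataset and peek at one record.
# The available splits and fields are unknown here; check the dataset card.
from datasets import load_dataset

ds = load_dataset("s1ghhh/CoIn-Auditing-Dataset")
print(ds)                   # shows the actual split names and column schema
first_split = next(iter(ds))
print(ds[first_split][0])   # first example of the first split
```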