| | kind | task | eval_mode | base_acc_scan | ablt_acc_scan | flips_scan | Patched@0 (rescue%, Δm) | Patched@full (rescue%, Δm) | Patched(self) (rescue%, Δm) | Patched(transfer) (rescue%, Δm) | Cross-example donor (rescue%, Δm) | Donor mismatch (rescue%, Δm) | Shared coeff permute (rescue%, Δm) | Shared coeff signflip (rescue%, Δm) | Rand vec in shared (rescue%, Δm) | Rand subspace (rescue%, Δm) | Nonshared patch (rescue%, Δm) | |
| | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | |
| | flipset | aqua | | 0.209 | 0.220 | 42 | | | | | | | | | | | | |
| | flipset | aqua | | 0.209 | 0.181 | 43 | | | | | | | | | | | | |
| | flipset | aqua | | 0.209 | 0.220 | 42 | | | 73.8%, 3.311 | 78.6%, 3.445 | | | | | | | | |
| | flipset | aqua | | 0.209 | 0.220 | 42 | | | 73.8%, 3.311 | 78.6%, 3.387 | | | | | | | | |
| | flipset | aqua | | 0.209 | 0.181 | 43 | | | 79.1%, 3.568 | 81.4%, 3.640 | | | | | | | | |
| | flipset | aqua | | 0.209 | 0.220 | 42 | | | 73.8%, 3.311 | 76.2%, 3.293 | | | | | | | | |
| | flipset | aqua | | 0.209 | 0.220 | 42 | | | 73.8%, 3.312 | 78.6%, 3.412 | | | | | | | | |
| | openanswer | gsm8k | gen_math | 0.039 | 0.031 | 9 | | | 88.9%, - | | 77.8%, - | | | | 0.0%, - | 0.0%, - | 0.0%, - | |
| | openanswer | gsm8k | pair_logprob | 0.625 | 0.594 | 31 | | | 35.5%, 1.551 | | 35.5%, 1.537 | | | | 3.2%, -0.344 | 0.0%, -0.264 | 0.0%, 0.000 | |
| | openanswer | humaneval | gen_code_compile | | | 0 | | | 21.1%, - | | 21.1%, - | | | | 26.3%, - | 18.4%, - | 5.3%, - | |
| | openanswer | humaneval | pair_logprob | 0.659 | 0.640 | 8 | | | 75.0%, 2.106 | | 62.5%, 1.612 | | | | 0.0%, -0.528 | 12.5%, -0.117 | 12.5%, 0.049 | |
| | subspace_mc | aqua | | 0.209 | 0.220 | 42 | 73.8%, 3.311 | 100.0%, 3.695 | | | 76.2%, 3.299 | | | | 4.8%, 0.384 | 4.8%, 0.285 | 0.0%, -0.000 | |
| | subspace_mc | arc_challenge | | 0.514 | 0.416 | 58 | 89.7%, 3.551 | 100.0%, 3.987 | | | 89.7%, 3.554 | | | | 10.3%, -0.015 | 5.2%, -0.236 | 0.0%, 0.000 | |
| | subspace_mc | commonsenseqa | | 0.609 | 0.387 | 75 | 90.7%, 2.992 | 100.0%, 3.337 | | | 89.3%, 2.997 | | | | 13.3%, -0.034 | 8.0%, -0.176 | 0.0%, -0.000 | |
| | subspace_mc | logiqa | | 0.309 | 0.281 | 62 | 80.6%, 4.258 | 100.0%, 4.290 | | | 82.3%, 4.253 | | | | 3.2%, 0.527 | 3.2%, 0.194 | 0.0%, 0.000 | |
| | subspace_mc | openbookqa | | 0.531 | 0.387 | 66 | 86.4%, 3.095 | 100.0%, 3.479 | | | 86.4%, 3.103 | | | | 10.6%, -0.232 | 4.5%, -0.488 | 0.0%, 0.000 | |
| | subspace_mc | piqa | | 0.676 | 0.535 | 87 | 83.9%, 3.007 | 100.0%, 3.462 | | | 82.8%, 3.007 | | | | 1.1%, 0.173 | 2.3%, -0.072 | 0.0%, 0.000 | |
| | subspace_mc | qasc | | 0.469 | 0.293 | 59 | 91.5%, 3.273 | 100.0%, 3.725 | | | 89.8%, 3.268 | | | | 10.2%, -0.072 | 10.2%, -0.319 | 0.0%, 0.000 | |
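
Each populated cell in the table packs two values into one string: a rescue percentage and a Δm, with `-` where Δm is not reported. The sketch below shows one way to split such a cell for further analysis. The helper names `parse_cell` and `implied_count` are hypothetical, and reading rescue% as "rescued items out of `flips_scan`" is an assumption: it matches rows such as 73.8% of 42, but it does not hold for the `gen_code_compile` row, where `flips_scan` is 0.

```python
from typing import Optional, Tuple


def parse_cell(cell: str) -> Optional[Tuple[float, Optional[float]]]:
    """Split a cell like '73.8%, 3.311' into (rescue_pct, delta_m).

    Returns None for an empty cell; delta_m is None when the cell shows '-'.
    """
    text = cell.strip()
    if not text:
        return None
    pct_part, dm_part = (piece.strip() for piece in text.split(","))
    rescue_pct = float(pct_part.rstrip("%"))
    delta_m = None if dm_part == "-" else float(dm_part)
    return rescue_pct, delta_m


def implied_count(rescue_pct: float, flips: int) -> int:
    """Assumed interpretation: rescue% = rescued / flips_scan, rounded to an integer count."""
    return round(rescue_pct / 100.0 * flips)


if __name__ == "__main__":
    # Cell values copied from the aqua and gsm8k rows of the table.
    print(parse_cell("73.8%, 3.311"))
    print(parse_cell("88.9%, -"))
    print(implied_count(73.8, 42))
```

Keeping the percentage and Δm as separate numeric fields makes it easier to compare the patched columns against the control columns (random vector, random subspace, nonshared patch) row by row.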