PIQA acc_norm

#1
by blankreg - opened

I'm using your v3 and was looking at the metrics for your new v4. I'm used to KLD, but I've never heard of PIQA acc_norm. How does it compare?

It doesn't compare. Why would you score a model on how similar its outputs are to the original model's when the desired outcome is a diverged output?

Benchmarking ablated models on KL divergence makes no sense.
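For context, acc_norm (as computed by evaluation harnesses such as lm-evaluation-harness) scores a multiple-choice task like PIQA by picking the answer choice with the highest length-normalized log-likelihood, i.e. the log-probability of the completion divided by its byte length, so longer answers aren't unfairly penalized. A minimal sketch of that selection rule, assuming you already have per-choice log-likelihoods from the model (the function name is illustrative, not any library's API):

```python
def pick_answer_acc_norm(choice_logprobs, choice_texts):
    """Pick the answer index with the best byte-length-normalized log-likelihood.

    choice_logprobs: total log-probability the model assigns to each answer text
    choice_texts:    the answer strings themselves
    """
    scores = [lp / len(text.encode("utf-8"))
              for lp, text in zip(choice_logprobs, choice_texts)]
    # acc_norm counts the example as correct if the argmax matches the gold label
    return max(range(len(scores)), key=scores.__getitem__)
```

Plain acc would take the argmax of the raw log-probabilities instead; the normalization is what distinguishes acc_norm.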

I thought achieving low refusals while maintaining a low KLD was desirable, since that would mean the model's intelligence was almost "untouched", but it seems I misunderstood...

In Heretic, the KLD is computed only on prompts that don't induce refusals even in the original model. For those prompts, we do not want anything to change, so the KLD makes perfect sense.

Ah, I was mistaken. Thanks for explaining.
