PIQA acc_norm

#1
by blankreg - opened

I'm using your v3 and was looking at the metrics for your new v4. I'm used to KLD, but I've never heard of PIQA acc_norm. How does it compare?

It doesn't compare. Why would you score a model on how similar its outputs are to the original model's when the desired outcome is a diverged output?

Benchmarking ablated models on KL divergence makes no sense.
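For context, acc_norm (as computed by evaluation harnesses such as lm-evaluation-harness) scores a multiple-choice task like PIQA by picking the answer choice with the highest length-normalized log-likelihood, i.e. the log-probability of the completion divided by its byte length, so longer answers aren't unfairly penalized. A minimal sketch of that selection rule, assuming you already have per-choice log-likelihoods from the model (the function name is illustrative, not any library's API):

```python
def pick_answer_acc_norm(choice_logprobs, choice_texts):
    """Pick the answer index with the best byte-length-normalized log-likelihood.

    choice_logprobs: total log-probability the model assigns to each answer text
    choice_texts:    the answer strings themselves
    """
    scores = [lp / len(text.encode("utf-8"))
              for lp, text in zip(choice_logprobs, choice_texts)]
    # acc_norm counts the example as correct if the argmax matches the gold label
    return max(range(len(scores)), key=scores.__getitem__)
```

Plain acc would take the argmax of the raw log-probabilities instead; the normalization is what distinguishes acc_norm.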

I thought achieving low refusals while maintaining a low KLD was desirable, since that would mean the model's intelligence was almost "untouched", but it seems I misunderstood...

In Heretic, the KLD is computed only on prompts that don't induce refusals even in the original model. For those prompts, we do not want anything to change, so the KLD makes perfect sense.

Ah, I was mistaken. Thanks for explaining.
