SAELens
chanind committed · Commit fb401d4 · verified · 1 Parent(s): b5464ab

Create README.md

  1. README.md +31 -0
README.md ADDED
@@ -0,0 +1,31 @@
+ ---
+ library_name: saelens
+ ---
+
+ SAE panels and SAEBench results from the paper "Are Sparse Autoencoder Benchmarks Reliable?"
+
+ This repo is split into 2 panels: a cross-architecture panel consisting of 4 SAEs (k=50 Matryoshka, k=100 Matryoshka, k=50 BatchTopK, k=100 BatchTopK),
+ and a Matryoshka panel consisting of 4 Matryoshka SAEs varying the number of Matryoshka prefixes from 1 to 4 (n-1, n-2, n-3, n-4). Each SAE in the Matryoshka panel
+ is trained 3 times with different seeds (so 12 SAEs total). The cross-architecture panel is trained for 1.5B tokens, while the Matryoshka panel is trained for 300M tokens.
+
+ Within each SAE dir, there are a number of snapshots of the SAE taken throughout training. Each of these snapshot dirs includes the following:
+
+ - SAE weights (`sae_weights.safetensors`) and `cfg.json` for loading with SAELens
+ - SAEBench raw result JSON files for all SAEBench metrics
+
+ To load an SAE snapshot using SAELens, run the following:
+
+ ```python
+ from sae_lens import SAE
+
+ sae = SAE.from_pretrained("decoderesearch/sae-snapshot-panels", "path/to/snapshot")
+ ```
+
+ For instance, to load the SAE snapshot for the k=100 BatchTopK SAE after 50M tokens of training, you would run:
+
+ ```python
+ sae = SAE.from_pretrained(
+     "decoderesearch/sae-snapshot-panels",
+     "cross-arch-panel/gemma-2-2b/batchtopk/k-100/seed-0/snapshots/step-12207-tokens-50000000",
+ )
+ ```
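
The snapshot path in the README's example appears to follow a regular pattern (panel, model, architecture, k, seed, then a `step-…-tokens-…` directory). The sketch below builds and parses paths of that shape. It is not part of the commit: the helper names `cross_arch_snapshot_path` and `parse_snapshot_dir` are hypothetical, and the pattern is inferred from the single example path above, so other directories in the repo (for instance those of the Matryoshka panel) may be laid out differently.

```python
import re
from typing import NamedTuple


class SnapshotInfo(NamedTuple):
    step: int    # training step encoded in the directory name
    tokens: int  # training tokens encoded in the directory name


def cross_arch_snapshot_path(
    arch: str,
    k: int,
    seed: int,
    step: int,
    tokens: int,
    model: str = "gemma-2-2b",
) -> str:
    """Build a path shaped like the README's cross-architecture example.

    Assumption: the layout is inferred from one example path only.
    """
    return (
        f"cross-arch-panel/{model}/{arch}/k-{k}/seed-{seed}"
        f"/snapshots/step-{step}-tokens-{tokens}"
    )


# Matches a trailing "step-<int>-tokens-<int>" directory name.
_SNAPSHOT_RE = re.compile(r"step-(\d+)-tokens-(\d+)$")


def parse_snapshot_dir(path: str) -> SnapshotInfo:
    """Extract the step and token counts from a snapshot path."""
    match = _SNAPSHOT_RE.search(path.rstrip("/"))
    if match is None:
        raise ValueError(f"not a snapshot directory: {path!r}")
    return SnapshotInfo(step=int(match.group(1)), tokens=int(match.group(2)))
```

A path built this way can be passed as the second argument to `SAE.from_pretrained`, exactly as in the README's example; `parse_snapshot_dir` is handy for sorting snapshots by training progress once their directory names are known.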