Add model card for AutoSelection SAE

#1
by nielsr HF Staff - opened
Files changed (1)
  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ---
+ license: apache-2.0
+ pipeline_tag: feature-extraction
+ ---
+
+ # AutoSelection Sparse Autoencoder (SAE)
+
+ This repository contains a Sparse Autoencoder (SAE) checkpoint used in the paper "[From Instance Selection to Fixed-Pool Data Recipe Search for Supervised Fine-Tuning](https://huggingface.co/papers/2605.12944)".
+
+ ## Model Description
+
+ AutoSelection is a budgeted solver for fixed-pool data recipe search in Supervised Fine-Tuning (SFT). The AutoSelection framework consumes features from this SAE during cold-start scoring and subset-state construction. These signals help the search controller discover executable data-curation recipes that construct high-quality training subsets.
+
+ - **Repository:** [https://github.com/w253/AutoSelection](https://github.com/w253/AutoSelection)
+ - **Paper:** [From Instance Selection to Fixed-Pool Data Recipe Search for Supervised Fine-Tuning](https://huggingface.co/papers/2605.12944)
+
+ ## Usage
+
+ As described in the project's [GitHub README](https://github.com/w253/AutoSelection), you can download the SAE checkpoint with `huggingface-cli`:
+
+ ```bash
+ # Example for downloading to a local directory
+ huggingface-cli download <REPLACE_WITH_REPO_ID> \
+ --local-dir models/sae/checkpoint
+
+ # Point the SAE_PATH to the specific layer directory
+ export SAE_PATH=models/sae/checkpoint/layers.27
+ ```
+
+ The AutoSelection engine expects the `SAE_PATH` environment variable to point to the directory containing the SAE artifact (e.g., `layers.27`).
+
+ ## Citation
+
+ ```bibtex
+ @misc{wu2026instanceselectionfixedpooldata,
+ title={From Instance Selection to Fixed-Pool Data Recipe Search for Supervised Fine-Tuning},
+ author={Haodong Wu and Jiahao Zhang and Lijie Hu and Yongqi Zhang},
+ year={2026},
+ eprint={2605.12944},
+ archivePrefix={arXiv},
+ primaryClass={cs.LG},
+ url={https://arxiv.org/abs/2605.12944},
+ }
+ ```
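
The Usage section in the diff above relies on `SAE_PATH` pointing at an existing layer directory. A minimal sketch of a pre-flight check, assuming a Bash shell; `check_sae_path` is a hypothetical helper written for illustration and is not part of AutoSelection or of this model card:

```bash
# Hypothetical helper (an assumption, not an AutoSelection API):
# verify that SAE_PATH is set and names an existing directory.
check_sae_path() {
  # Return 1 when the variable is unset or empty.
  if [ -z "${SAE_PATH:-}" ]; then
    echo "SAE_PATH is not set" >&2
    return 1
  fi
  # Return 2 when the path does not resolve to a directory.
  if [ ! -d "$SAE_PATH" ]; then
    echo "SAE_PATH is not a directory: $SAE_PATH" >&2
    return 2
  fi
  echo "Using SAE artifact at: $SAE_PATH"
}
```

Running `check_sae_path || exit 1` at the top of a launch script would stop early with a clear message if the export step was skipped or the layer directory name was mistyped.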