lhallee committed
Commit e3b17ba · verified · 1 Parent(s): 0abcb18

Upload README.md with huggingface_hub

Files changed (1):
  1. README.md +106 -106

README.md

---
library_name: transformers
tags: []
---

# NOTE
The GitHub repository with the implementation and requirements can be found [here](https://github.com/Synthyra/FastPLMs.git).

# DPLM
Synthyra DPLM checkpoints are HuggingFace AutoModel compatible and include FastPLMs embedding helpers.

## Supported models
```python
model_dict = {
    "Synthyra/DPLM-150M": "airkingbd/dplm_150m",
    "Synthyra/DPLM-650M": "airkingbd/dplm_650m",
    "Synthyra/DPLM-3B": "airkingbd/dplm_3b",
}
```

## Use with transformers
```python
import torch
from transformers import AutoModel, AutoModelForMaskedLM

model_path = "Synthyra/DPLM-150M"
model = AutoModel.from_pretrained(model_path, trust_remote_code=True, dtype=torch.float16).eval()
tokenizer = model.tokenizer

batch = tokenizer(["MPRTEIN", "MSEQWENCE"], padding=True, return_tensors="pt")
with torch.no_grad():
    hidden = model(**batch).last_hidden_state  # (batch, length, hidden_size)

mlm = AutoModelForMaskedLM.from_pretrained(model_path, trust_remote_code=True, dtype=torch.float16).eval()
with torch.no_grad():
    logits = mlm(**batch).logits  # (batch, length, vocab_size)
```

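`last_hidden_state` is per token; to get one vector per sequence you can mean-pool over non-padding positions. A minimal sketch that continues from the block above (the pooling recipe is illustrative, not part of the DPLM API):

```python
# Mean-pool token embeddings over real (non-padding) positions.
# Illustrative pooling recipe, not a DPLM/FastPLMs API; reuses `batch` and `hidden` from above.
mask = batch["attention_mask"].unsqueeze(-1).to(hidden.dtype)  # (batch, length, 1)
summed = (hidden * mask).sum(dim=1)                            # (batch, hidden_size)
counts = mask.sum(dim=1).clamp(min=1)                          # (batch, 1)
sequence_embeddings = summed / counts                          # one vector per sequence
```
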
## Attention backends

`sdpa` (PyTorch Scaled Dot Product Attention) is the default.

| Backend | Key | Notes |
| :--- | :--- | :--- |
| PyTorch SDPA | `"sdpa"` | Default. Exact numerics, stable on all hardware. |
| Flash Attention | `"kernels_flash"` | Fastest on Ampere/Hopper GPUs. Requires `pip install kernels` (pre-built; no hours-long compilation). Outputs are not bitwise identical to SDPA due to online softmax reordering; differences are often small but not guaranteed to be inconsequential, so use `"sdpa"` if exact numerics matter. |
| Flex Attention | `"flex"` | Skips padding tokens via block mask, so it is faster on variable-length batches. Near-exact numerics. First use compiles a Triton kernel (30–120 s). Best combined with `torch.compile`. |
| Auto | `"auto"` | Picks the best available: `kernels_flash` → `flex` → `sdpa`. |

Set via config before loading, or change on the model after loading (DPLM propagates the change to all attention layers immediately):

```python
from transformers import AutoConfig, AutoModel

# Option 1: set before loading
config = AutoConfig.from_pretrained("Synthyra/DPLM-150M", trust_remote_code=True)
config.attn_backend = "flex"
model = AutoModel.from_pretrained("Synthyra/DPLM-150M", config=config, trust_remote_code=True)

# Option 2: set after loading
model = AutoModel.from_pretrained("Synthyra/DPLM-150M", trust_remote_code=True)
model.attn_backend = "flex"  # propagates to all attention layers in-place
```

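A quick way to sanity-check a backend switch is to compare its outputs against `sdpa` on the same batch. The sketch below only uses calls already shown in this README; the tolerance is illustrative, and the alternative backend must actually be available on your hardware (e.g. a GPU for `flex` or `kernels_flash`):

```python
import torch
from transformers import AutoModel

model = AutoModel.from_pretrained("Synthyra/DPLM-150M", trust_remote_code=True).eval()
tokenizer = model.tokenizer
batch = tokenizer(["MPRTEIN", "MSEQWENCE"], padding=True, return_tensors="pt")

# Move to GPU if available; flex/flash backends generally expect CUDA.
device = "cuda" if torch.cuda.is_available() else "cpu"
model = model.to(device)
batch = {k: v.to(device) for k, v in batch.items()}

with torch.no_grad():
    model.attn_backend = "sdpa"
    reference = model(**batch).last_hidden_state
    model.attn_backend = "flex"  # or "kernels_flash" if installed
    candidate = model(**batch).last_hidden_state

# Differences should be small but are not guaranteed to be zero (see the table above).
max_diff = (reference - candidate).abs().max().item()
print(f"close: {torch.allclose(reference, candidate, atol=1e-3)}, max abs diff: {max_diff:.2e}")
```
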
## Embed datasets
All DPLM models inherit `EmbeddingMixin`, so you can call `model.embed_dataset(...)` directly.

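A minimal call might look like the sketch below. The keyword arguments shown (`sequences`, `batch_size`) are assumptions based on other FastPLMs model cards; check the FastPLMs repository linked above for the exact signature and additional options.

```python
# Hedged sketch: argument names are assumptions; see the FastPLMs repo for the exact signature.
sequences = ["MPRTEIN", "MSEQWENCE"]  # your protein sequences
embeddings = model.embed_dataset(
    sequences=sequences,  # assumed keyword for the list of sequences
    batch_size=2,         # assumed keyword for batching
)
```
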
## Citations

```bibtex
@inproceedings{wang2024dplm,
    title={Diffusion Language Models Are Versatile Protein Learners},
    author={Wang, Xinyou and Zheng, Zaixiang and Ye, Fei and Xue, Dongyu and Huang, Shujian and Gu, Quanquan},
    booktitle={Proceedings of the 41st International Conference on Machine Learning},
    year={2024}
}
```

```bibtex
@misc{FastPLMs,
    author={Hallee, Logan and Bichara, David and Gleghorn, Jason P.},
    title={FastPLMs: Fast, efficient, protein language model inference from Huggingface AutoModel.},
    year={2024},
    url={https://huggingface.co/Synthyra/ESMplusplus_small},
    doi={10.57967/hf/3726},
    publisher={Hugging Face}
}
```

```bibtex
@article{dong2024flexattention,
    title={Flex Attention: A Programming Model for Generating Optimized Attention Kernels},
    author={Dong, Juechu and Feng, Boyuan and Guessous, Driss and Liang, Yanbo and He, Horace},
    journal={arXiv preprint arXiv:2412.05496},
    year={2024}
}
```

```bibtex
@inproceedings{paszke2019pytorch,
    title={PyTorch: An Imperative Style, High-Performance Deep Learning Library},
    author={Paszke, Adam and Gross, Sam and Massa, Francisco and Lerer, Adam and Bradbury, James and Chanan, Gregory and Killeen, Trevor and Lin, Zeming and Gimelshein, Natalia and Antiga, Luca and Desmaison, Alban and K{\"o}pf, Andreas and Yang, Edward and DeVito, Zach and Raison, Martin and Tejani, Alykhan and Chilamkurthy, Sasank and Steiner, Benoit and Fang, Lu and Bai, Junjie and Chintala, Soumith},
    booktitle={Advances in Neural Information Processing Systems 32},
    year={2019}
}
```