Add model images and update README
- .gitattributes +2 -0
- README.md +5 -1
- assets/HELM-BERT.png +3 -0
- assets/tsne_ppi_splits.png +3 -0
.gitattributes
CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+assets/HELM-BERT.png filter=lfs diff=lfs merge=lfs -text
+assets/tsne_ppi_splits.png filter=lfs diff=lfs merge=lfs -text
README.md
CHANGED
@@ -16,7 +16,7 @@ widget:
 
 # HELM-BERT
 
-A language model
+A peptide language model using **HELM (Hierarchical Editing Language for Macromolecules)** notation, compatible with Hugging Face Transformers.
 
 [](https://github.com/clinfo/HELM-BERT)
 
@@ -29,6 +29,8 @@ HELM-BERT is built upon the DeBERTa architecture, designed for peptide sequences
 - **Span Masking**: Contiguous token masking with geometric distribution
 - **nGiE**: n-gram Induced Encoding layer (1D convolution, kernel size 3)
 
+<p align="center"><img src="assets/HELM-BERT.png" width="600"></p>
+
 ## Model Specifications
 
 | Parameter | Value |
@@ -82,6 +84,8 @@ Train/test 8:2, val 10% from train, 1:4 positive:negative ratio.
 - **Random**: random split
 - **aCSM**: clustering-based split on aCSM-ALL complex signatures with protein overlap pruning
 
+<p align="center"><img src="assets/tsne_ppi_splits.png" width="800"></p>
+
 ## Citation
 
 ```bibtex
assets/HELM-BERT.png
ADDED (stored with Git LFS)
assets/tsne_ppi_splits.png
ADDED (stored with Git LFS)