Flansma committed on
Commit b28dc89 · verified · 1 Parent(s): 32e7f88

Add model images and update README

.gitattributes CHANGED
@@ -33,3 +33,5 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
  *.zip filter=lfs diff=lfs merge=lfs -text
  *.zst filter=lfs diff=lfs merge=lfs -text
  *tfevents* filter=lfs diff=lfs merge=lfs -text
+ assets/HELM-BERT.png filter=lfs diff=lfs merge=lfs -text
+ assets/tsne_ppi_splits.png filter=lfs diff=lfs merge=lfs -text
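
The two added entries route the new images through Git LFS. As a rough illustration of how such entries select files, here is a minimal Python sketch. It assumes a simplified matching rule (patterns without a slash are tested against the file name, patterns with a slash against the whole path), which approximates Git's real attribute matching but does not reproduce it, for example for `**`. The helper names `lfs_patterns` and `is_lfs_tracked` are hypothetical and not part of Git or this repository.

```python
from fnmatch import fnmatchcase
from posixpath import basename

# The .gitattributes entries visible in this hunk (context plus additions).
GITATTRIBUTES = """\
*.zip filter=lfs diff=lfs merge=lfs -text
*.zst filter=lfs diff=lfs merge=lfs -text
*tfevents* filter=lfs diff=lfs merge=lfs -text
assets/HELM-BERT.png filter=lfs diff=lfs merge=lfs -text
assets/tsne_ppi_splits.png filter=lfs diff=lfs merge=lfs -text
"""


def lfs_patterns(text):
    """Return the patterns whose attribute list includes filter=lfs."""
    patterns = []
    for line in text.splitlines():
        parts = line.split()
        # Skip blank lines and comments; keep entries that enable the LFS filter.
        if parts and not parts[0].startswith("#") and "filter=lfs" in parts[1:]:
            patterns.append(parts[0])
    return patterns


def is_lfs_tracked(path, patterns):
    """Simplified check (an assumption, not Git's exact algorithm)."""
    for pattern in patterns:
        # Patterns without a slash are compared with the file name only.
        target = path if "/" in pattern else basename(path)
        if fnmatchcase(target, pattern):
            return True
    return False


PATTERNS = lfs_patterns(GITATTRIBUTES)
print(is_lfs_tracked("assets/HELM-BERT.png", PATTERNS))
print(is_lfs_tracked("README.md", PATTERNS))
```

Under this simplified rule, both new PNG paths match their literal entries, while `README.md` matches none and stays a regular Git blob.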
README.md CHANGED
@@ -16,7 +16,7 @@ widget:
 
  # HELM-BERT
 
- A language model for peptide representation learning using **HELM (Hierarchical Editing Language for Macromolecules)** notation.
+ A peptide language model using **HELM (Hierarchical Editing Language for Macromolecules)** notation, compatible with Hugging Face Transformers.
 
  [![GitHub](https://img.shields.io/badge/GitHub-clinfo%2FHELM--BERT-black?logo=github)](https://github.com/clinfo/HELM-BERT)
 
@@ -29,6 +29,8 @@ HELM-BERT is built upon the DeBERTa architecture, designed for peptide sequences
  - **Span Masking**: Contiguous token masking with geometric distribution
  - **nGiE**: n-gram Induced Encoding layer (1D convolution, kernel size 3)
 
+ <p align="center"><img src="assets/HELM-BERT.png" width="600"></p>
+
  ## Model Specifications
 
  | Parameter | Value |
@@ -82,6 +84,8 @@ Train/test 8:2, val 10% from train, 1:4 positive:negative ratio.
  - **Random**: random split
  - **aCSM**: clustering-based split on aCSM-ALL complex signatures with protein overlap pruning
 
+ <p align="center"><img src="assets/tsne_ppi_splits.png" width="800"></p>
+
  ## Citation
 
  ```bibtex
assets/HELM-BERT.png ADDED

Git LFS Details

  • SHA256: 8c01d0b8a4707469dd67a3cf8b598ef5d5bd8e88b2849eda04d3dab88792064a
  • Pointer size: 132 Bytes
  • Size of remote file: 1.5 MB
assets/tsne_ppi_splits.png ADDED

Git LFS Details

  • SHA256: 62b6ea63f1a6713e478bff472711423513e2633d72de3b716663aff3c6382551
  • Pointer size: 132 Bytes
  • Size of remote file: 2.79 MB
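
The "Pointer size: 132 Bytes" figure can be sanity-checked against the Git LFS pointer format, which stores a version line, an `oid sha256:` line, and a `size` line in place of the binary. The sketch below is illustrative only: the helper names are hypothetical, and `EXAMPLE_SIZE` is an assumed byte count, since the page reports only the rounded "1.5 MB" and not the exact size.

```python
import hashlib

# Version URL used by Git LFS v1 pointer files.
LFS_SPEC = "https://git-lfs.github.com/spec/v1"


def lfs_pointer(oid_hex, size_bytes):
    """Build the text of a Git LFS pointer file for a given object."""
    return f"version {LFS_SPEC}\noid sha256:{oid_hex}\nsize {size_bytes}\n"


def sha256_of_file(path, chunk_size=1 << 20):
    """Hash a downloaded file in chunks so it can be compared with the listed SHA256."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


# SHA256 listed above for assets/HELM-BERT.png.
OID = "8c01d0b8a4707469dd67a3cf8b598ef5d5bd8e88b2849eda04d3dab88792064a"
# Hypothetical exact size: any 7-digit byte count gives the same pointer length.
EXAMPLE_SIZE = 1_500_000

pointer = lfs_pointer(OID, EXAMPLE_SIZE)
print(len(pointer.encode("utf-8")))
```

With a 64-character digest and a 7-digit size, the pointer comes to 132 bytes, which is consistent with both files listed here. After downloading an image, `sha256_of_file` can be compared against the SHA256 shown in its details.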