Lal Claude Opus 4.6 committed on
Commit 7140455 · 1 Parent(s): 637e71b

Rewrite model card with correct information

The previous README was incorrectly copied from human-atac-catlas-model.

- Fix model name (human-chromhmm-fullstack-model, not human-atac-catlas)
- Fix dataset reference (human-chromhmm-fullstack-data)
- Fix task description (16 chromatin states, not 204 cell types)
- Fix repo_id in loading code
- Add weights_only=False to loading code
- Add test and validation metrics (accuracy, AUROC, avg precision)
- Add per-class test metrics for all 16 states
- Add training hyperparameters
- Add parameter count (71.5M)

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
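
The summary metrics this commit adds can be cross-checked against the per-class test table in the README diff. The sketch below copies the 16 per-class rows and recomputes the mean, standard deviation, minimum, and maximum of each column. It assumes the `Std` column is a population standard deviation (divide by N, not N-1); the README does not say so. `PER_CLASS_TEST`, `summarize`, and `summary` are illustrative names only.

```python
from statistics import fmean, pstdev

# Per-class test metrics copied from the README table:
# state -> (accuracy, AUROC, average precision)
PER_CLASS_TEST = {
    "Acet": (0.2939, 0.7973, 0.2091),
    "BivProm": (0.5431, 0.9373, 0.3575),
    "DNase": (0.8528, 0.9905, 0.7527),
    "EnhA": (0.2950, 0.8145, 0.3368),
    "EnhWk": (0.2683, 0.8144, 0.2947),
    "GapArtf": (0.7988, 0.9517, 0.7029),
    "HET": (0.2455, 0.8236, 0.4982),
    "PromF": (0.5940, 0.9557, 0.6369),
    "Quies": (0.3662, 0.8512, 0.3610),
    "ReprPC": (0.2874, 0.7652, 0.2522),
    "TSS": (0.8302, 0.9952, 0.8015),
    "Tx": (0.2590, 0.8072, 0.3197),
    "TxEnh": (0.2694, 0.8252, 0.2770),
    "TxEx": (0.5336, 0.8821, 0.3563),
    "TxWk": (0.2510, 0.7781, 0.2880),
    "znf": (0.3079, 0.7851, 0.1362),
}


def summarize(values):
    """Return (mean, std, min, max); mean and std are rounded to 4 decimals.

    Assumption: std is the population standard deviation (ddof=0).
    """
    return (round(fmean(values), 4), round(pstdev(values), 4), min(values), max(values))


# Transpose the rows into one list per metric column.
accuracy, auroc, avg_prec = (list(col) for col in zip(*PER_CLASS_TEST.values()))

summary = {
    "Accuracy": summarize(accuracy),
    "AUROC": summarize(auroc),
    "Average Precision": summarize(avg_prec),
}

if __name__ == "__main__":
    for name, (mean, std, low, high) in summary.items():
        print(f"{name}: mean={mean:.4f} std={std:.4f} min={low:.4f} max={high:.4f}")
```

The recomputed means and extremes should agree with the Test Set summary rows. If the standard deviations differ, the likeliest cause is that the original used a different estimator.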

Files changed (1)
  1. README.md +65 -9
README.md CHANGED
@@ -1,5 +1,4 @@
  ---
- # 1. Metadata Block
  license: mit
  library_name: pytorch-lightning
  pipeline_tag: tabular-classification
@@ -7,19 +6,76 @@ tags:
  - biology
  - genomics
  datasets:
- - Genentech/human-atac-catlas-data
+ - Genentech/human-chromhmm-fullstack-data
  base_model:
  - Genentech/enformer-model
  ---

- # human-atac-catlas-model
+ # human-chromhmm-fullstack-model

  ## Model Description
- This model is a multi-task binary classifier trained to predict chromatin accessibility across 204 cell types. It was trained by fine-tuning the Enformer model using the `grelu` library on top of the CATlas human enhancer dataset.
+ This model is a multi-class classifier trained to predict chromatin state annotations for genomic DNA sequences. It classifies sequences into 16 chromatin states based on the ChromHMM fullstack annotation. It was trained by fine-tuning the Enformer model using the `grelu` library.

- - **Architecture:** Fine-tuned Enformer
+ - **Architecture:** Fine-tuned Enformer (EnformerPretrainedModel)
  - **Input:** Genomic sequences (hg38)
- - **Output:** Binary accessibility predictions for 204 cell type tasks.
+ - **Output:** Probability distribution over 16 chromatin states
+ - **Parameters:** 71.5M total (all trainable)
+
+ ### Chromatin States
+ Acet, BivProm, DNase, EnhA, EnhWk, GapArtf, HET, PromF, Quies, ReprPC, TSS, Tx, TxEnh, TxEx, TxWk, znf
+
+ ## Performance
+
+ Metrics are computed per chromatin state and averaged across all 16 states.
+
+ ### Test Set
+ | Metric | Mean | Std | Min | Max |
+ |--------|------|-----|-----|-----|
+ | Accuracy | 0.4373 | 0.2162 | 0.2455 | 0.8528 |
+ | AUROC | 0.8609 | 0.0767 | 0.7652 | 0.9952 |
+ | Average Precision | 0.4113 | 0.1974 | 0.1362 | 0.8015 |
+
+ ### Validation Set
+ | Metric | Mean | Std | Min | Max |
+ |--------|------|-----|-----|-----|
+ | Accuracy | 0.4487 | 0.2098 | 0.2164 | 0.8696 |
+ | AUROC | 0.8654 | 0.0763 | 0.7594 | 0.9950 |
+ | Average Precision | 0.4155 | 0.1848 | 0.1241 | 0.7812 |
+
+ ### Per-class Test Metrics
+ | State | Accuracy | AUROC | AvgPrec |
+ |-------|----------|-------|---------|
+ | Acet | 0.2939 | 0.7973 | 0.2091 |
+ | BivProm | 0.5431 | 0.9373 | 0.3575 |
+ | DNase | 0.8528 | 0.9905 | 0.7527 |
+ | EnhA | 0.2950 | 0.8145 | 0.3368 |
+ | EnhWk | 0.2683 | 0.8144 | 0.2947 |
+ | GapArtf | 0.7988 | 0.9517 | 0.7029 |
+ | HET | 0.2455 | 0.8236 | 0.4982 |
+ | PromF | 0.5940 | 0.9557 | 0.6369 |
+ | Quies | 0.3662 | 0.8512 | 0.3610 |
+ | ReprPC | 0.2874 | 0.7652 | 0.2522 |
+ | TSS | 0.8302 | 0.9952 | 0.8015 |
+ | Tx | 0.2590 | 0.8072 | 0.3197 |
+ | TxEnh | 0.2694 | 0.8252 | 0.2770 |
+ | TxEx | 0.5336 | 0.8821 | 0.3563 |
+ | TxWk | 0.2510 | 0.7781 | 0.2880 |
+ | znf | 0.3079 | 0.7851 | 0.1362 |
+
+ ## Training Details
+
+ | Parameter | Value |
+ |-----------|-------|
+ | Task | Multiclass classification |
+ | Loss | Binary Cross-Entropy (with class weights) |
+ | Optimizer | Adam |
+ | Learning rate | 0.0001 |
+ | Batch size | 512 |
+ | Max epochs | 10 |
+ | Devices | 4 |
+ | n_transformers | 1 |
+ | crop_len | 0 |
+ | grelu version | 1.0.4.post1.dev39 |

  ## Repository Content
  1. `model.ckpt`: The trained model weights and hyperparameters (PyTorch Lightning checkpoint).
@@ -34,10 +90,10 @@ from grelu.lightning import LightningModel
  from huggingface_hub import hf_hub_download

  ckpt_path = hf_hub_download(
- repo_id="Genentech/human-atac-catlas-model",
+ repo_id="Genentech/human-chromhmm-fullstack-model",
  filename="model.ckpt"
  )

- model = LightningModel.load_from_checkpoint(ckpt_path)
+ model = LightningModel.load_from_checkpoint(ckpt_path, weights_only=False)
  model.eval()
- ```
+ ```
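
The loading snippet stops at `model.eval()`. Since the updated card describes the output as a distribution over 16 chromatin states, a natural next step is mapping a score vector back to state names. The sketch below is a minimal illustration and rests on one unconfirmed assumption: that output index `i` corresponds to the `i`-th state in the order the card lists them. It does not call the model; `CHROMATIN_STATES`, `top_states`, and the example scores are hypothetical.

```python
# State labels in the order they appear in the model card.
# Assumption (not confirmed by the card): output index i maps to CHROMATIN_STATES[i].
CHROMATIN_STATES = [
    "Acet", "BivProm", "DNase", "EnhA", "EnhWk", "GapArtf", "HET", "PromF",
    "Quies", "ReprPC", "TSS", "Tx", "TxEnh", "TxEx", "TxWk", "znf",
]


def top_states(scores, k=3):
    """Return the k highest-scoring (state, score) pairs for one sequence."""
    if len(scores) != len(CHROMATIN_STATES):
        raise ValueError(
            f"expected {len(CHROMATIN_STATES)} scores, got {len(scores)}"
        )
    ranked = sorted(
        zip(CHROMATIN_STATES, scores), key=lambda pair: pair[1], reverse=True
    )
    return ranked[:k]


# Made-up scores for a single sequence, for illustration only (not model output).
example_scores = [0.01] * len(CHROMATIN_STATES)
example_scores[CHROMATIN_STATES.index("TSS")] = 0.80
example_scores[CHROMATIN_STATES.index("PromF")] = 0.05

if __name__ == "__main__":
    print(top_states(example_scores, k=2))
```

Before relying on this mapping, confirm the label order against the task names stored with the checkpoint or the dataset repository.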