QizhiPei committed on
Commit 5f87126 · verified · 1 Parent(s): 29e8dc3

Update README.md

Files changed (1)
  1. README.md +4 -4
README.md CHANGED
@@ -11,7 +11,7 @@ tags:
   - foundation-model
   - pretrained
 pipeline_tag: text-generation
-base_model: Qwen/Qwen3-1.7B
+base_model: Qwen/Qwen3-1.7B-Base
 library_name: transformers
 ---
 
@@ -19,7 +19,7 @@ library_name: transformers
 
 **BioMatrix** is a multimodal biological foundation model that natively integrates **1D sequences**, **3D structures**, and **natural language** for both **molecules** and **proteins** within a single decoder-only architecture.
 
-This is the **1.7B-parameter Base model**, obtained via **multimodal continual pretraining** of Qwen3-1.7B on 304.4 billion tokens spanning text, molecular and protein 1D/3D data, and cross-modal corpora. This base checkpoint is intended for further fine-tuning on downstream tasks. For an instruction-tuned model ready for inference, see [BioMatrix-1.7B-SFT](https://huggingface.co/QizhiPei/BioMatrix-1.7B-SFT). For a larger model, see [BioMatrix-4B-Base](https://huggingface.co/QizhiPei/BioMatrix-4B-Base).
+This is the **1.7B-parameter Base model**, obtained via **multimodal continual pretraining** of Qwen3-1.7B-Base on 304.4 billion tokens spanning text, molecular and protein 1D/3D data, and cross-modal corpora. This base checkpoint is intended for further fine-tuning on downstream tasks. For an instruction-tuned model ready for inference, see [BioMatrix-1.7B-SFT](https://huggingface.co/QizhiPei/BioMatrix-1.7B-SFT). For a larger model, see [BioMatrix-4B-Base](https://huggingface.co/QizhiPei/BioMatrix-4B-Base).
 
 - 📄 **Paper**: [BioMatrix: Towards a Comprehensive Biological Foundation Model Spanning the Modality Matrix of Sequences, Structures, and Language](https://arxiv.org/abs/xxxx.xxxxx)
 - 💻 **Code**: [https://github.com/QizhiPei/biomatrix](https://github.com/QizhiPei/biomatrix)
@@ -48,7 +48,7 @@ All modalities are consumed and produced uniformly under a **single next-token p
 
 ## Model Details
 
-- **Base Architecture**: Qwen3-1.7B
+- **Base Architecture**: Qwen3-1.7B-Base
 - **Parameters**: 1.7B
 - **Training Stage**: Multimodal Continual Pretraining only (not instruction-tuned)
 - **Training Tokens**: 304.4B
@@ -158,4 +158,4 @@ If you find BioMatrix useful, please cite:
 
 ## License
 
-This model is released under the Apache 2.0 license. The base model (Qwen3-1.7B) is subject to its own license terms.
+This model is released under the Apache 2.0 license. The base model (Qwen3-1.7B-Base) is subject to its own license terms.
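
The README this commit edits declares `library_name: transformers` and `pipeline_tag: text-generation`, so the base checkpoint should load as an ordinary causal LM. A minimal sketch, assuming the repo id `QizhiPei/BioMatrix-1.7B-Base` (inferred from the cross-links above, not stated in this diff) and standard Hugging Face loading behavior:

```python
# Minimal loading sketch; the repo id below is an assumption inferred from the
# README's links to BioMatrix-1.7B-SFT and BioMatrix-4B-Base, not confirmed here.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "QizhiPei/BioMatrix-1.7B-Base"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

# Base (not instruction-tuned) checkpoint: expect plain next-token continuation,
# not chat-style responses.
inputs = tokenizer("Protein sequences are composed of", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=32)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```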