manglu3935 committed on
Commit 7f41fcf · 1 Parent(s): e7dc50a
Files changed (1)
  1. README.md +63 -1
README.md CHANGED
@@ -5,4 +5,66 @@ language:
  base_model:
  - Qwen/Qwen3-8B
  pipeline_tag: text-generation
- ---
+ ---
+
+
+ # 🧬 Thoth
+
+ **Thoth** is a model designed for **efficient and scalable biological protocol generation** while retaining strong scientific reasoning ability.
+
+ - 📄 **Paper**: *Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism* (ICLR 2026)
+ - 🔗 **GitHub**: https://github.com/manglu097/Thoth
+ - 🤗 **Dataset**: https://huggingface.co/datasets/manglu3935/SciRecipe
+
+ ---
+
+ ## 🔍 Model Overview
+
+ - **Base model**: Qwen3-8B
+ - **Parameters**: 8B
+ - **GPU memory**: ~16 GB
+ - **Primary task**: Biological experimental protocol generation
+
+ Thoth is trained with the **Sketch-and-Fill paradigm** and the **SCORE reward mechanism**, offering a strong performance–efficiency trade-off.
+
+ ---
+
+ ## 🧠 Output Format
+
+ ```
+ <think> reasoning and planning </think>
+ <key> structured machine-readable steps </key>
+ <orc> natural language protocol </orc>
+ <note> optional safety notes </note>
+ ```
+
+ ---
+
+ ## 🚀 Usage
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ tokenizer = AutoTokenizer.from_pretrained("manglu3935/Thoth")
+ model = AutoModelForCausalLM.from_pretrained("manglu3935/Thoth")
+ ```
+
+ ---
+
+ ## ⚠️ Intended Use
+
+ Thoth is intended for fast scientific reasoning experiments and scalable research deployment.
+ Generated protocols must be reviewed by qualified experts prior to laboratory execution.
+
+ ---
+
+ ## 📖 Citation
+
+ ```bibtex
+ @article{sun2025unleashing,
+   title={Unleashing Scientific Reasoning for Bio-experimental Protocol Generation via Structured Component-based Reward Mechanism},
+   author={Sun, Haoran and Jiang, Yankai and Tang, Zhenyu and others},
+   journal={arXiv preprint arXiv:2510.15600},
+   year={2025}
+ }
+ ```
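
The Output Format section in the diff above lists four tags (`<think>`, `<key>`, `<orc>`, `<note>`) but does not show how to separate them once text has been generated. The following is a minimal sketch of one way to do that, assuming each tag appears at most once and every opened tag is closed. The names `parse_thoth_output` and `TAGS` are illustrative and are not part of the Thoth repository, and the sample string simply reuses the placeholder text from the README.

```python
import re

# Tag names taken from the README's Output Format section.
TAGS = ("think", "key", "orc", "note")


def parse_thoth_output(text: str) -> dict:
    """Split a tagged response into sections; a missing tag maps to None.

    Hypothetical helper: assumes each tag occurs at most once and is closed.
    """
    sections = {}
    for tag in TAGS:
        # DOTALL lets a section span multiple lines; .*? keeps the match minimal.
        match = re.search(rf"<{tag}>(.*?)</{tag}>", text, flags=re.DOTALL)
        sections[tag] = match.group(1).strip() if match else None
    return sections


# Sample built from the README placeholders; <note> is omitted because it is optional.
sample = (
    "<think> reasoning and planning </think>\n"
    "<key> structured machine-readable steps </key>\n"
    "<orc> natural language protocol </orc>"
)
parsed = parse_thoth_output(sample)
print(parsed["orc"])
```

In practice the `text` argument would be the decoded model output. Returning `None` for an absent `<note>` section lets callers distinguish "no safety notes emitted" from an empty note.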