adanish91 committed
Commit 51d8f1e · verified · 1 Parent(s): 086096f

Upload README.md

  1. README.md +44 -0
README.md ADDED
@@ -0,0 +1,44 @@
+ ---
+ base_model: bert-base-uncased
+ tags:
+ - safety
+ - occupational-safety
+ - bert
+ - domain-adaptation
+ ---
+
+ # SafetyBERT
+
+ SafetyBERT is a BERT model fine-tuned on occupational safety data from MSHA, OSHA, NTSB, and other safety organizations, as well as a large corpus of occupational safety-related abstracts.
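+
+ ## Quick Start
+
+ ```python
+ from transformers import AutoTokenizer, AutoModelForMaskedLM
+
+ # Use the uncased tokenizer so the vocabulary matches the bert-base-uncased base model
+ tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
+ model = AutoModelForMaskedLM.from_pretrained("adanish91/safetybert")
+
+ # Example usage: score candidate tokens for the [MASK] position
+ text = "The worker failed to wear proper [MASK] equipment."
+ inputs = tokenizer(text, return_tensors="pt")
+ outputs = model(**inputs)
+
+ # Decode the highest-scoring token at the [MASK] position
+ mask_index = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
+ predicted_id = outputs.logits[0, mask_index].argmax(dim=-1)
+ print(tokenizer.decode(predicted_id))
+ ```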
+
+ ## Model Details
+
+ - **Base Model**: bert-base-uncased
+ - **Parameters**: 110M
+ - **Training Data**: 2.4M safety documents from multiple sources
+ - **Specialization**: Mining, construction, and transportation safety
+
+ ## Performance
+
+ Compared with BERT-base, SafetyBERT achieves:
+ - 76.9% improvement in pseudo-perplexity
+ - Superior performance on occupational safety-related downstream tasks such as safety classification
+
+ ## Applications
+
+ - Safety document analysis
+ - Incident report classification
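
The Performance section of the README reports a pseudo-perplexity improvement but does not say how it was computed. The sketch below is an illustration under assumptions, not the author's evaluation code: it uses one common definition of pseudo-perplexity (mask each token in turn, take the log-probability the model assigns to the true token, and exponentiate the negative mean) and treats "improvement" as a relative reduction. The function names `pseudo_perplexity`, `relative_improvement`, and `masked_token_log_probs` are hypothetical helpers introduced here.

```python
import math
from typing import List, Sequence


def pseudo_perplexity(token_log_probs: Sequence[float]) -> float:
    """exp(-mean log-probability) over tokens that were each masked in turn."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


def relative_improvement(baseline: float, adapted: float) -> float:
    """Percentage reduction of `adapted` relative to `baseline` (lower is better)."""
    return 100.0 * (baseline - adapted) / baseline


def masked_token_log_probs(text: str, model_name: str) -> List[float]:
    """Mask each token in turn and record the log-probability of the true token.

    Imports are local so the pure-math helpers above work without torch installed.
    Assumes a BERT-style tokenizer that adds [CLS] first and [SEP] last.
    """
    import torch
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForMaskedLM.from_pretrained(model_name)
    model.eval()

    input_ids = tokenizer(text, return_tensors="pt")["input_ids"][0]
    log_probs = []
    # Skip position 0 ([CLS]) and the final position ([SEP]).
    for pos in range(1, input_ids.size(0) - 1):
        masked = input_ids.clone()
        masked[pos] = tokenizer.mask_token_id
        with torch.no_grad():
            logits = model(input_ids=masked.unsqueeze(0)).logits
        token_log_prob = torch.log_softmax(logits[0, pos], dim=-1)[input_ids[pos]]
        log_probs.append(token_log_prob.item())
    return log_probs
```

Under these assumptions, one could compare `pseudo_perplexity(masked_token_log_probs(text, "bert-base-uncased"))` against the same call with `"adanish91/safetybert"` on held-out safety text and pass the two values to `relative_improvement`. Masking one token per forward pass is slow for long inputs, so batching the masked copies would be a natural refinement.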