Unified Structure Generation for Universal Information Extraction
Paper • 2203.12277 • Published
UIE (Universal Information Extraction) is a state-of-the-art method in PaddleNLP; you can see the details here.
The paper is here.
I saved UIE as a complete model (ERNIE 3.0 backbone + start/end layers), so you need to load the model as follows:
git lfs install
git clone https://huggingface.co/xyj125/uie-base-chinese
If you don't have git-lfs, you can also download the files manually from [Files and versions] at the top of this card.
import os
import torch
from transformers import AutoTokenizer
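uie_model = 'uie-base-zh'  # local directory holding the downloaded files; adjust to your own path
# torch.load restores the whole pickled model object, not a state_dict.
# On PyTorch 2.6+ this requires weights_only=False (use only with files you trust).
model = torch.load(os.path.join(uie_model, 'pytorch_model.bin')) # load UIE model
tokenizer = AutoTokenizer.from_pretrained('uie-base') # load tokenizer (adjust the path if your files live elsewhere)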
...
start_prob, end_prob = model(input_ids=batch['input_ids'],
                             token_type_ids=batch['token_type_ids'],
                             attention_mask=batch['attention_mask'])
print(f'start_prob ({type(start_prob)}): {start_prob.size()}') # start_prob
print(f'end_prob ({type(end_prob)}): {end_prob.size()}') # end_prob
...
Here is the output of the model (with batch_size=16, max_seq_len=256):
start_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
end_prob (<class 'torch.Tensor'>): torch.Size([16, 256])
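Each row of start_prob and end_prob holds one probability per token position, so a span can be read off by pairing positions that score highly as a start with positions that score highly as an end. The snippet below is a minimal sketch of one simple pairing heuristic, not necessarily the decoding that PaddleNLP uses. The function name `decode_spans`, the 0.5 threshold, and the toy probabilities are assumptions made for illustration. It works on plain Python lists, which a single row of the tensors above can be converted to with `.tolist()`.

```python
from typing import List, Tuple


def decode_spans(start_prob: List[float],
                 end_prob: List[float],
                 threshold: float = 0.5) -> List[Tuple[int, int]]:
    """Pair each confident start index with the nearest confident end index.

    Hypothetical helper: the threshold and the pairing rule are assumptions.
    Returned indices are token positions, inclusive on both sides.
    """
    # Positions whose probability exceeds the threshold, in ascending order.
    starts = [i for i, p in enumerate(start_prob) if p > threshold]
    ends = [i for i, p in enumerate(end_prob) if p > threshold]

    spans = []
    for s in starts:
        # First end position at or after this start, if any.
        nearest_end = next((e for e in ends if e >= s), None)
        if nearest_end is not None:
            spans.append((s, nearest_end))
    return spans


# Toy probabilities for one sequence of 6 tokens (made-up values).
toy_start = [0.01, 0.92, 0.03, 0.02, 0.88, 0.01]
toy_end = [0.02, 0.01, 0.95, 0.01, 0.90, 0.03]
print(decode_spans(toy_start, toy_end))
```

With the real model, the same function could be called once per row of the batch; the resulting token indices would still need to be mapped back to character offsets through the tokenizer.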