PHIタグ推論モデル

Llama-3.1-Swallow-8B-Instruct-v0.5 をベースに、日本語医療テキストのPHIタグ付与タスクにLoRAでファインチューニングしたマージ済みモデルです。

ベースモデル

tokyotech-llm/Llama-3.1-Swallow-8B-Instruct-v0.5

タスク

入力テキスト中の個人情報（PHI: Protected Health Information）に対してタグを付与します。

タグ	対象
`<phi_age>`	年齢
`<phi_id>`	識別番号
`<phi_tel>`	電話番号
`<phi_job>`	職業
`<phi_location>`	住所・地名
`<phi_person>`	人名
`<phi_hospital>`	医療機関名

使用方法

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, BitsAndBytesConfig

model_id = "sociocom/MedPHINER-Llama-3.1-Swallow-8B-Instruct-v0.5"

tokenizer = AutoTokenizer.from_pretrained(model_id)

quant_cfg = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_compute_dtype=torch.bfloat16,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
)

model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    dtype=torch.bfloat16,
    quantization_config=quant_cfg,
)
model.eval()