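1. Usage

Set the Google Colab runtime to an L4 GPU and run the code below.
When the input form appears, enter the path to the jsonl file to process.
After roughly 20-40 minutes, a jsonl file and an md file recording the results are written to the working directory (/content by default on Colab).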
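
The script reads one JSON object per line and uses only the `task_id` and `input` fields. The following is a minimal sketch of that input format, not part of the original notebook: the file name `sample_tasks.jsonl`, the two sample records, and the helper names `write_jsonl` and `load_tasks` are all hypothetical and chosen for illustration.

```python
import json
import os
import tempfile

# Hypothetical sample records; the inference script reads only "task_id" and "input".
sample_tasks = [
    {"task_id": 0, "input": "日本の首都はどこですか。"},
    {"task_id": 1, "input": "1から5までの整数を合計してください。"},
]


def write_jsonl(path, records):
    """Write one JSON object per line, keeping non-ASCII text readable."""
    with open(path, "w", encoding="utf-8") as f:
        for record in records:
            f.write(json.dumps(record, ensure_ascii=False) + "\n")


def load_tasks(path):
    """Load tasks the same way the script does: skip blank lines, keep two fields."""
    tasks = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if line:
                obj = json.loads(line)
                tasks.append({"task_id": obj["task_id"], "input": obj["input"]})
    return tasks


# Hypothetical file name, written to a temporary directory for this sketch.
sample_path = os.path.join(tempfile.gettempdir(), "sample_tasks.jsonl")
write_jsonl(sample_path, sample_tasks)
loaded_tasks = load_tasks(sample_path)
print(loaded_tasks)
```

A line that lacks either field would raise a `KeyError` in the script, so it may be worth checking a file this way before starting a long run.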

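Note

Generation results vary between runs, so the score shifts by about 0.1 from trial to trial. The submitted score is the best one observed, so the score is likely to be lower at test time.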

# Install the required libraries
!pip install -U bitsandbytes
!pip install -U transformers
!pip install -U accelerate
!pip install -U datasets
!pip install -U peft
!pip install ipywidgets --upgrade

import torch
from transformers import (
    AutoModelForCausalLM,
    AutoTokenizer,
    BitsAndBytesConfig,
)
from peft import PeftModel
import time
from tqdm import tqdm
import json
import re
from huggingface_hub import login

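# Ask the user for the path to the inference data
data_path = input("Enter the path to the jsonl file to process: ")

# Model to use (Hugging Face repository ID, hard-coded here)
model_name = "katsukiono/llm-jp-3-13b_all_v2-all_all-v3_20241216034359"

# LoRA adapter ID (path of the adapter uploaded to Hugging Face)
adapter_id = model_name  # Set the LoRA adapter ID here (if left empty, no adapter is applied)

# Model settings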
max_seq_length = 2048
dtype = None
load_in_4bit = True

# QLoRA config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

# Load the base model
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=bnb_config,
    device_map="auto",
    trust_remote_code=True
)

tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)

# Apply the LoRA adapter (only if needed)
if adapter_id:
    model = PeftModel.from_pretrained(model, adapter_id)

print("Loading the test dataset...")
test_data = []
with open(data_path, 'r', encoding='utf-8') as f:
    for line in f:
        line = line.strip()
        if line:
            obj = json.loads(line)
            test_data.append({
                "task_id": obj["task_id"],
                "input": obj["input"]
            })

# Pattern used to extract the answer
answer_pattern = re.compile(r'###\s*回答', re.IGNORECASE)

# Candidate repetition_penalty values, tried in this order
repetition_penalties = [1.2, 1.1, 1.3, 1.0, 1.4, 0.8, 0.9, 0.7, 0.6]

# Prompt variants (kept in Japanese because the model expects the "### 指示" / "### 回答" markers)
prompt = f"""
あなたは厳密に指示に従って、回答を生成する厳格で優秀なアシスタントです。
重要：必ず、良く指示を理解して、指示内容を厳守して、回答を生成してください。

### 指示
{{input_text}}
### 回答
"""
prompt_2 = f"""
### 指示
{{input_text}}
### 回答
"""
prompt_3 = f"""
### 指示
以下の指示の要件を良く理解して回答を出力してください。
ポイント:指示された回答の形式を厳守することが重要です。
{{input_text}}
### 回答
"""
prompt_4 = f"""
### 指示
あなたは厳密に指示に従って、回答を生成する厳格で優秀なアシスタントです。
重要：必ず、良く指示を理解して、指示内容を厳守して、回答を生成してください。
{{input_text}}
### 回答
"""
prompt_5 = f"""
以下の指示の要件を良く理解して回答を出力してください。
ポイント:指示された回答の形式を厳守することが重要です。

### 指示
{{input_text}}
### 回答
"""
prompts = [prompt, prompt_2, prompt_3, prompt_4, prompt_5]

results = []
start_time = time.time()

for dt in tqdm(test_data):
    input_text = dt["input"]

    prediction = ''
    max_empty_retries = 500
    empty_retries = 0
    penalty_index = 0
    prompt_index = 0

    while True:
        current_penalty = repetition_penalties[penalty_index]
        current_prompt = prompts[prompt_index].format(input_text=input_text)

        # Encode the input
        inputs = tokenizer([current_prompt], return_tensors="pt").to(model.device)

        # Remove token_type_ids if present
        if 'token_type_ids' in inputs:
            del inputs['token_type_ids']

        # Run inference
        outputs = model.generate(
            **inputs,
            max_new_tokens=1024,
            use_cache=True,
            do_sample=False,
            repetition_penalty=current_penalty,
            pad_token_id=tokenizer.eos_token_id
        )

        prediction_full = tokenizer.decode(
            outputs[0],
            skip_special_tokens=True
        )

        # Strip the prompt portion from the decoded text
        if prediction_full.startswith(current_prompt):
            prediction_full = prediction_full[len(current_prompt):].strip()

        match = answer_pattern.search(prediction_full)
        if match:
            prediction = prediction_full[match.end():].strip()
        else:
            prediction = prediction_full.strip()

        # Treat "<answer>" as an empty output
        if prediction == "<answer>":
            prediction = ""

        print("final prediction")
        print(prediction)

        # Clear GPU memory
        del inputs, outputs
        torch.cuda.empty_cache()

        if prediction == '':
            empty_retries += 1
            if empty_retries >= max_empty_retries:
                print(f"Reached the maximum number of empty-output retries for task ID {dt['task_id']}.")
                break
            else:
                prompt_index += 1
                if prompt_index >= len(prompts):
                    prompt_index = 0
                    penalty_index = (penalty_index + 1) % len(repetition_penalties)
                torch.cuda.empty_cache()
                continue
        else:
            break

    results.append({
        "task_id": dt["task_id"],
        "input": input_text,
        "output": prediction
    })

end_time = time.time()
elapsed_time = end_time - start_time
print(f"Generation time: {elapsed_time:.2f} seconds")

# Save the results as jsonl
output_file_name = f"{model_name.replace('/', '_')}_output.jsonl"
with open(output_file_name, 'w', encoding='utf-8') as f:
    for result in results:
        json.dump(result, f, ensure_ascii=False)
        f.write('\n')

print(f"Results saved to {output_file_name}.")

# Save the results as Markdown
md_file_name = f"{model_name.replace('/', '_')}_results.md"
with open(md_file_name, 'w', encoding='utf-8') as f:
    for result in results:
        f.write(f"## Task ID: {result['task_id']}\n\n")
        f.write(f"### Input:\n{result['input']}\n\n")
        f.write(f"### Output:\n{result['output']}\n\n")
        f.write("---\n\n")

print(f"Markdown file saved to {md_file_name}.")
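
When an output comes back empty, the loop in the script switches prompt first and moves to the next `repetition_penalty` only after all five prompts have been tried. The following is a minimal sketch of that schedule and of the answer extraction, written as pure functions so they can be checked without a GPU. The names `retry_schedule` and `extract_answer` are hypothetical and do not appear in the script; the penalty list and the regular expression are copied from it.

```python
import re
from itertools import islice

# Values copied from the script above.
REPETITION_PENALTIES = [1.2, 1.1, 1.3, 1.0, 1.4, 0.8, 0.9, 0.7, 0.6]
NUM_PROMPTS = 5
ANSWER_PATTERN = re.compile(r'###\s*回答')


def retry_schedule(num_prompts=NUM_PROMPTS, penalties=REPETITION_PENALTIES):
    """Yield (prompt_index, penalty) in the order the retry loop tries them, forever."""
    prompt_index = 0
    penalty_index = 0
    while True:
        yield prompt_index, penalties[penalty_index]
        prompt_index += 1
        if prompt_index >= num_prompts:
            # All prompts tried with this penalty: restart prompts, advance the penalty.
            prompt_index = 0
            penalty_index = (penalty_index + 1) % len(penalties)


def extract_answer(generated_text):
    """Return the text after the first '### 回答' marker, or the whole text if absent."""
    match = ANSWER_PATTERN.search(generated_text)
    answer = generated_text[match.end():] if match else generated_text
    answer = answer.strip()
    # The script treats a literal "<answer>" placeholder as an empty output.
    return "" if answer == "<answer>" else answer


first_attempts = list(islice(retry_schedule(), 7))
print(first_attempts)

# Number of distinct (prompt, penalty) pairs seen within the 500-retry budget.
distinct_combinations = len(set(islice(retry_schedule(), 500)))
print(distinct_combinations)
```

Five prompts and nine penalties give 45 distinct combinations, so a 500-retry budget revisits each combination many times. Because the script uses `do_sample=False`, those repeats can only help to the extent that generation is not fully deterministic.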

Model Details

Model Description

This repository contains a fine-tuned model which was trained using synthetic data generated by a Mistral Model under the Mistral AI Research License. No Mistral Model weights or source code are included in this repository. Please note that this model and its outputs may only be used for research purposes as defined by Section 3.2 of the Mistral AI Research License.

Disclaimer

The synthetic data used for training this model was generated exclusively for non-commercial, research purposes as required by the Mistral AI Research License.
Mistral AI does not endorse or verify the contents, performance, or quality of this model.
Please consult the Mistral AI Research License for any additional restrictions or usage guidelines: https://mistral.ai/licenses/MRL-0.1.md.
