llm-jp-13b-instruct-v2.0-kokoroe
llm-jp-13b-instruct-v2.0-kokoroe is a large language model fine-tuned to follow instructions in Japanese, with additional safety tuning applied to make its responses more appropriate.
This model is based on llm-jp/llm-jp-13b-instruct-full-dolly-ichikara_004_001_single-oasst-oasst2-v2.0.
Model Details
Model Description
- Developed by: Retrieva, Inc.
- Model type: Transformer-based Language Model (LlamaForCausalLM)
- Language(s) (NLP): Primarily Japanese
- License: Apache-2.0
- Finetuned from model: llm-jp/llm-jp-13b-instruct-full-dolly-ichikara_004_001_single-oasst-oasst2-v2.0
Uses
This section describes two ways to use the model:
- Hugging Face Transformers library
- vLLM library
huggingface/transformers Usage
```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

model_id = "retrieva-jp/llm-jp-13b-instruct-v2.0-kokoroe"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto", torch_dtype=torch.bfloat16)

chat = [
    # System prompt: "Below is an instruction that describes a task.
    # Write a response that appropriately fulfills the request."
    {"role": "system", "content": "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"},
    # User prompt: "What is natural language processing?"
    {"role": "user", "content": "自然言語処理とは何か"},
]

# Format the conversation with the model's chat template and move it to the model device.
tokenized_input = tokenizer.apply_chat_template(
    chat, add_generation_prompt=True, tokenize=True, return_tensors="pt"
).to(model.device)

with torch.no_grad():
    output = model.generate(
        tokenized_input,
        max_new_tokens=100,
        do_sample=True,
        top_p=0.95,
        temperature=0.7,
        repetition_penalty=1.05,
    )[0]

print(tokenizer.decode(output))
vLLM Usage
`vllm serve` takes the model name as a positional argument and exposes an OpenAI-compatible API, so requests go to the `/v1/chat/completions` endpoint with the standard `messages` and `max_tokens` parameters (`repetition_penalty` is accepted as a vLLM extension):

```shell
$ vllm serve retrieva-jp/llm-jp-13b-instruct-v2.0-kokoroe --port 8000
$ curl -X POST http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{
      "model": "retrieva-jp/llm-jp-13b-instruct-v2.0-kokoroe",
      "messages": [
        {"role": "system", "content": "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"},
        {"role": "user", "content": "自然言語処理とは何か"}
      ],
      "max_tokens": 100,
      "top_p": 0.95,
      "temperature": 0.7,
      "repetition_penalty": 1.05
    }'
```
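The same request can be issued from Python with only the standard library; a minimal sketch, assuming the server started above is running locally on port 8000 (the helpers `build_chat_payload` and `chat` are ours, not part of vLLM):

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000/v1/chat/completions"
MODEL_ID = "retrieva-jp/llm-jp-13b-instruct-v2.0-kokoroe"

def build_chat_payload(user_message, max_tokens=100):
    """Build an OpenAI-style chat completion request body."""
    return {
        "model": MODEL_ID,
        "messages": [
            # System prompt: "Below is an instruction that describes a task.
            # Write a response that appropriately fulfills the request."
            {"role": "system", "content": "以下は、タスクを説明する指示です。要求を適切に満たす応答を書きなさい。"},
            {"role": "user", "content": user_message},
        ],
        "max_tokens": max_tokens,
        "top_p": 0.95,
        "temperature": 0.7,
    }

def chat(user_message):
    """POST the request to the local vLLM server and return the reply text."""
    req = urllib.request.Request(
        BASE_URL,
        data=json.dumps(build_chat_payload(user_message)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        body = json.load(resp)
    return body["choices"][0]["message"]["content"]

if __name__ == "__main__":
    print(chat("自然言語処理とは何か"))
```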
Model Card Authors
Satoru Katsumata
Model Card Contact
pr[at]retrieva.jp