Update README.md
README.md
CHANGED
@@ -7,7 +7,6 @@ language:
 pipeline_tag: text-generation
 ---
 
-
 # 🌸 Nayari AI
 
 A fine-tuned AI companion character built on **Qwen 2.5 1.5B Instruct**, trained using **Unsloth + LoRA** on Kaggle's free GPU tier.
@@ -29,6 +28,7 @@ Nayari-AI/
 │
 ├── nayari_build_dataset.ipynb # LOCAL — converts all files → nayari_dataset.json + uploads to Kaggle
 ├── nayari_train.ipynb # KAGGLE — fine-tunes Qwen 2.5 using the uploaded dataset
+├── nayari_export.ipynb # KAGGLE — exports the fine-tuned model to GGUF, HuggingFace, etc.
 ├── nayari_dataset.json # Auto-generated dataset (do not edit manually)
 ├── nayari_system_prompt.txt # Nayari's system prompt (baked into tokenizer at training time)
 └── README.md
@@ -57,18 +57,23 @@ You will need a **Kaggle API token** for the upload step:
 1. Go to [kaggle.com/code](https://kaggle.com/code) → **New Notebook** → Upload `nayari_train.ipynb`
 2. Click **+ Add Data** → search for your uploaded `nayari-dataset` → Add
 3. Set **Accelerator = GPU T4 x2** and **Internet = On**
-4. Run cells **in order**
+4. Run cells **in order**.
 
 Training takes ~15–30 min on T4 x2.
 
-### Step 3 —
+### Step 3 — Export & Download (run on Kaggle)
+
+1. After training, open `nayari_export.ipynb` in your Kaggle environment (or upload it to a new notebook with the same workspace context).
+2. Run the cells to generate LoRA adapters, merged 16-bit, and GGUF outputs.
+3. Use the **Cloudflare Tunnel** cell to get direct HTTP download links for the generated `.gguf` files.
+4. Download `nayari-Q4_K_M.gguf` (fast) or `nayari-Q8_0.gguf` (higher quality).
+
+### Step 4 — Run with KoboldCpp (run locally)
 
-1.
-2.
-3.
-4.
-5. Open `http://localhost:5001` — set **Instruct mode = ChatML**
-6. Nayari's personality is baked in — no system prompt needed in the UI
+1. Install [KoboldCpp](https://github.com/LostRuins/koboldcpp/releases)
+2. Launch: `koboldcpp.exe nayari-Q4_K_M.gguf --contextsize 4096`
+3. Open `http://localhost:5001` — set **Instruct mode = ChatML**
+4. Nayari's personality is baked in — no system prompt needed in the UI
 
 ---
 
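Note on the new Step 4: setting **Instruct mode = ChatML** means each turn is wrapped in `<|im_start|>` / `<|im_end|>` markers. The sketch below is only an illustration of that wrapping and of calling a local KoboldCpp instance from a script instead of the browser UI. The helper names (`build_chatml_prompt`, `generate`), the `max_length` value, and the stop sequence are assumptions made for this example and do not come from the README; the `/api/v1/generate` path and the response shape reflect KoboldCpp's KoboldAI-style API as I understand it and should be checked against its documentation.

```python
import json
import urllib.request

# ChatML turn delimiters (fixed by the format itself).
IM_START = "<|im_start|>"
IM_END = "<|im_end|>"


def build_chatml_prompt(messages):
    """Render [{"role": ..., "content": ...}] as a single ChatML prompt string.

    The trailing assistant header asks the model to write the next reply.
    (Hypothetical helper, not part of the repository.)
    """
    parts = [f"{IM_START}{m['role']}\n{m['content']}{IM_END}\n" for m in messages]
    parts.append(f"{IM_START}assistant\n")
    return "".join(parts)


def generate(prompt, base_url="http://localhost:5001", max_length=200):
    """Send a raw prompt to a locally running KoboldCpp instance.

    Assumes the KoboldAI-style endpoint /api/v1/generate and a response of the
    form {"results": [{"text": ...}]}; adjust if your build differs.
    """
    payload = {
        "prompt": prompt,
        "max_length": max_length,      # illustrative value
        "stop_sequence": [IM_END],     # stop once the assistant turn closes
    }
    request = urllib.request.Request(
        f"{base_url}/api/v1/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as response:
        body = json.load(response)
    return body["results"][0]["text"]
```

With KoboldCpp running as in Step 4, `generate(build_chatml_prompt([{"role": "user", "content": "Who are you?"}]))` would return the reply text. A hand-built prompt like this does not pass through the tokenizer's chat template, so whether the baked-in persona still applies on this path is worth verifying before relying on it.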