Text Generation
Transformers
Safetensors
English
gemma3_text
gemma
finetune
qlora
chatbot
tars
conversational
text-generation-inference
| library_name: transformers | |
| license: apache-2.0 | |
| language: | |
| - en | |
| base_model: google/gemma-3-1b-it | |
| tags: | |
| - gemma | |
| - finetune | |
| - qlora | |
| - chatbot | |
| - tars | |
| # Model Card for TARS (Gemma 3 1B Fine-tune) | |
| This is a fine-tuned version of `google/gemma-3-1b-it` trained to act as the **TARS astronaut assistant** from *Interstellar*. | |
| It is designed to be professional for tasks but witty for off-topic chat, and its responses are guided by a simulated user emotion tag. | |
| --- | |
| ## Model Details | |
| ### Model Description | |
| This model is a QLoRA fine-tune of `google/gemma-3-1b-it` on a custom synthetic dataset. | |
| The goal was to create a chatbot that embodies the **TARS persona**: | |
| - **Task-Oriented:** Professional, direct, and helpful for mission-related queries. | |
| - **Persona-Driven:** Witty, empathetic, or humorous for off-topic or personal chat. | |
| - **Emotion-Aware:** The model's response style is influenced by a `[Detected Emotion: ...]` tag. | |
| **Developed by:** huggingface.co/am-om | |
| **Shared by:** Om Singh | |
| **Model type:** Causal Language Model | |
| **Language(s):** English (`en`) | |
| **License:** apache-2.0 | |
| **Finetuned from model:** `google/gemma-3-1b-it` | |
| --- | |
| ## Model Sources | |
| - **Repository:** https://huggingface.co/am-om/tars_ai | |
| --- | |
| ## Uses | |
| ### Direct Use | |
| This model is intended for **direct use as a chatbot**, following a specific prompt format. | |
| ⚠️ **Important:** This model requires a specific prompt format that includes a detected emotion. | |
| Do **not** send raw text as the user query. | |
| #### Prompt Format | |
| The user turn *must* follow this structure: | |
| ``` | |
| [Detected Emotion: {emotion}] | |
| [User Query: {your_text_here}] | |
| ``` | |
| **Example:** | |
| ``` | |
| [Detected Emotion: anxious] | |
| [User Query: Are we going to make it?] | |
| ``` | |
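The format above can be produced by a small helper. This is a minimal sketch: the function name `build_user_turn` and the whitespace stripping are illustrative choices, not part of the model's tooling.

```python
def build_user_turn(emotion: str, user_query: str) -> str:
    """Return the two-line user turn in the tagged format shown above."""
    emotion = emotion.strip()
    user_query = user_query.strip()
    if not emotion or not user_query:
        # The card requires both tags, so refuse to build a partial turn.
        raise ValueError("both emotion and user_query must be non-empty")
    return f"[Detected Emotion: {emotion}]\n[User Query: {user_query}]"


print(build_user_turn("anxious", "Are we going to make it?"))
```

The returned string is what goes into the `content` field of the user message.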
| ### Out-of-Scope Use | |
| This model is not intended for: | |
| * Any use without the required `[Detected Emotion: ...]` and `[User Query: ...]` tags. | |
| * Use as a base model for further fine-tuning. | |
| * Any critical decision-making without human oversight. | |
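Because untagged input is out of scope, an application may want to reject it before calling the model. The check below is a sketch under one assumption: the two tags sit on consecutive lines, exactly as in the prompt-format example. The name `parse_user_turn` is hypothetical.

```python
import re
from typing import Optional, Tuple

# Assumption: "[Detected Emotion: ...]" on the first line, "[User Query: ...]" after it.
_TURN_RE = re.compile(
    r"\A\[Detected Emotion: (?P<emotion>[^\]\n]+)\]\n"
    r"\[User Query: (?P<query>.+)\]\Z",
    re.DOTALL,  # let the query span several lines
)


def parse_user_turn(text: str) -> Optional[Tuple[str, str]]:
    """Return (emotion, query) if text follows the tagged format, else None."""
    match = _TURN_RE.match(text)
    if match is None:
        return None
    return match.group("emotion"), match.group("query")
```

A `None` result signals raw text that should be wrapped in the tags before it is sent.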
| ## How to Get Started with the Model | |
| Use the code below to get started with the model. | |
| ```python | |
| from transformers import AutoTokenizer, AutoModelForCausalLM, pipeline | |
| import torch | |
| # Load the model from the Hub | |
| model_id = "am-om/tars_ai" | |
| model = AutoModelForCausalLM.from_pretrained( | |
|     model_id, | |
|     device_map="auto", | |
|     torch_dtype=torch.bfloat16, | |
| ) | |
| tokenizer = AutoTokenizer.from_pretrained(model_id) | |
| pipe = pipeline( | |
|     "text-generation", | |
|     model=model, | |
|     tokenizer=tokenizer, | |
| ) | |
| # --- Define your chat history --- | |
| # The system prompt is automatically loaded from the tokenizer's chat template. | |
| messages = [] | |
| # Example query | |
| user_query = "I'm feeling a bit lonely out here." | |
| emotion = "sad" | |
| # Format the input correctly! | |
| formatted_input = f"[Detected Emotion: {emotion}]\n[User Query: {user_query}]" | |
| messages.append({"role": "user", "content": formatted_input}) | |
| # --- Generate the response --- | |
| prompt = pipe.tokenizer.apply_chat_template( | |
|     messages, | |
|     tokenize=False, | |
|     add_generation_prompt=True, | |
| ) | |
| outputs = pipe( | |
|     prompt, | |
|     max_new_tokens=256, | |
|     do_sample=True, | |
|     temperature=0.7, | |
|     top_p=0.95, | |
|     pad_token_id=pipe.tokenizer.eos_token_id, | |
| ) | |
| # Extract and print just the new response | |
| response = outputs[0]["generated_text"][len(prompt):].strip() | |
| print(f"TARS: {response}") | |
| ``` | |
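To continue the conversation beyond one turn, the history list can be passed to the pipeline directly. The sketch below assumes a recent `transformers` release in which the text-generation pipeline accepts a list of chat messages and returns the full conversation under `generated_text`; the helper name `chat_turn` is hypothetical.

```python
def chat_turn(pipe, messages, emotion, user_query, **gen_kwargs):
    """Append one tagged user turn, call the pipeline, and record the reply."""
    messages.append({
        "role": "user",
        "content": f"[Detected Emotion: {emotion}]\n[User Query: {user_query}]",
    })
    # Given a list of messages, the pipeline applies the chat template itself
    # and returns the conversation with the new assistant message appended.
    outputs = pipe(messages, **gen_kwargs)
    reply = outputs[0]["generated_text"][-1]["content"].strip()
    # Keep the reply in the history so the next turn sees the full context.
    messages.append({"role": "assistant", "content": reply})
    return reply
```

With the `pipe` and `messages` objects from the example above, a call such as `chat_turn(pipe, messages, "sad", "I'm feeling a bit lonely out here.", max_new_tokens=256)` returns the reply and leaves both turns in `messages`.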
| ## Training Details | |
| ### Training Data | |
| This model was fine-tuned on a custom, synthetically generated dataset of 344 prompt/response pairs. The dataset was designed to teach the model to differentiate between task-oriented and persona-driven queries based on the emotion tag. | |
| ### Training Procedure | |
| The model was fine-tuned using QLoRA for 3 epochs. The adapter from checkpoint-156, the best-performing checkpoint, was then merged with the base model. | |
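The merge step can be sketched with `peft` as follows. This is an illustration, not the author's script: the function name, the directory arguments, and the bfloat16 load are assumptions, and the heavy imports are deferred so the snippet loads without them installed.

```python
def merge_adapter(base_id, adapter_dir, out_dir):
    """Merge a LoRA adapter into its base model and save the result."""
    # Imported lazily: these packages are only needed when the merge runs.
    import torch
    from peft import PeftModel
    from transformers import AutoModelForCausalLM, AutoTokenizer

    # Load the base model unquantized (assumed dtype) so the LoRA deltas
    # can be folded into ordinary weight tensors.
    base = AutoModelForCausalLM.from_pretrained(base_id, torch_dtype=torch.bfloat16)
    model = PeftModel.from_pretrained(base, adapter_dir)
    merged = model.merge_and_unload()

    merged.save_pretrained(out_dir, safe_serialization=True)
    AutoTokenizer.from_pretrained(base_id).save_pretrained(out_dir)
    return out_dir
```

A call might look like `merge_adapter("google/gemma-3-1b-it", "outputs/checkpoint-156", "tars_merged")`, where both directory names are placeholders.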
| #### Training Hyperparameters | |
| * **Framework:** TRL (Transformer Reinforcement Learning) | |
| * **Quantization:** 4-bit (`bnb_4bit_quant_type="nf4"`) | |
| * **LoRA `r`:** 16 | |
| * **LoRA `alpha`:** 32 | |
| * **LoRA `dropout`:** 0.05 | |
| * **Optimizer:** paged_adamw_8bit | |
| * **Learning Rate:** 5e-5 | |
| * **LR Scheduler:** constant | |
| * **Epochs:** 3 | |
| * **Batch Size:** 4 | |
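The values above map onto `transformers`, `peft`, and TRL settings roughly as sketched below. Several things are assumptions: `load_in_4bit=True`, `task_type="CAUSAL_LM"`, and reading "Batch Size" as the per-device batch size. The card does not list target modules, the compute dtype, or gradient accumulation, so they are left out.

```python
# Quantization settings from the card (load_in_4bit is implied by "4-bit").
QUANT_KWARGS = {"load_in_4bit": True, "bnb_4bit_quant_type": "nf4"}

# LoRA settings from the card; task_type is an assumption for a causal LM.
LORA_KWARGS = {"r": 16, "lora_alpha": 32, "lora_dropout": 0.05, "task_type": "CAUSAL_LM"}

# Keyword arguments accepted by transformers.TrainingArguments and TRL's SFTConfig.
TRAIN_KWARGS = {
    "optim": "paged_adamw_8bit",
    "learning_rate": 5e-5,
    "lr_scheduler_type": "constant",
    "num_train_epochs": 3,
    "per_device_train_batch_size": 4,  # assumption: "Batch Size" is per device
}


def build_configs():
    """Build the quantization and LoRA config objects from the values above."""
    # Imported lazily so the constants can be inspected without these packages.
    from peft import LoraConfig
    from transformers import BitsAndBytesConfig

    return BitsAndBytesConfig(**QUANT_KWARGS), LoraConfig(**LORA_KWARGS)
```

With `r=16` and `lora_alpha=32`, the LoRA update is scaled by `alpha / r = 2`.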
| ## Environmental Impact | |
| Carbon emissions can be estimated using the [Machine Learning Impact calculator](https://mlco2.github.io/impact#compute) presented in [Lacoste et al. (2019)](https://arxiv.org/abs/1910.09700). | |
| * **Hardware Type:** NVIDIA T4 | |
| * **Hours used:** ~4 hours | |
| * **Cloud Provider:** Google Colab | |
| * **Compute Region:** Not recorded (depends on the Colab instance, e.g. us-central1) | |
| * **Carbon Emitted:** ~5.5 g CO2eq (Estimated) | |
| ## Technical Specifications | |
| ### Model Architecture and Objective | |
| This is a standard decoder-only Transformer (Gemma 3) fine-tuned with a Causal Language Modeling objective. | |
| ### Compute Infrastructure | |
| #### Hardware | |
| * NVIDIA T4 16 GB (Google Colab) | |
| #### Software | |
| * `transformers` | |
| * `trl` | |
| * `bitsandbytes` | |
| * `accelerate` | |
| * `peft` | |
| ## Model Card Authors | |
| Om Singh (huggingface.co/am-om) | |
| ## Model Card Contact | |
| huggingface.co/am-om | |