textrm-28M-bizmail
A 28.19M-parameter Transformer-based model that generates surprisingly coherent business-style emails.
Github: https://github.com/kamisori-daijin/textrm
v1.5 is here: https://huggingface.co/Kamisori-daijin/textrm1.5-25M-bizmail
Overview
This project explores how far a small (~28M parameter) model can go in generating structured business email text.
The model does not reliably follow instructions and may produce inconsistent or mixed outputs, but it often generates realistic email-like text.
Features
- Small size (~28M parameters)
- Generates business-style email text
- Works best with simple prompts
- Occasionally produces surprisingly coherent outputs
Limitations
- Weak instruction following
- May mix multiple prompts or contexts
- Inconsistent tone and intent
- Not suitable for production use
Example
Prompt: Write a polite refusal email
Output: Write a polite refusal email and the company's well. Regarding [Company Name]'s AI-driven shift in [Project Name], I was awarded the [Award Name] for [Company Name] during this event. I was experiencing some unprecedented challenges and requires immediate attention. During this event, we've identified and updated the report. We have identified [brief, 1-2 key areas of feedback - e.g., increased customer development, lead our focus on [brief, neutral reason - e.g., 24-48 hours].
We’ve reviewed the updated prototype, and I need a concise and detailed explanation of the revised prototype by [Date]. We can discuss this further and explore a comprehensive approach to your clients.
Training
Architecture: TRM (custom Transformer variant)
Parameters: 28.19M
Dataset: Synthetic business email dataset (generated using Gemma3-4B): https://huggingface.co/datasets/Kamisori-daijin/email-datasets-20k
Training epochs: 15
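For a rough sense of where a parameter count in this range comes from, a standard decoder-only Transformer has about 12·L·d² non-embedding parameters (with a 4·d feed-forward width) plus V·d for the embedding table. The configuration below is purely illustrative, not textrm's actual architecture (TRM is a custom variant whose dimensions are not stated here):

```python
def transformer_params(n_layers, d_model, vocab_size, d_ff=None):
    """Rough decoder-only Transformer parameter count (ignores biases and norms)."""
    d_ff = d_ff or 4 * d_model
    attn = 4 * d_model * d_model   # Q, K, V, and output projections
    mlp = 2 * d_model * d_ff       # up- and down-projection
    embed = vocab_size * d_model   # token embeddings (output head assumed tied)
    return n_layers * (attn + mlp) + embed

# Illustrative guess only -- not the real textrm config.
print(transformer_params(n_layers=8, d_model=512, vocab_size=16000))  # 33357824, ~33M
```

Plugging in plausible small-model dimensions lands in the tens of millions, the same ballpark as the reported 28.19M.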
Usage
Clone this repo:
git clone https://github.com/kamisori-daijin/textrm.git
Download final_model.safetensors from this model repository and move it into the textrm folder. Then:
cd textrm
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
python inference.py
Notes
This model is intended for research and experimentation purposes only.
License
Apache 2.0
Disclaimer
This model was trained on synthetic data generated using Gemma3-4B (Google). This project is independent and does not replicate or fine-tune Gemma3-4B.