Gemma-3n-E4B-It - Psychiatric Translation LoRA Adapter (Hindi/Kannada to English)

This is a LoRA (Low-Rank Adaptation) adapter for Google's Gemma-3n-E4B-It model, fine-tuned for translating psychiatric interview transcripts from Hindi and Kannada to English while preserving clinical context and conversational nuance.

⚠️ Content Warning & Gated Repository

This model is behind a gated repository for important safety reasons:

Because this model was trained on real-world clinical psychiatric conversations, its outputs may contain sensitive mental-health content. The model is intended solely for research purposes and clinical translation tasks carried out by qualified teams.

To request access: Please contact the repository owner with:

  • Your research affiliation or clinical organization
  • Intended use case
  • Confirmation of ethical approval for your project (if applicable)

Model Description

  • Base Model: google/gemma-3n-e4b-it
  • Adapter Type: LoRA (Low-Rank Adaptation)
  • Source Languages: Hindi, Kannada
  • Target Language: English
  • Task: Translation of psychiatric interview transcripts
  • Domain: Psychiatric interviews and therapy sessions
  • PEFT Version: 0.18.0

LoRA Configuration

  • Rank (r): 16
  • Alpha: 32
  • Dropout: 0.09
  • RSLoRA: Enabled
  • Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
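For reference, the configuration above maps onto a `peft` LoraConfig roughly as follows. This is a sketch reconstructed from the listed hyperparameters; the exact training-time object is not published with the adapter.

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    lora_dropout=0.09,
    use_rslora=True,  # rank-stabilized LoRA scaling (alpha / sqrt(r))
    target_modules=[
        "q_proj", "k_proj", "v_proj", "o_proj",
        "gate_proj", "up_proj", "down_proj",
    ],
    task_type="CAUSAL_LM",
)
```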

Performance

The fine-tuned model shows significant improvements in translation quality over the base Gemma-3n-E4B-It model across both Hindi and Kannada psychiatric interview transcripts.

Test Set Results (4,139 aligned sentences)

Combined Results (Both Languages)

  • METEOR: 0.5709 ± 0.2719 → 0.7018 ± 0.2668 (+13.09 points absolute; +22.93% relative)
  • TER (Translation Error Rate): 97.36 ± 167.26 → 74.26 ± 149.36 (-23.10 points absolute; -23.73% relative)

Language-Specific Results

Hindi (1,854 test sentences):

  • METEOR: 0.6929 ± 0.2253 → 0.8329 ± 0.1697 (+14.01 points absolute)
  • TER: 51.61 ± 60.52 → 31.98 ± 30.51 (-19.63 points absolute)

Kannada (2,285 test sentences):

  • METEOR: 0.4774 ± 0.2953 → 0.5963 ± 0.3089 (+11.89 points absolute)
  • TER: 133.52 ± 231.32 → 108.78 ± 209.27 (-24.74 points absolute)

Statistical Significance

All reported improvements are statistically significant for both languages and both metrics.

Translation Prompt

This model uses a carefully designed prompt that ensures clinically accurate translations while preserving conversational nuance. The prompt emphasizes:

Key Translation Guidelines:

  1. Meaning Over Literal Translation

    • Prioritize preservation of clinical meaning and context
    • Translate culture-specific idioms to convey meaning, not literal words
  2. Conversational Style Preservation

    • Maintain formality level (formal ↔ formal, informal ↔ informal)
    • Preserve emotional intensity and tone
  3. Verbatim Accuracy

    • Include all punctuation, exclamations, and emotional markers
    • Preserve special markers like [pauses], [loudly], etc.
    • Do not add or omit information
    • Maintain repetitions, stuttering, and dysfluencies
  4. Clinical Context Preservation

    • Ensure English-speaking clinicians reach the same diagnostic conclusions as source-language clinicians
    • Preserve speaker tags and timestamps
    • No polishing or correction of the translation

Prompt File: prompts/translation_prompt.txt

The prompt template contains a {language} placeholder, which is replaced with "Hindi" or "Kannada" at inference time.

Training Data

The model was fine-tuned on aligned English-Hindi and English-Kannada sentence pairs extracted from psychiatric interview transcripts.

Training Set (35,802 aligned sentences):

  • Hindi: 19,996 sentence pairs
  • Kannada: 15,806 sentence pairs

Test Set (4,139 aligned sentences):

  • Hindi: 1,854 sentence pairs
  • Kannada: 2,285 sentence pairs

Total Dataset: 39,941 aligned sentence pairs across both languages

All data comes from real psychiatric interviews and therapy sessions, providing domain-specific translation patterns crucial for clinical accuracy.

Usage

Required Prompt

The model requires the translation prompt for optimal performance. This prompt is included in the prompts/ directory:

  • Translation Prompt: prompts/translation_prompt.txt

This prompt contains detailed instructions for preserving clinical context, conversational style, and emotional nuance while translating psychiatric content.

Translation Example

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model and tokenizer
model_id = "google/gemma-3n-e4b-it"
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto"
)

# Load LoRA adapter
model = PeftModel.from_pretrained(
    model,
    "Lekhansh/gemma-3n-e4b-psychiatry-translation-indic-en"
)

# Load the translation prompt
with open("prompts/translation_prompt.txt", "r", encoding="utf-8") as f:
    prompt_template = f.read()

# Example: Translating from Hindi
source_language = "Hindi"  # or "Kannada"
source_text = """डॉक्टर: आपकी नींद कैसी है?
मरीज: बहुत बुरी है, रात भर नहीं आती।"""

# Format the prompt
prompt = prompt_template.replace("{language}", source_language)
prompt = prompt + "\n" + source_text

# Prepare messages
messages = [
    {
        "role": "system",
        "content": [
            {
                "type": "text",
                "text": "You are an expert translator specializing in psychiatric interviews.",
            }
        ],
    },
    {
        "role": "user",
        "content": [{"type": "text", "text": prompt}]
    }
]

# Apply chat template and generate
inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_tensors="pt",
)
inputs = inputs.to(model.device)

with torch.inference_mode():
    generation = model.generate(
        inputs,
        max_new_tokens=2048,
        temperature=0.7,
        top_k=50,
        top_p=0.95,
        do_sample=True
    )

# Decode only the newly generated tokens. Note that skip_special_tokens=True
# removes the <start_of_turn> markers, so splitting on them after decoding
# would not work.
input_length = inputs.shape[-1]
translation = tokenizer.decode(
    generation[0][input_length:],
    skip_special_tokens=True
).strip()
print(translation)

Merging Adapter with Base Model (Optional)

For faster inference, you can merge the adapter weights with the base model:

from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel
import torch

# Load base model and adapter
base_model = AutoModelForCausalLM.from_pretrained(
    "google/gemma-3n-e4b-it",
    torch_dtype=torch.bfloat16
)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-3n-e4b-it")

model = PeftModel.from_pretrained(
    base_model,
    "Lekhansh/gemma-3n-e4b-psychiatry-translation-indic-en"
)

# Merge and unload
merged_model = model.merge_and_unload()

# Save merged model
merged_model.save_pretrained("./merged_translation_model")
tokenizer.save_pretrained("./merged_translation_model")

Translation Quality Analysis

The fine-tuned model demonstrates substantial improvements:

METEOR Score Improvements

  • Hindi: +14.01 points absolute (0.6929 → 0.8329)
    • Standard deviation reduced from 0.2253 to 0.1697 (more consistent output)
  • Kannada: +11.89 points absolute (0.4774 → 0.5963)
    • The lower baseline reflects Kannada's greater linguistic distance from English, but the gains are still substantial

Translation Error Rate (TER) Reductions

  • Hindi: -19.63 points (51.61 → 31.98), a reduction of roughly 38% relative to baseline
  • Kannada: -24.74 points (133.52 → 108.78), a substantial reduction despite the much higher baseline

Clinical Significance

  • Higher METEOR scores indicate better semantic alignment with reference translations
  • Reduced TER means fewer edits needed to match human translations
  • Improved consistency (lower std dev) ensures reliable clinical usage
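TER counts the word-level edits (insertions, deletions, substitutions, and block shifts) needed to turn the model output into the reference, normalized by reference length. A simplified sketch, which omits TER's block-shift moves and so only approximates the full metric:

```python
def ter_approx(hypothesis: str, reference: str) -> float:
    """Approximate Translation Error Rate: word-level edit distance
    divided by reference length, times 100. The full TER metric also
    allows block shifts, which this sketch omits."""
    hyp, ref = hypothesis.split(), reference.split()
    # Standard dynamic-programming edit distance over words,
    # kept in a single rolling row.
    dp = list(range(len(ref) + 1))
    for i, h in enumerate(hyp, 1):
        prev, dp[0] = dp[0], i
        for j, r in enumerate(ref, 1):
            prev, dp[j] = dp[j], min(
                dp[j] + 1,        # delete hypothesis word
                dp[j - 1] + 1,    # insert reference word
                prev + (h != r),  # substitute (or match for free)
            )
    return 100.0 * dp[len(ref)] / max(len(ref), 1)
```

A score of 0 means the output matches the reference exactly; scores above 100 are possible when the output needs more edits than the reference has words, which is why the Kannada baseline can exceed 100.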

Intended Use

Primary Use Cases

  • Clinical Research: Translation of psychiatric interview transcripts from Hindi/Kannada to English
  • Medical Documentation: Enabling English-speaking clinicians to review non-English therapy sessions
  • Cross-linguistic Clinical Studies: Facilitating research across language barriers in psychiatric care
  • Training Material Translation: Converting clinical training materials while preserving therapeutic context

Out-of-Scope Use

  • Real-time Interpretation: This model is designed for transcript translation, not live interpretation
  • General Purpose Translation: Optimized for psychiatric domain; may not generalize to other contexts
  • Diagnostic Tool: Should not be used as a standalone diagnostic instrument
  • Replacement for Professional Translators: Outputs should be reviewed by qualified bilingual clinical professionals

Limitations and Biases

  • Domain-Specific: Optimized for psychiatric/clinical conversations; performance on general text may be suboptimal
  • Language Pair Constraints: Specifically tuned for Hindi→English and Kannada→English; not bidirectional
  • Cultural Context: May not fully capture all culture-specific nuances without human review
  • Variable Performance by Language: Kannada translations show higher baseline error rates due to greater linguistic distance from English
  • Hallucination Risk: As with all LLMs, the model may occasionally generate content not present in the source
  • Data Privacy: Trained on real clinical data; users must ensure HIPAA/GDPR compliance

Ethical Considerations

  • This model was trained on real psychiatric conversations. Users must ensure:
    • Appropriate ethical approvals and patient consent for any clinical use
    • Compliance with data protection regulations (HIPAA, GDPR, etc.)
    • Professional review of all translations by qualified bilingual clinicians
    • Proper data security and patient confidentiality protocols
  • Translations should not be used for clinical decision-making without human verification
  • The model should not be used for unauthorized translation of clinical sessions
  • Consider potential biases in training data that may affect translation quality across different demographic groups

Training Details

Training Configuration

  • Optimizer: AdamW
  • Learning Rate: 5e-06 with cosine schedule with restarts (2 cycles)
  • Warmup Ratio: 0.1
  • Weight Decay: 0.09
  • Batch Size: 1 per device with 30 gradient accumulation steps
  • Epochs: 2.0
  • Max Sequence Length: 5000 tokens
  • Precision: bfloat16
  • Gradient Checkpointing: Enabled
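A sketch of how these hyperparameters map onto Hugging Face TrainingArguments. The output_dir and the scheduler kwargs are assumptions; the actual training script is not included with this card.

```python
from transformers import TrainingArguments

training_args = TrainingArguments(
    output_dir="./checkpoints",          # assumption: not stated in the card
    optim="adamw_torch",
    learning_rate=5e-6,
    lr_scheduler_type="cosine_with_restarts",
    lr_scheduler_kwargs={"num_cycles": 2},
    warmup_ratio=0.1,
    weight_decay=0.09,
    per_device_train_batch_size=1,
    gradient_accumulation_steps=30,      # effective batch size of 30
    num_train_epochs=2.0,
    bf16=True,
    gradient_checkpointing=True,
)
```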

Data Filtering

Examples exceeding 5000 tokens combined length were filtered out to ensure training stability and efficiency.
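A minimal sketch of this filtering step. The tokenize callable is a stand-in for the Gemma tokenizer's encode method; the actual preprocessing script is not included with this card.

```python
def filter_by_length(pairs, tokenize, max_tokens=5000):
    """Keep only sentence pairs whose combined source + target token
    count fits within the training sequence limit."""
    kept = []
    for src, tgt in pairs:
        if len(tokenize(src)) + len(tokenize(tgt)) <= max_tokens:
            kept.append((src, tgt))
    return kept
```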

License

This adapter is released under the Apache 2.0 license. However, users must also comply with Google's Gemma license and any applicable regulations regarding clinical data and patient privacy.

Citation

If you use this model in your research, please cite:

@misc{gemma-psychiatry-translation-indic,
  title={Gemma-3n-E4B-It LoRA Adapter for Psychiatric Translation: Hindi/Kannada to English},
  author={Shukla, Lekhansh and Shivaprakash, Prakrithi},
  year={2025},
  publisher={Hugging Face},
  howpublished={\url{https://huggingface.co/Lekhansh/gemma-3n-e4b-psychiatry-translation-indic-en}}
}

Contact

For access requests or questions about this model, please contact Dr Lekhansh Shukla at drlekhansh@gmail.com.


Disclaimer: This model is provided for research purposes only. Users are responsible for ensuring compliance with all applicable laws, regulations, and ethical guidelines when using this model, particularly regarding patient privacy, clinical data handling, and professional translation standards in healthcare settings.
