RegulaUAE-1.2B - UAE Rulebook Q&A Assistant - Finetuned LFM2 Model

Model ID: rajeshthangaraj1/uae_rule_book_QA_assistant
Base Model: [unsloth/LFM2-1.2B](https://docs.unsloth.ai/)


📘 Model Overview

RegulaUAE-1.2B is a domain-specific conversational language model fine-tuned to answer questions strictly grounded in the UAE Central Bank Rulebook (Banking Regulations).

The model is designed to support regulatory, compliance, and educational use cases, with a strong focus on reducing hallucination within the UAE banking domain. It has been tested against CBUAE regulatory queries.

Coverage includes:

  • Capital Adequacy
  • Licensing & Authorization
  • Corporate Governance
  • Risk Management
  • Compliance & Supervisory Frameworks

🔍 Key Characteristics

  • Domain: UAE Central Bank – Banking Regulations
  • Dataset Size: 500,000+ question–answer pairs
  • Language: English (complete rulebook coverage)
  • Precision: bfloat16
  • Task Type: Domain-specific legal & regulatory Q&A
  • Framework: Hugging Face transformers
  • Pipeline: text-generation with chat template support

🎯 Intended Use Cases

Regulatory & Legal Q&A

  • What is the relationship between Decree Law No. (20) of 2018 and Cabinet Decision No. (10) of 2019?
  • What minimum capital ratios are specified under Article (2)?

Compliance & Risk Teams

  • Regulatory validation and internal compliance support

Education & Research

  • Learning UAE banking regulations in a conversational format

AI & FinTech Development

  • Base model for regulation-aware RAG systems
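As a sketch of how the model could anchor a regulation-aware RAG system, the snippet below retrieves the most relevant rulebook chunk for a query and builds a grounded prompt. The chunk list and the keyword-overlap scoring are illustrative placeholders; a production system would index the scraped rulebook and typically use embedding similarity instead.

```python
# Minimal retrieval sketch: keyword-overlap scoring over rulebook chunks.
# The chunks below are placeholders, not actual rulebook text.

def score(query, chunk):
    # Count shared lowercase tokens between query and chunk.
    return len(set(query.lower().split()) & set(chunk.lower().split()))

def retrieve(query, chunks, top_k=1):
    # Return the top_k highest-scoring chunks.
    return sorted(chunks, key=lambda ch: score(query, ch), reverse=True)[:top_k]

def build_prompt(query, chunks):
    # Assemble a grounded prompt that instructs the model to stay in-context.
    context = "\n\n".join(retrieve(query, chunks))
    return (
        "Answer only from the context below. If the answer is not present, "
        "reply 'Not found in UAE Rulebook'.\n\n"
        f"Context:\n{context}\n\nQuestion: {query}"
    )

chunks = [
    "Article (2): Banks must maintain a minimum capital adequacy ratio.",
    "Licensing: No person may carry on banking business without a licence.",
]
prompt = build_prompt("What does Article (2) say about capital ratios?", chunks)
```

The grounded prompt would then be passed to the model as the user message, as in the usage example further down.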

⚠️ Limitations

  • Hallucination Risk: Without retrieval-augmented generation (RAG), the model may generate plausible but incorrect answers in edge cases.
  • Domain Scope: Covers UAE Central Bank banking regulations only.
  • Numerical Accuracy: Percentages, ratios, and article references should be verified against the official rulebook.

📊 Dataset Creation

Source Data

Publicly available content from the official UAE Central Bank Rulebook:
https://rulebook.centralbank.ae

Preprocessing

  • Scraped and cleaned official rulebook content
  • Segmented into roughly 65,000 semantically aligned text chunks
  • Average chunk size: ~500 characters
  • Preserved articles, clauses, and legal definitions
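The chunking step above could be sketched as follows. This is a simplified illustration (the actual preprocessing pipeline is not published): paragraphs are packed greedily into chunks of at most ~500 characters, splitting on paragraph boundaries so articles and clauses stay intact.

```python
# Simplified sketch of the ~500-character chunking step: split on paragraph
# boundaries first, then pack paragraphs greedily up to max_chars.

def chunk_text(text, max_chars=500):
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    chunks, current = [], ""
    for para in paragraphs:
        # Start a new chunk if adding this paragraph would exceed the limit.
        if current and len(current) + len(para) + 2 > max_chars:
            chunks.append(current)
            current = para
        else:
            current = f"{current}\n\n{para}" if current else para
    if current:
        chunks.append(current)
    return chunks

rulebook_text = "Article (1): Definitions...\n\nArticle (2): Minimum capital ratios..."
chunks = chunk_text(rulebook_text)
```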

Q&A Generation

Each chunk was used as grounded context to generate question–answer pairs.

Dataset structure:

{
  "context": "Rulebook text chunk",
  "question": "Regulatory question",
  "answer": "Answer grounded in the context"
}
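For supervised fine-tuning, each record can be mapped to a chat-format training example. The sketch below is one plausible mapping; the exact template used during training is not published, and the system prompt here simply mirrors the one in the usage example further down.

```python
# Sketch: convert one dataset record into chat messages for supervised
# fine-tuning. The system prompt and formatting are assumptions.

SYSTEM = (
    "You are an assistant specialized in the UAE Central Bank Rulebook. "
    "Only answer based on the UAE Rulebook."
)

def record_to_messages(record):
    # Context and question go into the user turn; the grounded answer
    # becomes the assistant turn that the model is trained to produce.
    return [
        {"role": "system", "content": SYSTEM},
        {"role": "user",
         "content": f"Context:\n{record['context']}\n\nQuestion: {record['question']}"},
        {"role": "assistant", "content": record["answer"]},
    ]

record = {
    "context": "Rulebook text chunk",
    "question": "Regulatory question",
    "answer": "Answer grounded in the context",
}
messages = record_to_messages(record)
```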

🧪 Example Usage (Transformers)

from transformers import AutoTokenizer, AutoModelForCausalLM
import torch

model_id = "rajeshthangaraj1/uae_rule_book_QA_assistant"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    device_map="auto",
    torch_dtype=torch.bfloat16
)

messages = [
    {"role": "system", "content":
     "You are an assistant specialized in the UAE Central Bank Rulebook. "
     "Only answer based on the UAE Rulebook. "
     "If the answer is not in the Rulebook, reply 'Not found in UAE Rulebook'."},
    {"role": "user", "content":
     "According to the UAE Central Bank Rulebook – Capital Adequacy Section, "
     "what does Article (2) specify about minimum capital ratios?"}
]

inputs = tokenizer.apply_chat_template(
    messages,
    add_generation_prompt=True,
    return_dict=True,  # needed so `**inputs` and `inputs["input_ids"]` work below
    return_tensors="pt"
).to(model.device)
inputs.pop("token_type_ids", None)

outputs = model.generate(**inputs, max_new_tokens=128)
answer = tokenizer.decode(outputs[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)
print(answer)

🖥️ Example Usage (Gradio)

import gradio as gr

def chat_with_model(message, history):
    # Minimal sketch: reuses the tokenizer and model loaded in the
    # Transformers example above; `history` is ignored for brevity.
    msgs = [{"role": "user", "content": message}]
    inputs = tokenizer.apply_chat_template(
        msgs, add_generation_prompt=True, return_dict=True, return_tensors="pt"
    ).to(model.device)
    out = model.generate(**inputs, max_new_tokens=256)
    return tokenizer.decode(out[0][inputs["input_ids"].shape[-1]:], skip_special_tokens=True)

gr.ChatInterface(fn=chat_with_model, title="UAE Rulebook QA Assistant").launch()

🔧 Technical Details

  • Base Model: LFM2-1.2B
  • Fine-tuning: LoRA adapters
  • Precision: bfloat16
  • Training Stack: Hugging Face transformers + accelerate

🛣️ Roadmap

  • Retrieval-Augmented Generation (RAG) integration
  • Arabic language support
  • Enhanced hallucination reduction and safety controls
  • Productization for compliance-critical environments

✍️ Author: @rajeshthangaraj1
📅 Last Updated: 2026
