# Factuality Detection
Model Summary: Factuality Detection is a LoRA adapter (always active) for ibm-granite/granite-4.0-micro, specifically designed to assess factual correctness by explicitly taking into account contextual passages that may contain contradicting or conflicting information. Rather than assuming contextual consistency, the adapter evaluates LLM-generated responses against one or more context sources and identifies cases where the response conflicts with, misrepresents, or selectively ignores evidence present in those contexts. The adapter can detect factual inaccuracies in long-form responses composed of multiple atomic units, such as individual facts or claims, while preserving the full generative and reasoning capabilities of the base model. This makes it particularly well suited for scenarios where accurate interpretation of conflicting context is essential for reliable factuality judgments.
- Developer: IBM Research
- HF Collection: Granite Libraries
- Github Repository: https://www.ibm.com/granite/docs/
- Release Date: March 18th, 2026
- Model type: LoRA adapter for ibm-granite/granite-4.0-micro
- License: Apache 2.0
- Paper: The Factuality Detection adapter is fine-tuned to detect factually incorrect responses with respect to conflicting or contradicting contextual information, based on the factuality assessment method described in [Marinescu et al. EMNLP 2025] FactReasoner: A Probabilistic Approach for Long Form Factuality Assessment of Large Language Models.
## Usage
Intended use: Factuality Detection is a LoRA adapter for IBM’s Granite-4.0-micro model. It enables the granite-4.0-micro model to assess whether a generated long-form response—containing multiple atomic facts or claims—is factually correct with respect to contextual information that may be incomplete, conflicting, or contradictory. This adapter is designed to operate as an integrated component of the Granite inference pipeline. The model is specifically designed to detect factually incorrect responses according to the following definition:
- A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.
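As a concrete illustration of this definition, consider a minimal, hypothetical evaluation instance (the query, context, and response below are our own illustrative examples, not drawn from the training data). The response contradicts the context on one claim, so under the definition the whole message counts as factually incorrect:

```python
# Hypothetical instance illustrating the factuality definition above.
# Even though only part of the response is wrong (the completion year),
# the overall message is considered factually incorrect ("Yes").
example = {
    "query": "When was the Eiffel Tower completed?",
    "context": "The Eiffel Tower was completed in March 1889 as the "
               "entrance arch to the 1889 World's Fair in Paris.",
    "response": "The Eiffel Tower was completed in 1900 for the World's Fair.",
    "expected_score": "Yes",  # "Yes" = factually incorrect per the scoring schema
}
print(example["expected_score"])  # Yes
```

Note that "Yes" here means "the response meets the criteria for being factually incorrect", matching the adapter's output format described below.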
### Quickstart Example (LoRA)
```python
import json

from huggingface_hub import snapshot_download
from transformers import AutoTokenizer
from vllm import LLM, SamplingParams
from vllm.lora.request import LoRARequest


def parse_output(output) -> str:
    """Extract the 'score' field from the model's JSON output."""
    label = None
    output_text = next(iter(output.outputs)).text.strip()
    # Output must be in JSON format
    try:
        data = json.loads(output_text)
        label = data.get("score")
    except json.JSONDecodeError as e:
        print(f"Cannot parse JSON output: {e}")
        label = "FAIL"
    return label


def make_prompt(query: str, response: str, contexts: list, tokenizer):
    factuality_detection_text = """
<guardian>As a judge agent, your role is to help assess whether the provided text meets the given judging criteria, utilizing all available information, including conversations, documents, and tools.
### Criteria: A factually incorrect response occurs when the assistant's message contains one or more factual claims that are unsupported by, inconsistent with, or directly contradicted by the information provided in the documents or context. This includes situations where the assistant: introduces details not grounded in the context, misstates or distorts facts contained within the context, misinterprets the meaning or implications of the context, supplies erroneous or conflicting information relative to the context. Even if only a small portion of the response contains such inaccuracies, the overall message is considered factually incorrect.
### Scoring Schema: If the last assistant's text meets the criteria, return 'yes'; otherwise, return 'no'.
"""
    messages = [
        {"role": "user", "content": query},
        {"role": "assistant", "content": response},
        # Add the guardian judging instruction as a final user turn
        {"role": "user", "content": factuality_detection_text},
    ]
    # Apply the Granite 4.0-micro chat template, passing the contexts
    # as documents so they are rendered into the prompt
    formatted_text = tokenizer.apply_chat_template(
        messages,
        tokenize=False,
        add_generation_prompt=True,
        documents=[{"doc_id": "0", "text": "\n\n".join(contexts)}],
    )
    return formatted_text


# Load the model
BASE_PATH = "ibm-granite/granite-4.0-micro"
adapter_repo = "ibm-granite/granitelib-guardian-r1.0"
adapter_subfolder = "factuality-detection/granite-4.0-micro/lora"

# Download the adapter to the local cache and get its path
local_repo = snapshot_download(adapter_repo, allow_patterns=f"{adapter_subfolder}/*")
LORA_PATH = f"{local_repo}/{adapter_subfolder}"

sampling_params = SamplingParams(max_tokens=30, temperature=0.0, seed=42)
lora_request = LoRARequest("adapter1", 1, LORA_PATH)
model = LLM(
    model=BASE_PATH,
    tensor_parallel_size=1,
    gpu_memory_utilization=0.95,
    dtype="bfloat16",
    enable_lora=True,
    max_lora_rank=128,
)

# Prepare the prompt
question = "Is Ozzy Osbourne still alive?"
response = "Yes, Ozzy Osbourne is alive in 2025 and preparing for another world tour, continuing to amaze fans with his energy and resilience."
contexts = ["Ozzy Osbourne passed away on July 22, 2025, at the age of 76 from a heart attack. He died at his home in Buckinghamshire, England, with contributing conditions including coronary artery disease and Parkinson's disease. His final performance took place earlier that month in Birmingham."]
tokenizer = AutoTokenizer.from_pretrained(BASE_PATH)
prompts = [make_prompt(question, response, contexts, tokenizer)]

# Generate the output
output = model.generate(prompts, sampling_params, lora_request=lora_request)

# Display the output
label = parse_output(output[0])
print(f"# Factually incorrect? : {label}")
```
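The JSON-parsing step can be sanity-checked offline, without loading the model. The sketch below inlines the same parsing logic as the quickstart's `parse_output` and feeds it lightweight stand-in objects that mimic the shape of a vLLM request output (the `SimpleNamespace` stand-ins are ours for illustration, not part of the vLLM API):

```python
import json
from types import SimpleNamespace


def parse_output(output) -> str:
    # Same logic as the quickstart: the adapter is expected to emit JSON
    # of the form {"score": "Yes"} or {"score": "No"}.
    output_text = next(iter(output.outputs)).text.strip()
    try:
        return json.loads(output_text).get("score")
    except json.JSONDecodeError as e:
        print(f"Cannot parse JSON output: {e}")
        return "FAIL"


# Stand-ins mimicking a vLLM RequestOutput (illustrative only)
good = SimpleNamespace(outputs=[SimpleNamespace(text='{"score": "Yes"}')])
bad = SimpleNamespace(outputs=[SimpleNamespace(text="not-json")])

print(parse_output(good))  # Yes
print(parse_output(bad))   # FAIL
```

In production you may want to treat the `"FAIL"` sentinel (malformed model output) differently from a genuine `"No"` verdict, for example by retrying the request.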
## Training Details
Factuality Detection is a LoRA adapter fine-tuned to detect factually incorrect responses with respect to conflicting or contradicting contextual information, based on the factuality assessment method described in [Marinescu et al. EMNLP 2025] FactReasoner: A Probabilistic Approach for Long Form Factuality Assessment of Large Language Models.
### Training Data
The adapter was trained on synthetic data generated from the ELI5-Category dataset, which augments long-form explanatory question–answer threads scraped from the r/explainlikeimfive Reddit forum with explicit topical annotations. This resource contains questions in which users request intuitive explanations of complex topics, each assigned by community moderators to one of 12 high-level categories (11 topical domains plus a *Repost* category) and paired with multiple candidate answers and their corresponding upvote scores.
For each question, we deterministically select the answer with the highest number of upvotes as the canonical response. While these answers are generally high quality and well articulated, they are not guaranteed to be factually correct and may include inaccuracies, outdated claims, or speculative statements. To further diversify the dataset, we intentionally introduce factually incorrect responses to user questions. These synthetic responses are generated by prompting a reasonably strong LLM, such as the Mixtral-8x22B-Instruct-v0.1 model. Synthetic and human-authored responses are kept at a 50/50 ratio, ensuring a balanced mix of realistic and adversarial content. The final dataset comprises 17,522 instances, uniformly distributed across the 12 categories. We further partition the dataset into training (14,017 samples), validation (1,752 samples), and test (1,753 samples) splits.
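The split sizes above are consistent with a roughly 80/10/10 partition of the 17,522 instances. A minimal sketch, assuming a floor-based 80/10/10 split with the remainder going to the test set (this is our reconstruction of how the counts line up, not the authors' exact procedure):

```python
n_total = 17_522

# Assumed 80/10/10 partition; the remainder after flooring goes to the test split
n_train = int(n_total * 0.8)        # floor(14017.6) -> 14,017
n_val = int(n_total * 0.1)          # floor(1752.2)  -> 1,752
n_test = n_total - n_train - n_val  # 1,753

print(n_train, n_val, n_test)  # 14017 1752 1753
```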
## Evaluation
The adapter was evaluated on the test split of the ELI5 dataset, which consists of 1,753 samples that were not used during training. In addition, we evaluated performance on an out-of-distribution dataset, Biographies (BIO) (see FactReasoner for details). For each response in both datasets, the ground-truth label, Yes (factually incorrect) or No (factually correct), was assigned using the FactReasoner factuality assessment pipeline: a response is labeled Yes if its factuality score is below 0.8, and No otherwise. The reported AUC, F1, Precision, Recall, and Accuracy metrics were computed with respect to these ground-truth labels.
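The ground-truth labeling rule can be written as a one-line threshold function (a sketch of the rule described above; the 0.8 cutoff comes from the text, and the function name is ours):

```python
def ground_truth_label(factuality_score: float, threshold: float = 0.8) -> str:
    # A response is labeled "Yes" (factually incorrect) when its FactReasoner
    # factuality score falls below the threshold, and "No" otherwise.
    return "Yes" if factuality_score < threshold else "No"


print(ground_truth_label(0.55))  # Yes
print(ground_truth_label(0.93))  # No
print(ground_truth_label(0.80))  # No (the boundary itself is not below 0.8)
```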
| Dataset | AUC | F1 | Precision | Recall | Accuracy |
|---|---|---|---|---|---|
| ELI5 (test) | 0.77 | 0.80 | 0.81 | 0.79 | 0.78 |
| BIO (OOD) | 0.77 | 0.79 | 0.86 | 0.73 | 0.77 |
### Guardian Results (LoRA)
| Dataset | AUC | F1 | Precision | Recall | Accuracy |
|---|---|---|---|---|---|
| ELI5 (test) | 0.71 | 0.72 | 0.78 | 0.67 | 0.70 |
| BIO (OOD) | 0.61 | 0.76 | 0.67 | 0.87 | 0.66 |
## Adapter Configuration
| Parameter | LoRA |
|---|---|
| Base model | ibm-granite/granite-4.0-micro |
| LoRA rank (r) | 64 |
| LoRA alpha | 128 |
| Target modules | all linear |
| Output format | {"score": "X"} where X is Yes or No |
| Max completion tokens | 30 |
| KV cache | Supported |
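The rank and alpha values in the table determine both the adapter's parameter count and its update scaling. As a rough sketch (the layer dimensions below are illustrative placeholders, not the actual granite-4.0-micro shapes), LoRA with rank r adds r·(d_in + d_out) trainable parameters per adapted linear layer, and alpha/r gives the effective scaling of the low-rank update:

```python
def lora_params_per_layer(d_in: int, d_out: int, r: int = 64) -> int:
    # LoRA factorizes the weight update as B @ A with A: (r, d_in) and
    # B: (d_out, r), so each adapted linear layer adds r * (d_in + d_out)
    # trainable parameters.
    return r * (d_in + d_out)


# Hypothetical 2048x2048 projection: rank 64 adds 262,144 parameters
print(lora_params_per_layer(2048, 2048, r=64))  # 262144

# With alpha = 128 and r = 64, the low-rank update is scaled by alpha / r = 2.0
print(128 / 64)  # 2.0
```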
## Citation
If you find this detector useful, please cite the following work.
```bibtex
@inproceedings{marinescu2025factreasoner,
  title={FactReasoner: A Probabilistic Approach to Long-Form Factuality Assessment for Large Language Models},
  author={Marinescu, Radu and Bhattacharjya, Debarun and Lee, Junkyu and Tchrakian, Tigran and Cano, Javier Carnerero and Hou, Yufang and Daly, Elizabeth and Pascale, Alessandra},
  booktitle={Proceedings of the 2025 Conference on Empirical Methods in Natural Language Processing (EMNLP)},
  year={2025}
}
```
Infrastructure: We trained the Granite Micro 4.0 Factuality Detection adapter on IBM's Vela cluster using 4 A100 GPUs.
Ethical Considerations: The Granite Micro 4.0 Factuality Detection adapter is primarily fine-tuned on English-only input–output pairs. Although the underlying base model supports multilingual dialogue, the adapter’s performance on non-English tasks may differ from its performance on English. In addition, while the base model has been aligned with safety considerations in mind, the adapter may, in some cases, produce inaccurate, biased, or otherwise unsafe outputs in response to user prompts. It is also important to note that there is no built-in safeguard guaranteeing that the detection output is always correct. As with other generative models, safety assurance relies on offline evaluation procedures (see Evaluation), and while we expect the generated outputs to meet safety standards, this cannot be guaranteed. Finally, this adapter is specifically optimized for the factuality definition described above, and its behavior outside that scope may be limited.
## Resources
- ⭐️ Learn about the latest updates with Granite: https://www.ibm.com/granite
- 📄 Get started with tutorials, best practices, and prompt engineering advice: https://www.ibm.com/granite/docs/
- 💡 Learn about the latest Granite learning resources: https://github.com/ibm-granite/granite-guardian/tree/main/cookbooks