Between Help and Harm: An Evaluation of Mental Health Crisis Handling by LLMs
Paper: arXiv 2509.24857
Datasets for the JMIR Mental Health paper on LLM crisis handling: benchmark labels, model responses, and evaluations.
Note: This is the arXiv version of the paper; the final DOI is https://doi.org/10.2196/88435.
Note: Example-level crisis benchmark with validation and test conversations, labeled for crisis by both humans and LLMs.
Note: Response-level data with model answers, human appropriateness scores, and LLM evaluator judgments.