Commit a59ae11 (verified) by zy113 · 1 parent: ed9b18e

Upload README.md with huggingface_hub

  1. README.md +104 -0
README.md ADDED
@@ -0,0 +1,104 @@
---
license: cc-by-nc-sa-4.0
---

# ChronoQA

ChronoQA is a **passage-grounded** benchmark that tests whether retrieval-augmented generation (RAG) systems can keep **temporal** and **causal** facts straight when reading long-form narratives (novels, scripts, etc.).
Instead of giving the entire book to the model, ChronoQA forces a RAG pipeline to *retrieve the right snippets* and reason about evolving characters and event sequences.

| | |
|-------------------------------|------------------------------------|
| **Instances** | 1,028 question–answer pairs |
| **Narratives** | 18 stories (mostly public domain) |
| **Reasoning facets** | 8 (causal, character, setting, …) |
| **Evidence** | Exact byte offsets for each answer |
| **Language** | English |
| **Intended use** | Evaluate or train RAG systems that must track chronology and causality |
| **License (annotations)** | CC-BY-NC-SA-4.0 |

---

## Dataset Description

### Motivation
Standard RAG pipelines often lose chronological order and collapse every mention of an entity into a single node. ChronoQA highlights the failures that follow. Example:

*"Who was jinxing Harry's broom during his **first** Quidditch match?"* – a system that retrieves only the early chapters, where suspicion falls on Snape, may wrongly answer *Snape* instead of *Quirrell*.

### Source Stories
The texts come from Project Gutenberg (public domain in the US), with the exception of the two *Harry Potter* titles (IDs 3 and 4), which remain under copyright.

| ID | Title | Questions |
|----|-------|-----------|
| 1 | *A Study in Scarlet* | 67 |
| 2 | *The Hound of the Baskervilles* | 55 |
| 3 | *Harry Potter and the Chamber of Secrets* | 30 |
| 4 | *Harry Potter and the Sorcerer's Stone* | 25 |
| 5 | *Les Misérables* | 72 |
| 6 | *The Phantom of the Opera* | 70 |
| 7 | *The Sign of the Four* | 62 |
| 8 | *The Wonderful Wizard of Oz* | 82 |
| 9 | *The Adventures of Sherlock Holmes* | 34 |
| 10 | *Lady Susan* | 88 |
| 11 | *Dangerous Connections* | 111 |
| 12 | *The Picture of Dorian Gray* | 27 |
| 13 | *The Diary of a Nobody* | 39 |
| 14 | *The Sorrows of Young Werther* | 58 |
| 15 | *The Mysterious Affair at Styles* | 69 |
| 16 | *Pride and Prejudice* | 54 |
| 17 | *The Secret Garden* | 61 |
| 18 | *Anne of Green Gables* | 24 |

### Reasoning Facets
1. **Causal Consistency**
2. **Character & Behavioural Consistency**
3. **Setting, Environment & Atmosphere**
4. **Symbolism, Imagery & Motifs**
5. **Thematic, Philosophical & Moral**
6. **Narrative & Plot Structure**
7. **Social, Cultural & Political**
8. **Emotional & Psychological**

---

## Dataset Structure

| Field | Type | Description |
|-------|------|-------------|
| `story_id` | `string` | ID of the narrative |
| `question_id` | `int32` | QA index within that story |
| `category` | `string` | One of the 8 reasoning facets |
| `query` | `string` | Natural-language question |
| `ground_truth` | `string` | Gold answer |
| `passages` | **`sequence` of objects** | Each object contains: <br> • `start_sentence` `string` <br> • `end_sentence` `string` <br> • `start_byte` `int32` <br> • `end_byte` `int32` <br> • `excerpt` `string` |
| `story_title`\* | `string` | Human-readable title (optional, present in processed splits) |

\*The raw JSONL released with the paper does **not** include `story_title`; it is added automatically in the hosted HF dataset for convenience.

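The `start_byte` and `end_byte` fields can be used to check an `excerpt` against the story text. The sketch below is illustrative only: it assumes the offsets index into the UTF-8 encoding of the full story text and that `end_byte` is exclusive, neither of which is stated in this card. The helper names `slice_evidence` and `excerpt_matches` are hypothetical, and the sample story is toy data.

```python
def slice_evidence(story_text: str, start_byte: int, end_byte: int) -> str:
    """Return the span of story_text covered by bytes [start_byte, end_byte).

    Assumption: offsets refer to the UTF-8 encoding of the story text.
    """
    raw = story_text.encode("utf-8")
    if not 0 <= start_byte <= end_byte <= len(raw):
        raise ValueError("byte offsets fall outside the story text")
    # errors="replace" avoids an exception if an offset splits a multi-byte
    # character; such a cut would show up as U+FFFD in the result.
    return raw[start_byte:end_byte].decode("utf-8", errors="replace")


def excerpt_matches(story_text: str, passage: dict) -> bool:
    """Check that a passage's stored excerpt agrees with its byte offsets."""
    span = slice_evidence(story_text, passage["start_byte"], passage["end_byte"])
    return span.strip() == passage["excerpt"].strip()


# Toy data, not taken from the dataset. "é" occupies two bytes in UTF-8,
# so byte offsets and character offsets differ after it.
story = "Café scene. Holmes arrived late."
passage = {"start_byte": 13, "end_byte": 33, "excerpt": "Holmes arrived late."}
print(excerpt_matches(story, passage))
```

Slicing the encoded bytes rather than the Python string matters whenever the text contains non-ASCII characters (for example in *Les Misérables*), because character indices and byte indices then diverge.
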
There is a single **all** split (1,028 rows). Create your own train/validation/test splits if needed (e.g. by story or by reasoning facet).

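One way to build a split by story is sketched below. It works on plain dictionaries that mimic the fields listed above, so it runs without downloading anything; the function name `split_by_story` and the toy rows are hypothetical and not part of the dataset or its tooling.

```python
from collections import defaultdict


def split_by_story(rows, held_out_story_ids):
    """Partition rows so that no story contributes to both sides."""
    # story_id is documented as a string, so compare as strings.
    held_out = {str(s) for s in held_out_story_ids}
    splits = defaultdict(list)
    for row in rows:
        name = "test" if str(row["story_id"]) in held_out else "train"
        splits[name].append(row)
    return splits["train"], splits["test"]


# Toy rows with made-up values; real rows carry the full set of fields.
rows = [
    {"story_id": "1", "question_id": 0, "category": "Causal Consistency"},
    {"story_id": "1", "question_id": 1, "category": "Narrative & Plot Structure"},
    {"story_id": "2", "question_id": 0, "category": "Causal Consistency"},
    {"story_id": "3", "question_id": 0, "category": "Emotional & Psychological"},
]
train, test = split_by_story(rows, held_out_story_ids={"2"})
print(len(train), len(test))  # 3 1
```

Holding out whole stories keeps every passage of a test narrative unseen during training, which is a stricter setting than a random row-level split. The same partition can be keyed on `category` instead to split by reasoning facet.
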
---

## Usage Example

```python
from datasets import load_dataset

# "your-org/chronoqa" is a placeholder; substitute the actual Hub repository ID.
ds = load_dataset("your-org/chronoqa", split="all")
example = ds[0]

print("Question:", example["query"])
print("Answer  :", example["ground_truth"])
# Show the first 300 characters of the first evidence passage.
print("Evidence:", example["passages"][0]["excerpt"][:300], "…")
```

## Citation Information
```
@article{zhang2025respecting,
  title={Respecting Temporal-Causal Consistency: Entity-Event Knowledge Graphs for Retrieval-Augmented Generation},
  author={Zhang, Ze Yu and Li, Zitao and Li, Yaliang and Ding, Bolin and Low, Bryan Kian Hsiang},
  journal={arXiv preprint arXiv:2506.05939},
  year={2025}
}
```