# ScholarMind Query Decomposition Design
> Fixes the architecture's inability to split composite questions by giving the Agent adaptive query decomposition
---
## Problem Analysis
**Flaw in the original architecture**: the routing Agent only performs single-intent classification (factual/reasoning/global), so it fails on composite questions:
```
❌ Original architecture:
"Compare BERT and GPT-2 on GLUE, explain their attention mechanisms, and who proposed each?"
→ Router classifies it as "reasoning" (guesses a single type)
→ A single retrieval pass cannot cover all sub-questions
→ The answer is incomplete or skewed toward one sub-question
```
```
✅ After the improvement:
Same question → decomposed into 5-6 independent sub-questions → parallel retrieval → result aggregation → complete answer
```
---
## Improved Agent Architecture
```
┌──────────────────────┐
│      User query      │
└──────────┬───────────┘
           │
┌──────────▼───────────┐
│   Complexity gate    │
│  (simple/composite)  │
└──────────┬───────────┘
           │
    ┌──────┴──────────────────────────┐
    │ simple                          │ composite
    ▼                                 ▼
┌─────────────────────┐   ┌─────────────────────────────┐
│ Existing router →   │   │ Query decomposer (LLM)      │
│ retrieve → generate │   │ RT-RAG 3-type split:        │
│ → self-check        │   │ PARALLEL / SEQUENTIAL       │
│ (single-hop query)  │   │ / DIRECT                    │
└─────────────────────┘   └──────────────┬──────────────┘
                                         │
                          ┌──────────────▼──────────────┐
                          │ Dependency graph build      │
                          │ Parallel set: {Q1,Q2,Q5,Q6} │
                          │ Sequential chain: Q3→Q4     │
                          └──────────────┬──────────────┘
                                         │
        ┌────────────────────────────────┼────────────────────────────────┐
        │ Parallel (Send API)            │ Sequential (chained)           │
        ▼                                ▼                                │
┌─────────────────────┐       ┌─────────────────────┐                     │
│ Q1: BERT on GLUE?   │       │ Q3: attention?      │                     │
│ → factual retrieval │       │ → needs Q1/Q2       │                     │
├─────────────────────┤       │ → reasoning         │                     │
│ Q2: GPT-2 on GLUE?  │       └──────────┬──────────┘                     │
│ → factual retrieval │                  │                                │
├─────────────────────┤                  ▼                                │
│ Q5: BERT authors?   │       ┌─────────────────────┐                     │
│ → factual retrieval │       │ Q4: comparison      │                     │
├─────────────────────┤       │ → uses Q3's answer  │                     │
│ Q6: GPT-2 authors?  │       └──────────┬──────────┘                     │
│ → factual retrieval │                  │                                │
└──────────┬──────────┘                  │                                │
           │                             │                                │
           └─────────────────────────────┴────────────────────────────────┘
                                         │
                         ┌───────────────▼────────────────┐
                         │   Result aggregation + rerank  │
                         │                                │
                         │ 1. Merge all retrieved docs    │
                         │ 2. Cross-encoder rerank        │
                         │    (scored against the full    │
                         │     original query)            │
                         │ 3. LLM synthesizes the final   │
                         │    answer, citing sub-answers  │
                         └───────────────┬────────────────┘
                                         │
                         ┌───────────────▼────────────────┐
                         │ Answer validation (self-check) │
                         │ - All sub-questions answered?  │
                         │ - Evidence for each part?      │
                         │ - Logical consistency check    │
                         └────────────────────────────────┘
```
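The "dependency graph build" step in the diagram can be expressed as topological layering over the `depends_on` lists produced by the decomposer. A minimal sketch (`plan_rounds` is an illustrative helper, not part of the codebase):

```python
def plan_rounds(sub_questions: list[dict]) -> list[list[int]]:
    """Topologically layer sub-questions into execution rounds: each round
    holds the IDs whose dependencies were all answered in earlier rounds
    (round 0 is the fully parallel set)."""
    answered: set[int] = set()
    remaining = {sq["id"]: set(sq["depends_on"]) for sq in sub_questions}
    rounds: list[list[int]] = []
    while remaining:
        ready = sorted(qid for qid, deps in remaining.items() if deps <= answered)
        if not ready:  # cyclic dependency — fail loudly rather than loop forever
            raise ValueError(f"unresolvable dependencies: {remaining}")
        rounds.append(ready)
        answered.update(ready)
        for qid in ready:
            del remaining[qid]
    return rounds

# The example from the diagram: Q1/Q2/Q5/Q6 parallel, then Q3, then Q4
qs = [
    {"id": 1, "depends_on": []}, {"id": 2, "depends_on": []},
    {"id": 5, "depends_on": []}, {"id": 6, "depends_on": []},
    {"id": 3, "depends_on": [1, 2]}, {"id": 4, "depends_on": [3]},
]
print(plan_rounds(qs))  # → [[1, 2, 5, 6], [3], [4]]
```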
---
## Core Implementation
### 1. Complexity Gate
```python
from pydantic import BaseModel, Field
from typing import Literal


class QueryComplexity(BaseModel):
    """Query-complexity analysis result."""
    complexity: Literal["simple", "composite"] = Field(
        description="simple = one focused question; composite = multiple sub-questions"
    )
    reasoning: str = Field(description="Rationale for the judgment")


COMPLEXITY_PROMPT = """Determine if this academic query is SIMPLE (one focused question)
or COMPOSITE (contains multiple distinct sub-questions that need separate answers).

Signals for COMPOSITE:
- Contains "and also" or Chinese equivalents ("以及", "另外", "同时")
- Asks about multiple different entities/aspects
- Contains multiple question marks
- Mixes different question types (who + what + compare)

Query: {query}
"""


async def assess_complexity(query: str) -> str:
    """Fast check for whether decomposition is needed — a small model suffices."""
    result = await llm.complete(
        COMPLEXITY_PROMPT.format(query=query),
        task="routing",  # local small model, <100 ms latency
        response_format=QueryComplexity,
    )
    return result.complexity
```
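The COMPOSITE signals listed in the prompt can also be checked cheaply in plain Python to short-circuit obvious cases before spending an LLM call. A hedged sketch (the marker list, threshold, and `looks_composite` name are illustrative, not from the codebase):

```python
import re

# Illustrative marker list; extend for the target user base
COMPOSITE_MARKERS = ("and also", "as well as", "以及", "另外", "同时")

def looks_composite(query: str) -> bool:
    """Cheap pre-filter mirroring the prompt's COMPOSITE signals:
    conjunction markers, or two or more question marks."""
    q = query.lower()
    if any(marker in q for marker in COMPOSITE_MARKERS):
        return True
    # Count both ASCII and full-width question marks
    return len(re.findall(r"[?？]", query)) >= 2

print(looks_composite("What is BERT?"))                     # → False
print(looks_composite("Who made BERT? And what is GLUE?"))  # → True
```

A `True` here could skip the routing call entirely; a `False` still falls through to `assess_complexity`, since the heuristic has no recall guarantee.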
### 2. Query Decomposer (RT-RAG Style)
```python
from pydantic import BaseModel, Field
from typing import Literal


class SubQuestion(BaseModel):
    """A decomposed sub-question."""
    id: int
    question: str
    type: Literal["factual", "reasoning", "global"] = Field(
        description="Sub-question type; determines the retrieval strategy"
    )
    depends_on: list[int] = Field(
        default_factory=list,
        description="IDs of sub-questions this one depends on; empty = parallelizable"
    )


class DecomposedQuery(BaseModel):
    """Decomposition result."""
    original_query: str
    core_intent: str = Field(description="What the user ultimately wants to know")
    known_entities: list[str] = Field(description="Entities explicitly mentioned")
    unknown_entities: list[str] = Field(description="Entities that must be retrieved first")
    sub_questions: list[SubQuestion]
    execution_plan: Literal["all_parallel", "all_sequential", "mixed"]


DECOMPOSITION_PROMPT = """You are an expert at decomposing complex academic research questions.

Analyze the query and produce a structured decomposition:
1. CORE INTENT: What does the user ultimately want to know?
2. KNOWN ENTITIES: Explicitly mentioned (papers, methods, authors, datasets)
3. UNKNOWN ENTITIES: Things that must be looked up first
4. SUB-QUESTIONS: Break into answerable sub-questions (max 5)
   - Each has a TYPE: factual (specific fact), reasoning (needs multi-hop), global (broad overview)
   - Each has DEPENDENCIES: list of sub-question IDs it needs answers from (empty = parallel)

Rules:
- ALWAYS keep the original question recoverable from sub-questions
- Each sub-question should be self-contained (answerable independently or with deps)
- Use #N notation for sequential dependencies: "Given that #1 found X, what is..."
- Maximum 5 sub-questions (more → noise > signal)

Example:
Query: "Compare BERT and GPT-2's performance on GLUE, and explain what attention mechanism they use"
Output:
{{
  "original_query": "Compare BERT and GPT-2's performance on GLUE, and explain what attention mechanism they use",
  "core_intent": "Understand BERT vs GPT-2 in terms of both performance and architecture",
  "known_entities": ["BERT", "GPT-2", "GLUE"],
  "unknown_entities": [],
  "sub_questions": [
    {{"id": 1, "question": "What is BERT's performance on the GLUE benchmark?", "type": "factual", "depends_on": []}},
    {{"id": 2, "question": "What is GPT-2's performance on the GLUE benchmark?", "type": "factual", "depends_on": []}},
    {{"id": 3, "question": "How does the attention mechanism in BERT work?", "type": "reasoning", "depends_on": []}},
    {{"id": 4, "question": "How does the attention mechanism in GPT-2 work?", "type": "reasoning", "depends_on": []}},
    {{"id": 5, "question": "Compare BERT and GPT-2's GLUE results and attention designs", "type": "reasoning", "depends_on": [1, 2, 3, 4]}}
  ],
  "execution_plan": "mixed"
}}

Now decompose:
Query: {query}
"""


async def decompose_query(query: str) -> DecomposedQuery:
    """Decompose a composite query with the LLM."""
    result = await llm.complete(
        DECOMPOSITION_PROMPT.format(query=query),
        task="extraction",  # local 14B model or GPT-4o-mini
        response_format=DecomposedQuery,
    )
    return result
```
### 3. Parallel Execution with LangGraph (Send API)
```python
import operator
from typing import Annotated, TypedDict

from langgraph.graph import StateGraph, START, END
from langgraph.types import Send


# ===== State definitions =====
class DecompState(TypedDict):
    """Overall state of the decomposition agent."""
    original_query: str
    decomposition: DecomposedQuery
    sub_results: Annotated[list[dict], operator.add]  # parallel results accumulate here
    merged_docs: list
    final_answer: str
    citations: list
    confidence: float


class SubQueryWorkerState(TypedDict):
    """State for each sub-question worker."""
    original_query: str
    sub_question: SubQuestion
    prior_answers: dict  # answers this question depends on, {id: answer}
    sub_result: dict


# ===== Node definitions =====
async def decompose_node(state: DecompState) -> dict:
    """Decompose the composite query."""
    decomposition = await decompose_query(state["original_query"])
    return {"decomposition": decomposition}


def fan_out_parallel(state: DecompState) -> list[Send]:
    """Fan-out: dispatch dependency-free sub-questions in parallel."""
    parallel_questions = [
        sq for sq in state["decomposition"].sub_questions
        if not sq.depends_on
    ]
    return [
        Send("sub_query_worker", {
            "original_query": state["original_query"],
            "sub_question": sq,
            "prior_answers": {},
        })
        for sq in parallel_questions
    ]


async def sub_query_worker(state: SubQueryWorkerState) -> dict:
    """Handle a single sub-question — reuses the existing retrieval pipeline."""
    sq = state["sub_question"]
    # If there are upstream dependencies, inject their answers into the query
    query = sq.question
    if state["prior_answers"]:
        context = "\n".join(
            f"Known: {v}" for v in state["prior_answers"].values()
        )
        query = f"Given: {context}\n\nQuestion: {sq.question}"
    # Reuse the existing hybrid retriever (routed by sub-question type)
    retrieved = await hybrid_retriever.retrieve(query, mode=sq.type)
    # Sub-question-level answer generation
    answer = await generate_sub_answer(query, retrieved)
    return {"sub_results": [{
        "id": sq.id,
        "question": sq.question,
        "answer": answer,
        "docs": retrieved,
        "type": sq.type,
    }]}


async def handle_sequential(state: DecompState) -> dict:
    """Run dependency-bearing sub-questions in topological order."""
    decomp = state["decomposition"]
    prior_answers = {r["id"]: r["answer"] for r in state["sub_results"]}
    results = []
    progress = True
    while progress:  # repeat so chains like Q3→Q4 resolve across rounds
        progress = False
        # Sub-questions whose dependencies are satisfied and that are not yet answered
        ready = [
            sq for sq in decomp.sub_questions
            if sq.depends_on
            and all(d in prior_answers for d in sq.depends_on)
            and sq.id not in prior_answers
        ]
        for sq in ready:
            deps = {d: prior_answers[d] for d in sq.depends_on}
            result = await sub_query_worker({
                "original_query": state["original_query"],
                "sub_question": sq,
                "prior_answers": deps,
            })
            results.extend(result["sub_results"])
            prior_answers[sq.id] = result["sub_results"][0]["answer"]
            progress = True
    return {"sub_results": results}


async def aggregate_node(state: DecompState) -> dict:
    """Aggregate all sub-results and generate the final answer."""
    # 1. Merge all retrieved documents
    all_docs = []
    for r in state["sub_results"]:
        all_docs.extend(r.get("docs", []))
    # 2. Rerank against the original full query (crucial — prevents sub-question drift)
    merged_docs = await reranker.rerank(
        query=state["original_query"],  # rerank with the ORIGINAL query!
        documents=deduplicate(all_docs),
        top_k=10,
    )
    # 3. LLM synthesizes all sub-answers into the final answer
    sub_answers_text = "\n".join(
        f"Sub-Q{r['id']}: {r['question']}\nAnswer: {r['answer']}"
        for r in sorted(state["sub_results"], key=lambda x: x["id"])
    )
    SYNTHESIS_PROMPT = f"""Based on the following sub-question answers and source documents,
provide a comprehensive answer to the original question.

Original Question: {state['original_query']}

Sub-question Answers:
{sub_answers_text}

Supporting Documents:
{format_docs(merged_docs[:5])}

Requirements:
- Address ALL parts of the original question
- Cite specific papers [Author, Year]
- If sub-answers conflict, note the disagreement
- Synthesize, don't just concatenate
"""
    final_answer = await llm.complete(SYNTHESIS_PROMPT, task="generation")
    citations = extract_citations(final_answer, merged_docs)
    return {
        "merged_docs": merged_docs,
        "final_answer": final_answer,
        "citations": citations,
    }


class CompletenessResult(BaseModel):
    """Structured self-check verdict."""
    complete: bool
    confidence: float
    missing: list[str] = Field(default_factory=list)


async def completeness_check(state: DecompState) -> dict:
    """Check whether every sub-question was answered."""
    expected_ids = {sq.id for sq in state["decomposition"].sub_questions}
    answered_ids = {r["id"] for r in state["sub_results"]}
    if expected_ids != answered_ids:
        return {"confidence": 0.0}  # structural gap: some sub-questions unanswered
    # LLM verification of answer completeness
    CHECK_PROMPT = f"""
Original question: {state['original_query']}
Answer: {state['final_answer']}

Check:
1. Does the answer address ALL parts of the question? (yes/no)
2. Is each claim supported by evidence? (yes/no)
3. Confidence score (0-1)?
"""
    check = await llm.complete(
        CHECK_PROMPT, task="routing", response_format=CompletenessResult
    )
    return {"confidence": check.confidence}


# ===== Graph assembly =====
def build_decomposition_graph():
    graph = StateGraph(DecompState)
    graph.add_node("decompose", decompose_node)
    graph.add_node("sub_query_worker", sub_query_worker)
    graph.add_node("handle_sequential", handle_sequential)
    graph.add_node("aggregate", aggregate_node)
    graph.add_node("completeness_check", completeness_check)
    graph.add_edge(START, "decompose")
    # After decomposition: fan out the dependency-free sub-questions
    graph.add_conditional_edges("decompose", fan_out_parallel, ["sub_query_worker"])
    # After the parallel wave: handle sequential dependencies
    graph.add_edge("sub_query_worker", "handle_sequential")
    # After the sequential wave: aggregate
    graph.add_edge("handle_sequential", "aggregate")
    # After aggregation: completeness check
    graph.add_edge("aggregate", "completeness_check")
    # Check passed → finish; failed → retry the missing pieces
    graph.add_conditional_edges(
        "completeness_check",
        lambda s: END if s["confidence"] > 0.8 else "handle_sequential",
    )
    return graph.compile()
```
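`aggregate_node` above calls a `deduplicate` helper that the snippet leaves undefined. One plausible implementation hashes document content, assuming each retrieved doc is a dict with a `content` field (adjust to the real document type):

```python
import hashlib

def deduplicate(docs: list[dict]) -> list[dict]:
    """Drop exact-content duplicates while preserving first-seen order, so a
    chunk retrieved by several sub-queries is only reranked once."""
    seen: set[str] = set()
    unique: list[dict] = []
    for doc in docs:
        key = hashlib.sha1(doc["content"].encode("utf-8")).hexdigest()
        if key not in seen:
            seen.add(key)
            unique.append(doc)
    return unique

docs = [{"content": "BERT scores 80.5 on GLUE"},
        {"content": "GPT-2 is decoder-only"},
        {"content": "BERT scores 80.5 on GLUE"}]
print(len(deduplicate(docs)))  # → 2
```

A production version might key on a stable chunk ID instead of a content hash, or fuzz-match near-duplicate chunks.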
### 4. Integration into the Main Agent
```python
# Modify the existing Agent entry point to add the complexity gate
async def main_agent(query: str, session_id: str) -> dict:
    """ScholarMind main entry — adaptively handles simple vs. composite questions."""
    # Step 1: fast complexity check (<100 ms, local small model)
    complexity = await assess_complexity(query)
    if complexity == "simple":
        # Existing single-question flow (router → retriever → generator → validator)
        return await simple_agent.ainvoke({"query": query})
    else:  # composite
        # New: decompose → parallel retrieval → aggregate
        return await decomposition_agent.ainvoke({"original_query": query})
```
---
## Key Design Principles
### 1. Always Keep the Original Query
```python
# ✅ Correct: retrieval set = original-query retrieval ∪ sub-query retrieval
retrieval_queries = [original_query] + sub_questions

# ❌ Wrong: retrieving with sub-queries only (loses the overall semantics)
retrieval_queries = sub_questions
```
> **Evidence**: experiments in the QD paper (arXiv:2507.00355) show that keeping the original query prevents a roughly 5% loss from sub-query drift
### 2. Rerank Against the Original Query
```python
# ✅ Cross-encoder scores documents against the original full question
reranked = reranker.rerank(query=original_query, docs=all_merged_docs)

# ❌ Scoring per sub-question and then merging (locally optimal ≠ globally optimal)
```
### 3. Decompose into at Most 5 Sub-Questions
```python
# When there are more than 5 sub-questions, merge similar ones
if len(sub_questions) > 5:
    sub_questions = merge_similar_questions(sub_questions, max_count=5)
```
> **Evidence**: ablations in the QD paper show that beyond 5 sub-questions, retrieval noise starts to outweigh the information gain
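The `merge_similar_questions` helper is not defined in this document. A minimal sketch over plain strings, using Jaccard token overlap as a cheap stand-in for embedding similarity (the function body and the 0.6 threshold are illustrative assumptions):

```python
def merge_similar_questions(questions: list[str], max_count: int = 5,
                            threshold: float = 0.6) -> list[str]:
    """Greedy dedup: drop a question when it shares most of its tokens with
    one already kept, then truncate to max_count. A production version would
    use embedding cosine similarity rather than token overlap."""
    def jaccard(a: str, b: str) -> float:
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    kept: list[str] = []
    for q in questions:
        if all(jaccard(q, k) < threshold for k in kept):
            kept.append(q)
    return kept[:max_count]

qs = ["What is BERT's score on GLUE?",
      "What is BERT's result on GLUE?",   # near-duplicate of the first
      "Who proposed GPT-2?"]
print(merge_similar_questions(qs))  # keeps questions 1 and 3
```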
### 4. Express Sequential Dependencies with #N References
```python
# Dependency notation in the decomposition output:
# Q1: "What dataset did BERT use?" (parallel)
# Q2: "What is the size of #1?" (sequential, depends on Q1's answer)
# At run time: answer Q1 first ("BookCorpus+Wikipedia"),
# then inject that answer into Q2's context
```
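At execution time the `#N` placeholders can be resolved by substituting prior answers into the question text before retrieval. A small sketch (`resolve_refs` is an illustrative name, not part of the codebase):

```python
import re

def resolve_refs(question: str, prior_answers: dict[int, str]) -> str:
    """Replace each #N reference with the answer to sub-question N,
    leaving unknown references untouched."""
    def sub(match):
        qid = int(match.group(1))
        return prior_answers.get(qid, match.group(0))
    return re.sub(r"#(\d+)", sub, question)

answers = {1: "BookCorpus+Wikipedia"}
print(resolve_refs("What is the size of #1?", answers))
# → What is the size of BookCorpus+Wikipedia?
```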
---
## Example Execution Flow
### Input
```
"What are the main improvements to the Transformer architecture in the last 3 years,
how much performance gain does each deliver, and which research group is the most active?"
```
### Decomposition Result
```json
{
  "core_intent": "A full picture of recent Transformer improvements, their quantified gains, and the leading research groups",
  "known_entities": ["Transformer"],
  "unknown_entities": ["specific improvement methods", "performance numbers", "research groups"],
  "sub_questions": [
    {"id": 1, "question": "What are the main directions of Transformer architecture improvement in 2022-2025?",
     "type": "global", "depends_on": []},
    {"id": 2, "question": "How much performance gain did each of these improvements achieve, and on which benchmarks?",
     "type": "factual", "depends_on": [1]},
    {"id": 3, "question": "Which research groups/institutions have published the most papers on Transformer improvements?",
     "type": "factual", "depends_on": []},
    {"id": 4, "question": "Compare the development trends of these improvement directions and their future outlook",
     "type": "global", "depends_on": [1, 2, 3]}
  ],
  "execution_plan": "mixed"
}
```
### Execution Plan
```
Round 1 (parallel):
  Q1 → RAPTOR levels 2-3 (global overview) + Graph (Method→IMPROVES_ON→Transformer)
  Q3 → Graph (Author→AUTHORED_BY→Paper→PROPOSES→Method) + vector retrieval
Round 2 (sequential, depends on Q1):
  Q2 → precise retrieval using Q1's answer (concrete method names) → Graph (Method→EVALUATED_ON→Dataset)
Round 3 (sequential, depends on Q1+Q2+Q3):
  Q4 → synthesize the three prior answers + RAPTOR high-level summaries → trend analysis
Aggregation: merge all docs → rerank against the original query → LLM synthesizes the 4 sub-answers → full report
```
---
## Performance Impact
| Metric | Simple question | Composite (no decomposition) | Composite (with decomposition) |
|--------|----------------|------------------------------|--------------------------------|
| Answer completeness | 95% | ~40% (answers only part) | **92%** |
| Latency | ~1.5s | ~1.5s (but a poor answer) | ~3s (multi-round retrieval) |
| Retrieval recall | High | Low (one query cannot cover all parts) | **High** (parallel multi-query) |
| User satisfaction | High | Low | **High** |
> **Trade-off**: decomposition adds ~1.5s of latency on composite questions (decompose + multi-round retrieval), but answer completeness rises from ~40% to 92%. Latency can be reduced further via parallelism and an L2 cache.
---
## Related Papers
| Paper | arXiv ID | Core contribution |
|-------|----------|-------------------|
| **RT-RAG** | 2601.11255 | Reasoning-tree decomposition, F1 = 64.92 (multi-hop QA SOTA) |
| **QD+Reranker** | 2507.00355 | Decomposition + reranking, MRR@10 +36.7% |
| **IRCoT** | 2212.10509 | Interleaved retrieval and CoT, +21 pt recall |
| **Self-Ask** | 2210.03350 | Follow-up scaffold, 79.6% accuracy |
| **Least-to-Most** | 2205.10625 | Two-stage decomposition → sequential solving |
| **DecomP** | 2210.02406 | Modular decomposition with dedicated handlers |
| **Collab-RAG** | 2504.04915 | Fine-tuned 3B SLM as the decomposer |
| **Bandit QD** | 2510.18633 | Thompson Sampling to select sub-queries |