contextflow-rl / ARCHITECTURE.md

namish10

ContextFlow Architecture: Complete System Overview

1. System Vision
2. High-Level Architecture
3. Frontend Layer
4. Backend Layer
5. Agent Network
6. Reinforcement Learning Pipeline
7. Data Flow
8. API Design
9. Multi-Modal Detection
10. Privacy & Security
11. Deployment Architecture

1. System Vision

ContextFlow is an AI-powered learning intelligence engine that predicts when a learner is about to become confused, before the confusion occurs, enabling proactive intervention in educational settings.

Core Problem Solved

Traditional learning systems are reactive - they respond after confusion occurs
ContextFlow is proactive - it predicts confusion and intervenes before disengagement

Key Innovations

Predictive AI - RL-based doubt prediction
Gesture Control - Hands-free learning assistance
Multi-Agent Orchestration - 9 specialized agents working in concert
Privacy-First - Face blur for classroom deployment

2. High-Level Architecture

┌─────────────────────────────────────────────────────────────────────┐
│                          USERS                                         │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐               │
│  │  Students   │  │  Teachers   │  │  Researchers │               │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘               │
└─────────┼─────────────────┼─────────────────┼─────────────────────────┘
          │                 │                 │
          ▼                 ▼                 ▼
┌─────────────────────────────────────────────────────────────────────┐
│                     PRESENTATION LAYER                                │
│  ┌─────────────────────────────────────────────────────────────┐   │
│  │                    React Frontend (Vite)                       │   │
│  │  ┌─────────┐ ┌─────────┐ ┌─────────┐ ┌─────────┐       │   │
│  │  │  Learn  │ │ LLMFlow │ │Gestures │ │ Predict │ ...     │   │
│  │  │   Tab    │ │  Tab    │ │   Tab   │ │   Tab   │         │   │
│  │  └─────────┘ └─────────┘ └─────────┘ └─────────┘         │   │
│  │                                                              │   │
│  │  ┌─────────────────────────────────────────────────────┐    │   │
│  │  │         MediaPipe Camera Feed (Gesture + Face)       │    │   │
│  │  │    ┌──────────┐              ┌──────────┐          │    │   │
│  │  │    │ Hand     │              │ Face     │          │    │   │
│  │  │    │ Detection │              │ Blur     │          │    │   │
│  │  │    └──────────┘              └──────────┘          │    │   │
│  │  └─────────────────────────────────────────────────────┘    │   │
│  └─────────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
                                    │
                                    │ REST API (JSON)
                                    │ WebSocket (Optional)
                                    ▼
┌─────────────────────────────────────────────────────────────────────┐
│                       BACKEND LAYER (Flask)                          │
│                                                                      │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │                    API Gateway (Flask Blueprints)               │   │
│  │   /api/session/*  /api/predict/*  /api/gesture/*  /api/*     │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                    │                                  │
│                                    ▼                                  │
│  ┌──────────────────────────────────────────────────────────────┐   │
│  │              STUDY ORCHESTRATOR (Central Coordinator)           │   │
│  │   ┌────────────────────────────────────────────────────┐      │   │
│  │   │                   Agent Registry                    │      │   │
│  │   │  DoubtPredictor │ Behavioral │ Gesture │ Recall  │      │   │
│  │   │  KnowledgeGraph │ PeerLearn │ LLMOrch │ Prompt │      │   │
│  │   └────────────────────────────────────────────────────┘      │   │
│  └──────────────────────────────────────────────────────────────┘   │
│                                    │                                  │
│    ┌───────────────┬─────────────┼─────────────┬───────────────┐  │
│    ▼               ▼             ▼             ▼               ▼      │
│  ┌─────┐       ┌─────┐      ┌─────┐      ┌─────┐      ┌─────┐   │
│  │ Q-  │       │Behavioral│   │Gesture│    │Recall│     │LLM  │   │
│  │Network│     │Agent   │    │Agent │     │Agent │     │Orch │   │
│  └─────┘       └─────┘      └─────┘      └─────┘      └─────┘   │
│                                                                      │
└─────────────────────────────────────────────────────────────────────┘
                                    │
                                    ▼
┌─────────────────────────────────────────────────────────────────────┐
│                         DATA LAYER                                    │
│  ┌────────────┐  ┌────────────┐  ┌────────────┐  ┌────────────┐ │
│  │  Checkpoint │  │  Session   │  │  Knowledge │  │   Real     │ │
│  │  (RL Model) │  │  State     │  │  Graph     │  │   Data     │ │
│  │   .pkl      │  │  JSON      │  │  NetworkX  │  │  Collection│ │
│  └────────────┘  └────────────┘  └────────────┘  └────────────┘ │
└─────────────────────────────────────────────────────────────────────┘

3. Frontend Layer

3.1 Technology Stack

Component	Technology	Purpose
Framework	React 18	UI Components
Build Tool	Vite	Fast development
Styling	Tailwind CSS	Responsive design
Icons	Lucide React	Consistent icons
Camera	MediaPipe	Hand/Face detection

3.2 Application Structure

frontend/src/
├── App.jsx              # Main application (9 tabs)
├── main.jsx             # Entry point
├── index.css            # Global styles
├── BrowserLLMLauncher.js  # AI chat launcher
└── MediaPipeProcessor.js # Camera + gesture processing

3.3 Tab Interface

Tab	Purpose
Learn	Dashboard with predictions, reviews, gamification
LLM Flow	Browser-based AI launcher (no API keys)
Gestures	Train custom hand gestures
Predict	RL doubt prediction visualization
Behavior	Behavioral signal tracking
Peer	Social learning insights
Stats	Learning statistics
Gamify	Fish/XP rewards system
Settings	AI provider configuration

3.4 BrowserLLMLauncher.js

Opens AI chats directly in browser without API keys:

// Opens chat.openai.com with pre-filled context.
// Note: the model argument is accepted but not used when building the URL.
function openAIChat(context, model = 'gpt-4') {
  const url = `https://chat.openai.com/?q=${encodeURIComponent(context)}`;
  window.open(url, '_blank');
}

3.5 MediaPipeProcessor.js

Handles real-time camera processing:

┌─────────────────┐
│   Camera Feed   │
└────────┬────────┘
         │
         ▼
┌─────────────────┐    ┌─────────────────┐
│  Hand Landmark  │    │   Face Mesh     │
│   Detection     │    │   Detection     │
│  (21 points)   │    │  (468 points)  │
└────────┬────────┘    └────────┬────────┘
         │                       │
         ▼                       ▼
┌─────────────────┐    ┌─────────────────┐
│ Gesture         │    │ Face Blur       │
│ Recognition     │───▶│ (Privacy)       │
└────────┬────────┘    └─────────────────┘
         │
         ▼
┌─────────────────┐
│  Backend API    │
│  /api/gesture/  │
└─────────────────┘

4. Backend Layer

4.1 Technology Stack

Component	Technology	Purpose
Framework	Flask	REST API
Async	asyncio	Non-blocking I/O
ML	PyTorch	RL model
Data	NumPy	Feature extraction
Graphs	NetworkX	Knowledge graphs
Storage	JSON/SQLite	Session persistence

4.2 Flask Application Structure

backend/
├── run.py                    # Application entry point
├── app/
│   ├── __init__.py          # Flask app factory
│   ├── config.py            # Configuration
│   ├── api/
│   │   ├── __init__.py
│   │   └── main.py          # All API routes (889 lines)
│   └── agents/
│       ├── __init__.py
│       ├── study_orchestrator.py    # Central coordinator
│       ├── doubt_predictor.py       # RL prediction
│       ├── behavioral_agent.py      # Signal processing
│       ├── hand_gesture_agent.py    # MediaPipe integration
│       ├── recall_agent.py          # Spaced repetition
│       ├── knowledge_graph_agent.py # Concept mapping
│       ├── peer_learning_agent.py    # Social learning
│       ├── llm_orchestrator_agent.py # Multi-AI
│       ├── gesture_action_agent.py  # Gesture→Action
│       └── prompt_agent.py          # Prompt templates

4.3 Flask App Factory

from flask import Flask

def create_app():
    app = Flask(__name__)

    # Load config
    app.config.from_object('app.config.Config')

    # Register blueprints (imported here to avoid circular imports)
    from app.api.main import api
    app.register_blueprint(api, url_prefix='/api')

    # Initialize agents (init_agents is not shown in this excerpt)
    init_agents()

    return app

5. Agent Network

5.1 Agent Overview

┌─────────────────────────────────────────────────────────────┐
│                  STUDY ORCHESTRATOR                          │
│              (Central Coordinator)                             │
│                                                             │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │   Doubt     │  │ Behavioral  │  │   Hand     │        │
│  │  Predictor  │◀─│   Agent     │─▶│  Gesture   │        │
│  │   Agent    │  │             │  │   Agent    │        │
│  └──────┬──────┘  └─────────────┘  └──────┬──────┘        │
│         │                                  │                │
│         ▼                                  ▼                │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│  │ Knowledge  │  │   Recall    │  │    LLM      │        │
│  │   Graph    │◀─│   Agent     │─▶│ Orchestrator│        │
│  │   Agent    │  │             │  │             │        │
│  └─────────────┘  └─────────────┘  └─────────────┘        │
│                                                             │
│  ┌─────────────┐  ┌─────────────┐                           │
│  │    Peer    │  │   Gesture  │                           │
│  │  Learning  │  │   Action    │                           │
│  │   Agent    │  │   Mapper    │                           │
│  └─────────────┘  └─────────────┘                           │
└─────────────────────────────────────────────────────────────┘

5.2 StudyOrchestrator (Central Coordinator)

The orchestrator manages the learning lifecycle:

class StudyOrchestrator:
    def __init__(self, user_id: str):
        self.user_id = user_id
        
        # Initialize all agents
        self.doubt_predictor = DoubtPredictorAgent(user_id)
        self.behavioral_agent = BehavioralAgent(user_id)
        self.gesture_agent = HandGestureAgent(user_id)
        self.recall_agent = RecallAgent(user_id)
        self.knowledge_graph = KnowledgeGraphAgent(user_id)
        self.peer_agent = PeerLearningAgent(user_id)
        
        # State management
        self.state = OrchestratorState()

Session Lifecycle:

PRE_LEARNING - Load predictions, check recalls, get peer insights
ACTIVE_LEARNING - Monitor signals, update predictions, capture doubts
REVIEW - Trigger spaced repetition, update knowledge graph
POST_LEARNING - Sync data, update gamification, generate summary
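
The four phases above can be modelled as a small state machine. The following is a minimal sketch, assuming the phases always run in the listed order; `SessionPhase` and `next_phase` are hypothetical names for illustration, and the real orchestrator may allow other transitions.

```python
from enum import Enum
from typing import Optional


class SessionPhase(Enum):
    """Lifecycle phases, declared in the order listed above."""
    PRE_LEARNING = "pre_learning"
    ACTIVE_LEARNING = "active_learning"
    REVIEW = "review"
    POST_LEARNING = "post_learning"


# Enum members iterate in definition order, so this is the assumed sequence.
_ORDER = list(SessionPhase)


def next_phase(current: SessionPhase) -> Optional[SessionPhase]:
    """Return the phase that follows `current`, or None after the last one."""
    idx = _ORDER.index(current)
    return _ORDER[idx + 1] if idx + 1 < len(_ORDER) else None


phase = SessionPhase.PRE_LEARNING
visited = [phase]
while (phase := next_phase(phase)) is not None:
    visited.append(phase)
```

After the loop, `visited` holds all four phases in order, which is a convenient place to hang per-phase hooks such as loading predictions or syncing data.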

5.3 DoubtPredictorAgent (RL Core)

Predicts confusion before it happens:

class DoubtPredictorAgent:
    def __init__(self, user_id: str, config: dict = None):
        self.user_id = user_id
        self.model = self._load_checkpoint()
        self.feature_extractor = FeatureExtractor()
    
    def predict_doubts(self, context: dict, top_k: int = 5):
        # 1. Extract 64-dim state vector
        state = self.feature_extractor.extract_state(context)
        
        # 2. Get Q-values from RL model
        q_values = self.model.predict(state)
        
        # 3. Return top-k predictions
        return self._format_predictions(q_values, top_k)
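
The document does not show `_format_predictions`. One possible way to turn ten Q-values into ranked predictions is sketched below. Using a softmax to derive a confidence value is an assumption made for illustration, not something the source confirms; the doubt names are taken from the action space in section 6.1.

```python
import math
from typing import Dict, List, Sequence

# Action labels copied from section 6.1, in the listed order.
DOUBT_TYPES = [
    "what_is_backpropagation", "why_gradient_descent", "how_overfitting_works",
    "explain_regularization", "what_loss_function", "how_optimization_works",
    "explain_learning_rate", "what_regularization", "how_batch_norm_works",
    "explain_softmax",
]


def format_predictions(q_values: Sequence[float], top_k: int = 5) -> List[Dict]:
    """Rank doubts by Q-value and attach a softmax-based confidence (assumed)."""
    if len(q_values) != len(DOUBT_TYPES):
        raise ValueError("expected one Q-value per doubt type")

    # Numerically stable softmax: subtract the maximum before exponentiating.
    q_max = max(q_values)
    exps = [math.exp(q - q_max) for q in q_values]
    total = sum(exps)

    # Indices of the top_k largest Q-values, best first.
    ranked = sorted(range(len(q_values)), key=lambda i: q_values[i], reverse=True)[:top_k]
    return [
        {"doubt": DOUBT_TYPES[i], "confidence": exps[i] / total, "priority": rank}
        for rank, i in enumerate(ranked, start=1)
    ]


predictions = format_predictions(
    [0.1, 0.9, 0.3, 0.0, 0.2, 0.5, 0.4, 0.05, 0.8, 0.6], top_k=3
)
```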

5.4 BehavioralAgent

Processes raw behavioral signals:

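from dataclasses import dataclass
from typing import Tuple

@dataclass
class BehavioralSignal:
    mouse_hesitation: float      # Pause frequency
    scroll_reversals: int        # Back-and-forth
    time_on_page: float          # Seconds
    eye_tracking: Tuple[float, float]
    click_frequency: int
    tab_switches: int = 0        # Referenced by the weights below
    back_button: int = 0         # Referenced by the weights below

    def calculate_confusion_score(self) -> float:
        # Weighted average of signals (weights sum to 1.0)
        weights = {
            'hesitation': 0.3,
            'reversals': 0.25,
            'time_on_page': 0.2,
            'tab_switches': 0.15,
            'back_button': 0.1
        }
        # Assumes every signal has already been normalized to 0-1 upstream
        signals = {
            'hesitation': self.mouse_hesitation,
            'reversals': self.scroll_reversals,
            'time_on_page': self.time_on_page,
            'tab_switches': self.tab_switches,
            'back_button': self.back_button
        }
        return sum(weights[name] * signals[name] for name in weights)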

5.5 HandGestureAgent

MediaPipe integration for gesture recognition:

Camera Frame
    │
    ▼
┌─────────────────┐
│ MediaPipe Hands │
│  (21 landmarks) │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│ Gesture Template│
│   Matching      │
└────────┬────────┘
         │
         ▼
┌─────────────────┐
│   Confidence    │──▶ Recognized Gesture
│   Score (0-1)   │
└─────────────────┘

Pre-built Gestures:

Gesture	Description
pinch	Thumb + Index
swipe_up	2-finger up
swipe_down	2-finger down
swipe_right	2-finger right
swipe_left	2-finger left
point	Index extended
wave	Open palm wave
thumbs_up	👍 confirmation
thumbs_down	👎 rejection
fist	Closed hand
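
The template-matching step can be sketched as follows. This is a minimal illustration, assuming 2D landmarks and a distance-based confidence of `1 / (1 + mean distance)`; the matching metric, the `normalize` and `match_gesture` helpers, and the toy templates are all hypothetical, since the document does not specify them. The only fact relied on is that MediaPipe Hands returns 21 landmarks with the wrist at index 0.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]
NUM_LANDMARKS = 21  # MediaPipe Hands returns 21 landmarks per hand


def normalize(landmarks: List[Point]) -> List[Point]:
    """Translate so the wrist (index 0) is the origin and scale to unit size."""
    wx, wy = landmarks[0]
    shifted = [(x - wx, y - wy) for x, y in landmarks]
    scale = max(math.hypot(x, y) for x, y in shifted) or 1.0
    return [(x / scale, y / scale) for x, y in shifted]


def match_gesture(landmarks: List[Point],
                  templates: Dict[str, List[Point]]) -> Tuple[str, float]:
    """Return (best template name, confidence in 0-1)."""
    if len(landmarks) != NUM_LANDMARKS:
        raise ValueError("expected 21 hand landmarks")
    norm = normalize(landmarks)
    best_name, best_conf = "", 0.0
    for name, template in templates.items():
        ref = normalize(template)
        mean_dist = sum(math.dist(a, b) for a, b in zip(norm, ref)) / NUM_LANDMARKS
        conf = 1.0 / (1.0 + mean_dist)  # identical shapes give 1.0
        if conf > best_conf:
            best_name, best_conf = name, conf
    return best_name, best_conf


# Toy templates with made-up coordinates, for illustration only.
TEMPLATES = {
    "fist": [(0.0, 0.0)] + [(0.1 * i, 0.05 * i) for i in range(1, 21)],
    "point": [(0.0, 0.0)] + [(0.0, 0.1 * i) for i in range(1, 21)],
}

# A translated and scaled copy of the "fist" template should still match it.
query = [(2 * x + 5, 2 * y + 3) for x, y in TEMPLATES["fist"]]
name, confidence = match_gesture(query, TEMPLATES)
```

Normalizing before comparison makes the match independent of where the hand sits in the frame and how far it is from the camera.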

5.6 RecallAgent

SM-2 based spaced repetition:

from dataclasses import dataclass

@dataclass
class RecallCard:
    front: str                # Question
    back: str                 # Answer
    interval: int = 0         # Days until review
    ease_factor: float = 2.5  # Difficulty (default 2.5)
    repetitions: int = 0      # Successful reviews

def schedule_review(card: RecallCard, quality: int) -> None:
    """Update the card in place from a 0-5 recall quality score."""
    if quality >= 3:  # Correct
        if card.repetitions == 0:
            card.interval = 1
        elif card.repetitions == 1:
            card.interval = 6
        else:
            # Round so the interval stays a whole number of days
            card.interval = round(card.interval * card.ease_factor)
        card.repetitions += 1
    else:  # Incorrect
        card.repetitions = 0
        card.interval = 1

    # Update ease factor, never letting it drop below 1.3
    card.ease_factor += (0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02))
    card.ease_factor = max(1.3, card.ease_factor)
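
To see how the schedule evolves, the sketch below replays the same update rule over a short sequence of review scores. It is self-contained for illustration: the `Card`, `review`, and `simulate` names are hypothetical, and rounding the interval to whole days is an assumption.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class Card:
    interval: int = 0          # days until the next review
    ease_factor: float = 2.5   # SM-2 starting ease
    repetitions: int = 0       # consecutive successful reviews


def review(card: Card, quality: int) -> None:
    """Apply one SM-2 style update for a 0-5 quality score."""
    if quality >= 3:
        if card.repetitions == 0:
            card.interval = 1
        elif card.repetitions == 1:
            card.interval = 6
        else:
            card.interval = round(card.interval * card.ease_factor)
        card.repetitions += 1
    else:
        card.repetitions = 0
        card.interval = 1
    delta = 0.1 - (5 - quality) * (0.08 + (5 - quality) * 0.02)
    card.ease_factor = max(1.3, card.ease_factor + delta)


def simulate(qualities: Sequence[int]) -> List[int]:
    """Return the interval chosen after each review in `qualities`."""
    card = Card()
    intervals = []
    for quality in qualities:
        review(card, quality)
        intervals.append(card.interval)
    return intervals


intervals = simulate([5, 4, 3, 5])  # [1, 6, 16, 39]
```

The third interval is 6 × 2.6 = 15.6, rounded to 16 days; the borderline score of 3 then lowers the ease factor to 2.46, so the fourth interval is 16 × 2.46 = 39.36, rounded to 39 days.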

5.7 KnowledgeGraphAgent

Concept mapping with NetworkX:

from datetime import datetime

import networkx as nx

class KnowledgeGraphAgent:
    def __init__(self, user_id: str):
        self.user_id = user_id
        self.graph = nx.MultiDiGraph()

    def add_doubt_to_graph(self, doubt: dict):
        # Create node
        self.graph.add_node(
            doubt['concept'],
            type='concept',
            topic=doubt['topic'],
            timestamp=datetime.now()
        )

        # Connect to prerequisites (edge points from prerequisite to concept)
        for prereq in doubt.get('prerequisites', []):
            self.graph.add_edge(prereq, doubt['concept'], type='prerequisite')

        # Connect to related concepts
        for related in doubt.get('related', []):
            self.graph.add_edge(doubt['concept'], related, type='related')

    def find_learning_path(self, from_topic: str, to_topic: str):
        try:
            return nx.shortest_path(self.graph, from_topic, to_topic)
        except (nx.NetworkXNoPath, nx.NodeNotFound):
            # No route, or one of the topics is not in the graph yet
            return []

5.8 LLMOrchestrator

Multi-provider AI integration:

import asyncio

class LLMOrchestrator:
    SUPPORTED_PROVIDERS = {
        'chatgpt': LLMProvider.CHATGPT,
        'gemini': LLMProvider.GEMINI,
        'claude': LLMProvider.CLAUDE,
        'deepseek': LLMProvider.DEEPSEEK,
        'ollama': LLMProvider.OLLAMA,
        'groq': LLMProvider.GROQ
    }
    
    async def query_parallel(self, request: LLMRequest):
        tasks = []
        for provider in request.providers:
            task = self._query_provider(provider, request)
            tasks.append(task)
        
        # Execute all queries concurrently
        responses = await asyncio.gather(*tasks, return_exceptions=True)
        return [r for r in responses if not isinstance(r, Exception)]
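
The concurrent fan-out pattern can be tried without any real provider. The sketch below uses a stand-in coroutine, `fake_provider`, which is purely hypothetical, and keeps track of which provider failed rather than silently dropping errors. `asyncio.gather` returns results in the same order as the awaitables passed in, which is what makes the `zip` safe.

```python
import asyncio
from typing import Dict, List, Tuple


async def fake_provider(name: str, prompt: str) -> str:
    """Stand-in for a real provider call; 'offline' always fails."""
    await asyncio.sleep(0.01)
    if name == "offline":
        raise ConnectionError(f"{name} unreachable")
    return f"{name}: answer to {prompt!r}"


async def query_parallel(
    providers: List[str], prompt: str
) -> Tuple[Dict[str, str], Dict[str, Exception]]:
    """Query all providers concurrently and split successes from failures."""
    tasks = [fake_provider(p, prompt) for p in providers]
    results = await asyncio.gather(*tasks, return_exceptions=True)

    ok: Dict[str, str] = {}
    failed: Dict[str, Exception] = {}
    for provider, result in zip(providers, results):
        if isinstance(result, Exception):
            failed[provider] = result
        else:
            ok[provider] = result
    return ok, failed


ok, failed = asyncio.run(
    query_parallel(["chatgpt", "offline", "gemini"], "What is backpropagation?")
)
```

Returning the failures alongside the successes makes it possible to log or retry a provider, which a plain filtered list cannot support.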

5.9 GestureActionMapper

Maps gestures to system actions:

from enum import Enum

class GestureAction(Enum):
    QUERY_MULTI_LLM = "query_multi_llm"
    QUERY_CHATGPT = "query_chatgpt"
    QUERY_GEMINI = "query_gemini"
    TRIGGER_RL_LOOP = "trigger_rl_loop"
    CAPTURE_CONTENT = "capture_content"
    PAUSE_SESSION = "pause_session"
    RESUME_SESSION = "resume_session"

class GestureActionMapper:
    def __init__(self):
        self.action_rules = {
            GestureAction.QUERY_MULTI_LLM: {
                "trigger": {"finger_count": 2, "swipe": "right"}
            },
            GestureAction.PAUSE_SESSION: {
                "trigger": {"gesture": "open_palm"}
            },
            GestureAction.RESUME_SESSION: {
                "trigger": {"gesture": "thumbs_up"}
            }
        }
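
How a recognized gesture event is resolved against these rules is not shown. One plausible approach is sketched below, assuming a rule fires when every key in its trigger matches the event; `resolve_action` and the event dictionary shape are hypothetical.

```python
from enum import Enum
from typing import Dict, Optional


class GestureAction(Enum):
    QUERY_MULTI_LLM = "query_multi_llm"
    PAUSE_SESSION = "pause_session"
    RESUME_SESSION = "resume_session"


# The three rules shown in the mapper above.
ACTION_RULES = {
    GestureAction.QUERY_MULTI_LLM: {"trigger": {"finger_count": 2, "swipe": "right"}},
    GestureAction.PAUSE_SESSION: {"trigger": {"gesture": "open_palm"}},
    GestureAction.RESUME_SESSION: {"trigger": {"gesture": "thumbs_up"}},
}


def resolve_action(event: Dict) -> Optional[GestureAction]:
    """Return the first action whose trigger is fully satisfied by the event."""
    for action, rule in ACTION_RULES.items():
        trigger = rule["trigger"]
        if all(event.get(key) == value for key, value in trigger.items()):
            return action
    return None  # no rule matched


action = resolve_action({"finger_count": 2, "swipe": "right", "confidence": 0.9})
```

Extra keys in the event, such as `confidence`, are ignored, so the recognizer can attach metadata without affecting rule matching.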

5.10 PeerLearningAgent

Social learning insights:

class PeerLearningAgent:
    def get_peer_insights(self, topic: str):
        # Aggregate insights from "similar" students
        insights = []
        
        # Find students who learned this topic
        similar_students = self._find_similar_students(topic)
        
        for student in similar_students:
            # What confused them?
            insights.extend(student.difficult_concepts)
        
        # Return aggregated insights
        return self._aggregate_insights(insights)
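
The aggregation step is not shown in the document. A simple frequency count is one way it might work; the sketch below assumes insights are plain concept names, and `aggregate_insights` with its output shape is hypothetical.

```python
from collections import Counter
from typing import Dict, List, Sequence


def aggregate_insights(concepts: Sequence[str], top_n: int = 3) -> List[Dict]:
    """Rank concepts by how many peers found them difficult."""
    counts = Counter(concepts)
    total = len(concepts)
    # most_common() on an empty Counter yields nothing, so total is never 0 here.
    return [
        {"concept": concept, "count": count, "share": count / total}
        for concept, count in counts.most_common(top_n)
    ]


insights = aggregate_insights([
    "overfitting", "backpropagation", "overfitting",
    "learning rate", "overfitting", "backpropagation",
])
```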

6. Reinforcement Learning Pipeline

6.1 Problem Formulation

State Space (64 dimensions):

┌────────────────────────────────────────────────────────────────┐
│  Topic Embedding (32)  │ Progress │ Confusion (16) │ Gesture (14) │ Time │
│  TF-IDF of topic       │ 0.0-1.0 │ Behavioral    │ Hand        │ 0-1 │
│                        │          │ signals       │ signals     │      │
└────────────────────────────────────────────────────────────────┘
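
The layout above adds up to 32 + 1 + 16 + 14 + 1 = 64 dimensions. A small sketch that derives index ranges from the segment widths can guard against the layout and the network input size drifting apart; the `STATE_LAYOUT` and `build_slices` names are illustrative, not taken from the codebase.

```python
from typing import Dict, List, Tuple

# Segment widths as listed in the state-space diagram.
STATE_LAYOUT: List[Tuple[str, int]] = [
    ("topic_embedding", 32),
    ("progress", 1),
    ("confusion", 16),
    ("gesture", 14),
    ("time", 1),
]


def build_slices(layout: List[Tuple[str, int]]) -> Tuple[Dict[str, slice], int]:
    """Map each segment name to its slice of the state vector."""
    slices: Dict[str, slice] = {}
    start = 0
    for name, width in layout:
        slices[name] = slice(start, start + width)
        start += width
    return slices, start  # start now equals the total dimension


SLICES, STATE_DIM = build_slices(STATE_LAYOUT)

# Example: pull the gesture segment out of a dummy state vector.
state = list(range(STATE_DIM))
gesture_part = state[SLICES["gesture"]]
```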

Action Space (10 doubt types):

what_is_backpropagation
why_gradient_descent
how_overfitting_works
explain_regularization
what_loss_function
how_optimization_works
explain_learning_rate
what_regularization
how_batch_norm_works
explain_softmax

Reward Function:

Event	Reward
Correct prediction	+1.0
Helpful explanation	+0.5
Engagement maintained	+0.3
False positive	-0.5
Missed confusion	-1.0
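
The reward table translates directly into a lookup. The sketch below assumes that several events can occur in one step and that their rewards simply add up, which the document does not state; the event keys are hypothetical identifiers for the rows above.

```python
from typing import Iterable

# Reward values copied from the table above.
REWARDS = {
    "correct_prediction": 1.0,
    "helpful_explanation": 0.5,
    "engagement_maintained": 0.3,
    "false_positive": -0.5,
    "missed_confusion": -1.0,
}


def step_reward(events: Iterable[str]) -> float:
    """Sum the rewards of all events observed in one step (assumed additive)."""
    return sum(REWARDS[event] for event in events)


good_step = step_reward(["correct_prediction", "helpful_explanation"])
bad_step = step_reward(["missed_confusion"])
```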

6.2 Q-Network Architecture

import torch.nn as nn
import torch.nn.functional as F

class QNetwork(nn.Module):
    def __init__(self, state_dim=64, action_dim=10, hidden_dim=128):
        super().__init__()
        self.fc1 = nn.Linear(state_dim, hidden_dim)  # 64 → 128
        self.fc2 = nn.Linear(hidden_dim, hidden_dim) # 128 → 128
        self.fc3 = nn.Linear(hidden_dim, action_dim) # 128 → 10
    
    def forward(self, x):
        x = F.relu(self.fc1(x))  # ReLU activation
        x = F.relu(self.fc2(x))
        return self.fc3(x)  # Q-values for each action

6.3 Training Algorithm (GRPO)

import torch

class DoubtPredictionRL:
    def train(self, epochs=10, batch_size=32):
        for epoch in range(epochs):
            for batch in self.dataloader:
                # 1. Q-values of the actions that were actually taken
                q_values = self.q_network(batch.states)
                q_taken = q_values.gather(1, batch.actions.view(-1, 1)).squeeze(1)

                # 2. Compute targets (labelled GRPO-style here; as written this is
                #    the one-step bootstrapped Q-learning target with a target network)
                with torch.no_grad():
                    next_q = self.target_network(batch.next_states).max(1)[0]
                    not_done = 1.0 - batch.dones.float()
                    targets = batch.rewards + self.gamma * next_q * not_done

                # 3. Compute loss and update (clear stale gradients first)
                loss = self.loss_fn(q_taken, targets)
                self.optimizer.zero_grad()
                loss.backward()
                self.optimizer.step()

            # 4. Update target network
            self.update_target_network()

            # 5. Decay epsilon (exploration)
            self.epsilon *= self.epsilon_decay
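
Two pieces of the loop above are easy to check by hand: the bootstrapped target and the exploration schedule. The sketch below reproduces both in plain Python. The discount factor, decay rate, and exploration floor are placeholder values chosen for illustration, since the document does not give them, and the helper names are hypothetical.

```python
import random
from typing import Sequence


def td_target(reward: float, next_q_values: Sequence[float],
              done: bool, gamma: float = 0.99) -> float:
    """One-step target: r + gamma * max_a' Q_target(s', a'), or r if terminal."""
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def select_action(q_values: Sequence[float], epsilon: float,
                  rng: random.Random) -> int:
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda i: q_values[i])


def decay_epsilon(epsilon: float, decay: float = 0.995, floor: float = 0.05) -> float:
    """Multiplicative decay with a lower bound so exploration never stops."""
    return max(floor, epsilon * decay)


rng = random.Random(0)
target = td_target(reward=1.0, next_q_values=[0.2, 0.5], done=False, gamma=0.9)
greedy = select_action([0.1, 0.9, 0.3], epsilon=0.0, rng=rng)
```

With a reward of 1.0, a discount of 0.9, and a best next Q-value of 0.5, the target is 1.0 + 0.9 × 0.5 = 1.45. The floor in `decay_epsilon` is an addition not present in the training loop above.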

6.4 Feature Extraction

import numpy as np

class FeatureExtractor:
    STATE_DIM = 64
    
    def extract_state(self, context: dict) -> np.ndarray:
        # Topic embedding (32 dims)
        topic_emb = self._extract_topic_embedding(context['topic'])
        
        # Progress (1 dim)
        progress = np.array([context['progress']])
        
        # Confusion signals (16 dims)
        confusion = self._extract_confusion_signals(context['confusion_signals'])
        
        # Gesture signals (14 dims)
        gestures = self._extract_gesture_signals(context['gesture_signals'])
        
        # Time spent (1 dim), normalized by 30 minutes and capped at 1.0
        time_spent = np.array([min(context['time_spent'] / 1800, 1.0)])
        
        # Concatenate
        return np.concatenate([topic_emb, progress, confusion, gestures, time_spent])

7. Data Flow

7.1 Learning Session Flow

┌─────────────────────────────────────────────────────────────────┐
│                        USER STARTS SESSION                         │
└─────────────────────────────────────────────────────────────────┘
                                │
                                ▼
┌─────────────────────────────────────────────────────────────────┐
│                     ORCHESTRATOR.START_SESSION()                   │
│  1. Create new LearningSession                                    │
│  2. Load RL model checkpoint                                      │
│  3. Build learning context                                       │
└─────────────────────────────────────────────────────────────────┘
                                │
                ┌───────────────┼───────────────┐
                ▼               ▼               ▼
        ┌───────────┐   ┌───────────┐   ┌───────────┐
        │   Doubt   │   │ Behavioral│   │   Peer    │
        │ Predictor │   │   Agent   │   │  Learning │
        │           │   │           │   │   Agent   │
        │  Predict  │   │  Analyze  │   │   Get     │
        │  doubts   │   │  signals  │   │  insights │
        └─────┬─────┘   └─────┬─────┘   └─────┬─────┘
              │               │               │
              └───────────────┼───────────────┘
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                  RETURN INITIAL PREDICTIONS                        │
│  - Top 5 predicted doubts                                         │
│  - Pending reviews                                               │
│  - Peer insights                                                 │
└─────────────────────────────────────────────────────────────────┘

7.2 Behavioral Signal Flow

┌─────────────────────────────────────────────────────────────────┐
│                      REAL-TIME SIGNALS                           │
│                                                                  │
│  ┌─────────┐  ┌─────────┐  ┌─────────┐  ┌─────────┐             │
│  │  Mouse  │  │ Scroll  │  │Gesture  │  │  Time   │             │
│  │Movement │  │ Pattern │  │Camera   │  │  On     │             │
│  └────┬────┘  └────┬────┘  └────┬────┘  └────┬────┘             │
└───────┼───────────┼───────────┼───────────┼───────────────────────┘
        │           │           │           │
        └───────────┴─────┬─────┴───────────┘
                          ▼
              ┌───────────────────────┐
              │   BEHAVIORAL AGENT     │
              │                       │
              │ calculate_confusion_   │
              │ score(signals)        │
              │                       │
              │ Returns: 0.0 - 1.0   │
              └───────────┬───────────┘
                          │
                          ▼
              ┌───────────────────────┐
              │   DOUBT PREDICTOR     │
              │                       │
              │ If score > 0.5:       │
              │   Re-predict doubts    │
              │   Trigger intervention│
              │                       │
              └───────────────────────┘

7.3 Gesture-to-Action Flow

┌─────────────────────────────────────────────────────────────────┐
│                         CAMERA FRAME                               │
└─────────────────────────────┬───────────────────────────────────┘
                              │
                              ▼
┌─────────────────────────────────────────────────────────────────┐
│                      MEDIAPIPE PROCESSING                         │
│                                                                  │
│  ┌──────────────────────┐     ┌──────────────────────┐          │
│  │   Hand Landmark      │     │    Face Mesh         │          │
│  │   Detection         │     │    (468 points)      │          │
│  │   (21 points)       │     │                       │          │
│  └──────────┬─────────┘     └──────────┬───────────┘          │
└─────────────┼───────────────────────────┼───────────────────────┘
              │                           │
              ▼                           ▼
┌──────────────────────┐     ┌──────────────────────┐
│  GESTURE TEMPLATE   │     │    FACE BLUR         │
│  MATCHING           │     │    (Privacy)          │
│                     │     │                       │
│  Compare landmarks  │     │  Blur regions with    │
│  to known gestures  │     │  facial keypoints     │
└──────────┬─────────┘     └───────────────────────┘
           │
           ▼
┌──────────────────────┐
│  GESTURE RECOGNIZED  │──▶ Backend /api/gesture/recognize
│                      │
│  {                   │
│    "gesture": "pinch",│
│    "confidence": 0.92│
│  }                   │
└──────────────────────┘
           │
           ▼
┌──────────────────────┐
│ GESTURE ACTION MAPPER │
│                      │
│ pinch ──────────────▶│ TRIGGER_AI_HELP
│ swipe_right ────────▶│ LAUNCH_BROWSER_CHAT
│ open_palm ──────────▶│ PAUSE_SESSION
│ thumbs_up ──────────▶│ MARK_UNDERSTOOD
└──────────────────────┘

8. API Design

8.1 API Structure

Category	Endpoints
Session	`/session/start`, `/session/update`, `/session/end`, `/session/insights`
Prediction	`/predict/doubts`, `/recommendations`
Behavior	`/behavior/track`, `/behavior/heatmap`
Graph	`/graph/add`, `/graph/query`, `/graph/path`
Review	`/review/due`, `/review/complete`, `/review/stats`
Peer	`/peer/insights`, `/peer/doubts`, `/peer/trending`
Gesture	`/gesture/list`, `/gesture/recognize`, `/gesture/training/*`
LLM	`/llm/query`, `/llm/gesture-action`, `/llm/rl/*`

8.2 Session API

# POST /api/session/start
{
    "user_id": "student123",
    "topic": "Machine Learning",
    "subtopic": "Neural Networks"
}

# Response
{
    "session_id": "session_1699999999.123",
    "topic": "Machine Learning",
    "predictions": [
        {
            "doubt": "how_overfitting_works",
            "confidence": 0.85,
            "explanation": "Student showing signs of confusion...",
            "priority": 1
        }
    ],
    "pending_reviews": 5,
    "peer_insights_count": 3
}

8.3 Doubt Prediction API

# POST /api/predict/doubts
{
    "context": {
        "topic": "Neural Networks",
        "progress": 0.5,
        "confusion_signals": 0.7
    }
}

# Response
{
    "predictions": [
        {
            "doubt": "how_overfitting_works",
            "confidence": 0.85,
            "explanation": "...",
            "priority": 1,
            "estimated_time": "10 min",
            "prerequisites": ["regularization", "bias-variance"]
        }
    ]
}
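
A client call to this endpoint can be sketched with the standard library alone. The base URL below assumes the development setup from section 11.1 (Flask on port 5001); `build_predict_request` is a hypothetical helper, and the request is only constructed here, not sent.

```python
import json
from urllib import request


def build_predict_request(context: dict,
                          base_url: str = "http://localhost:5001") -> request.Request:
    """Build (but do not send) a POST request for /api/predict/doubts."""
    body = json.dumps({"context": context}).encode("utf-8")
    return request.Request(
        f"{base_url}/api/predict/doubts",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_predict_request({
    "topic": "Neural Networks",
    "progress": 0.5,
    "confusion_signals": 0.7,
})

# With the backend running, the request could be sent like this:
# with request.urlopen(req) as resp:
#     predictions = json.load(resp)["predictions"]
```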

9. Multi-Modal Detection

9.1 Supported Modalities

┌─────────────────────────────────────────────────────────────────┐
│                    MULTI-MODAL FUSION                             │
│                                                                  │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐             │
│  │   Audio     │  │  Biometric  │  │ Behavioral  │             │
│  │             │  │             │  │             │             │
│  │ Speech rate │  │ Heart rate  │  │ Mouse moves │             │
│  │ Hesitations │  │ GSR         │  │ Scroll      │             │
│  │ Pauses      │  │ Eye tracking│  │ Key presses │             │
│  └──────┬──────┘  └──────┬──────┘  └──────┬──────┘             │
│         │                │                │                     │
│         └────────────────┼────────────────┘                     │
│                          ▼                                      │
│              ┌─────────────────────────┐                       │
│              │   WEIGHTED FUSION       │                       │
│              │                         │                       │
│              │ audio_weight:    0.2    │                       │
│              │ biometric_weight: 0.3   │                       │
│              │ behavioral_weight: 0.5  │                       │
│              └───────────┬─────────────┘                       │
│                          │                                      │
│                          ▼                                      │
│              ┌─────────────────────────┐                       │
│              │  UNIFIED CONFUSION     │                       │
│              │       SCORE            │                       │
│              │       0.0 - 1.0       │                       │
│              └─────────────────────────┘                       │
└─────────────────────────────────────────────────────────────────┘
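
The weighted fusion in the diagram can be written in a few lines. The sketch below uses the three weights shown above. Renormalizing when a modality is unavailable (for example, no biometric sensor) is an assumption added for illustration, and `fuse_confusion` is a hypothetical name.

```python
from typing import Dict, Optional

# Weights taken from the fusion diagram; they sum to 1.0.
FUSION_WEIGHTS = {"audio": 0.2, "biometric": 0.3, "behavioral": 0.5}


def fuse_confusion(scores: Dict[str, Optional[float]]) -> float:
    """Combine per-modality confusion scores (each 0-1) into one 0-1 score.

    Modalities with a score of None are skipped and the remaining weights
    are renormalized (an assumption, not confirmed by the source).
    """
    available = {m: s for m, s in scores.items() if s is not None}
    if not available:
        return 0.0
    weight_sum = sum(FUSION_WEIGHTS[m] for m in available)
    fused = sum(FUSION_WEIGHTS[m] * s for m, s in available.items()) / weight_sum
    return min(1.0, max(0.0, fused))


all_modalities = fuse_confusion({"audio": 0.4, "biometric": 0.6, "behavioral": 0.8})
no_biometric = fuse_confusion({"audio": 0.4, "biometric": None, "behavioral": 0.8})
```

With all three modalities present the result is 0.2 × 0.4 + 0.3 × 0.6 + 0.5 × 0.8 = 0.66.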

9.2 Feature Extraction by Modality

Audio (7 features):

Speech rate (WPM)
Pause frequency
Pause duration
Pitch variation
Volume level
Hesitation count
Question markers

Biometric (6 features):

Heart rate (BPM)
Heart rate variability
Skin conductance (GSR)
Skin temperature
Eye blink rate
Eye open duration

Behavioral (8 features):

Mouse hesitation
Scroll reversals
Time on page
Click frequency
Back button usage
Tab switches
Copy attempts
Search usage

10. Privacy & Security

10.1 Face Blur Implementation

import cv2
import mediapipe as mp

mp_face_mesh = mp.solutions.face_mesh

class FaceBlurProcessor:
    def __init__(self):
        self.face_mesh = mp_face_mesh.FaceMesh(
            static_image_mode=False,
            max_num_faces=1,
            refine_landmarks=True
        )
    
    def blur_face(self, frame):
        # Detect face landmarks
        results = self.face_mesh.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
        
        if results.multi_face_landmarks:
            # Get face region
            face_region = self._get_face_region(frame, results)
            
            # Apply Gaussian blur
            blurred = cv2.GaussianBlur(face_region, (51, 51), 0)
            
            # Replace face region
            frame = self._replace_region(frame, blurred, results)
        
        return frame
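
The `_get_face_region` helper is not shown. One way to derive the region is a padded bounding box around the landmarks, sketched below in plain Python. It relies on MediaPipe Face Mesh reporting landmark `x` and `y` as fractions of the frame width and height; the function name, the padding value, and the return format are assumptions.

```python
from typing import List, Tuple


def face_bounding_box(landmarks: List[Tuple[float, float]],
                      frame_width: int, frame_height: int,
                      padding: float = 0.1) -> Tuple[int, int, int, int]:
    """Return (left, top, right, bottom) in pixels around normalized landmarks.

    `padding` widens the box by a fraction of its size on each side so the
    blur also covers hair and chin; the box is clamped to the frame.
    """
    xs = [x for x, _ in landmarks]
    ys = [y for _, y in landmarks]
    x_min, x_max = min(xs), max(xs)
    y_min, y_max = min(ys), max(ys)
    pad_x = (x_max - x_min) * padding
    pad_y = (y_max - y_min) * padding

    left = max(0, round((x_min - pad_x) * frame_width))
    top = max(0, round((y_min - pad_y) * frame_height))
    right = min(frame_width, round((x_max + pad_x) * frame_width))
    bottom = min(frame_height, round((y_max + pad_y) * frame_height))
    return left, top, right, bottom


box = face_bounding_box([(0.4, 0.3), (0.6, 0.3), (0.5, 0.7)], 640, 480, padding=0.0)
```

The resulting box can be used to slice the frame, for example `frame[top:bottom, left:right]`, before blurring and writing the region back.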

10.2 Data Privacy

Data Type	Storage	Privacy
Video frames	None	Processed in-memory only
Face images	None	Auto-blurred
Hand landmarks	Optional	Anonymized
Session data	Local JSON	User-owned
Model weights	HuggingFace	Open

11. Deployment Architecture

11.1 Development Setup

┌─────────────────────────────────────────────────────────────────┐
│                      DEVELOPMENT                                  │
│                                                                  │
│  Terminal 1:                 Terminal 2:                        │
│  ┌─────────────────┐         ┌─────────────────┐               │
│  │ cd backend      │         │ cd frontend     │               │
│  │ python run.py   │         │ npm run dev     │               │
│  │                 │         │                 │               │
│  │ Flask :5001     │         │ Vite    :5173   │               │
│  └────────┬────────┘         └────────┬────────┘               │
└───────────┼───────────────────────────┼─────────────────────────┘
            │                           │
            │           ┌───────────────┘
            │           │
            ▼           ▼
┌─────────────────────────────────────────────────────────────────┐
│                    BROWSER (localhost)                            │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐  │
│  │  Frontend (:5173) <─────── Proxy ───────> Backend (:5001)│  │
│  └─────────────────────────────────────────────────────────┘  │
└─────────────────────────────────────────────────────────────────┘

11.2 Production Setup

                          ┌─────────────────┐
                          │   Load Balancer  │
                          └────────┬────────┘
                                   │
              ┌────────────────────┼────────────────────┐
              │                    │                    │
              ▼                    ▼                    ▼
      ┌───────────────┐    ┌───────────────┐    ┌───────────────┐
      │  Flask Worker │    │  Flask Worker │    │  Flask Worker │
      │    (:5001)    │    │    (:5001)    │    │    (:5001)    │
      └───────────────┘    └───────────────┘    └───────────────┘
              │                    │                    │
              └────────────────────┼────────────────────┘
                                   │
                                   ▼
                          ┌─────────────────┐
                          │   Redis Cache   │
                          └────────┬────────┘
                                   │
                                   ▼
                          ┌─────────────────┐
                          │   PostgreSQL    │
                          └─────────────────┘

11.3 HuggingFace Model Hosting

┌─────────────────────────────────────────────────────────────────┐
│                     HuggingFace Hub                              │
│                                                                  │
│  ┌─────────────────────────────────────────────────────────┐   │
│  │              namish10/contextflow-rl                     │   │
│  │                                                          │   │
│  │  checkpoint.pkl        ← Trained RL model            │   │
│  │  train_rl.py            ← Training script            │   │
│  │  feature_extractor.py    ← State extraction           │   │
│  │  online_learning.py      ← Continuous learning        │   │
│  │  data_collector.py      ← Real data collection        │   │
│  │  multimodal_detection.py ← Audio/biometric fusion      │   │
│  │  demo.ipynb             ← Interactive demo            │   │
│  │  RESEARCH_PAPER.md      ← Full documentation          │   │
│  │                                                          │   │
│  │  app/ (9 agents + API)                                │   │
│  │  frontend/ (React UI)                                  │   │
│  └─────────────────────────────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────┘

Summary

ContextFlow is a comprehensive system combining:

Predictive AI - RL-based doubt prediction before confusion occurs
Multi-Agent Architecture - 9 specialized agents coordinated by a central orchestrator
Gesture Recognition - Privacy-first MediaPipe hand detection
Multi-Modal Sensing - Audio + Biometric + Behavioral fusion
Browser-Based AI - Direct AI chat launching without API keys
Continuous Learning - Online learning from user feedback

The system is production-ready, with the API endpoints listed in section 8.1 working, a complete agent network, and a trained RL model available on HuggingFace.