cosmicmicra
/

mathlingua-spec


cosmicmicra committed 11 days ago

Commit

3bc409d


1 Parent(s): 6ecd5b7

Add system architecture document


system_architecture.md +925 -0


	@@ -0,0 +1,925 @@

+# MathLingua — System Architecture Document
+## 1. System Overview
+MathLingua is a bilingual adaptive math tutoring application for Spanish-speaking students (grades 6–8) transitioning to English-medium mathematics education. The system presents math word problems with 4 scaffolded hint levels and uses a hybrid adaptive algorithm to personalize difficulty progression.
+```
+┌─────────────────────────────────────────────────────────────────────┐
+│                        MathLingua System                            │
+│                                                                     │
+│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┐    │
+│  │   Frontend    │   │   Backend    │   │   External Services  │    │
+│  │  (Next.js)   │◄─►│  (Firebase)  │◄─►│  (LLM / SLM)        │    │
+│  └──────┬───────┘   └──────┬───────┘   └──────────┬───────────┘    │
+│         │                  │                       │                │
+│         ▼                  ▼                       ▼                │
+│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┐    │
+│  │  Adaptive    │   │  Firestore   │   │  V1: Gemini API      │    │
+│  │  Engine      │   │  Database    │   │  V2: Qwen2.5-3B SLM  │    │
+│  │  (Client JS) │   │              │   │  (HF Inference EP)   │    │
+│  └──────────────┘   └──────────────┘   └──────────────────────┘    │
+└─────────────────────────────────────────────────────────────────────┘
+```
+---
+## 2. Component Architecture
+### 2.1 Frontend — React / Next.js Application
+**Technology**: Next.js 14+ (App Router), TypeScript, Tailwind CSS
+**Hosting**: Firebase Hosting or Vercel
+#### Key Pages/Routes
+| Route | Component | Purpose |
+|---|---|---|
+| `/` | `LandingPage` | Login/signup, language preference |
+| `/dashboard` | `StudentDashboard` | Progress overview, session history, MCS/LDS charts |
+| `/practice` | `PracticeSession` | Adaptive practice from question database |
+| `/solve` | `CustomProblem` | "Input your question" — Gemini/SLM processes user-submitted problems |
+| `/session-report` | `SessionReport` | End-of-session summary with performance analytics |
+#### Core Frontend Components
+```
+src/
+├── components/
+│   ├── ProblemDisplay/
+│   │   ├── MathProblem.tsx          # Renders word problem text
+│   │   ├── HintScaffold.tsx         # L1/L2/L3/L4 progressive hint UI
+│   │   ├── AnswerInput.tsx          # Numeric/expression answer entry
+│   │   └── SolutionReveal.tsx       # L4 step-by-step solution display
+│   ├── Adaptive/
+│   │   ├── DifficultyIndicator.tsx  # Visual current-level indicator
+│   │   ├── ProgressBar.tsx          # Session progress (e.g., 7/20)
+│   │   └── SessionTimer.tsx         # Time tracking per problem
+│   ├── Dashboard/
+│   │   ├── EloChart.tsx             # Elo rating over time (Recharts)
+│   │   ├── TopicHeatmap.tsx         # Performance by math topic
+│   │   ├── LDSMCSPanel.tsx          # Language Dependency & Math Confidence
+│   │   └── StreakBadge.tsx          # Gamification elements
+│   └── Shared/
+│       ├── BilingualToggle.tsx      # EN/ES interface language switch
+│       ├── MathRenderer.tsx         # KaTeX for math expressions
+│       └── LoadingSkeleton.tsx
+├── lib/
+│   ├── adaptive-engine.ts           # Elo + BKT + Thompson Sampling (client-side)
+│   ├── feature-engineer.ts          # LDS & MCS computation
+│   ├── firebase.ts                  # Firebase SDK initialization
+│   └── llm-client.ts               # Gemini/SLM API abstraction
+├── hooks/
+│   ├── useAdaptiveSession.ts        # Manages session state + engine calls
+│   ├── useStudentProfile.ts         # Reads/writes Firestore student state
+│   └── useQuestionQueue.ts          # Pre-fetches next batch of questions
+└── types/
+    └── index.ts                     # TypeScript interfaces for all data structures
+```
+#### Hint Scaffold UI Flow
+```
+┌─────────────────────────────────────┐
+│  Problem displayed in original      │
+│  English at student's current level │
+│                                     │
+│  [Try to solve]  [I need a hint →]  │
+└──────────────────────┬──────────────┘
+                       │ click
+                       ▼
+┌─────────────────────────────────────┐
+│  L1: Simplified English             │
+│  "A store has 24 apples..."         │
+│                                     │
+│  [Got it!]  [Still stuck →]         │
+└──────────────────────┬──────────────┘
+                       │ click
+                       ▼
+┌─────────────────────────────────────┐
+│  L2: Bilingual Keywords Inline      │
+│  "A store has 24 apples (manzanas)" │
+│  "divided equally (dividido         │
+│   igualmente) among 6 boxes"        │
+│                                     │
+│  [Got it!]  [Still stuck →]         │
+└──────────────────────┬──────────────┘
+                       │ click
+                       ▼
+┌─────────────────────────────────────┐
+│  L3: Full Spanish Translation       │
+│  "Una tienda tiene 24 manzanas      │
+│   divididas igualmente entre 6      │
+│   cajas. ¿Cuántas manzanas hay      │
+│   en cada caja?"                    │
+│                                     │
+│  [Got it!]  [Show me the answer →]  │
+└──────────────────────┬──────────────┘
+                       │ click
+                       ▼
+┌─────────────────────────────────────┐
+│  L4: Step-by-Step Solution          │
+│  Step 1: Identify — 24 ÷ 6         │
+│  Step 2: Calculate — 24 ÷ 6 = 4    │
+│  Step 3: Answer — 4 apples per box  │
+│                                     │
+│  [Next Problem →]                   │
+└─────────────────────────────────────┘
+```
+Each hint interaction is logged with a timestamp so that `escalation_speed` and `scaffold_time_ratio` can be computed for the LDS formula.
+---
+### 2.2 Adaptive Engine (Client-Side JavaScript)
+The adaptive engine runs **entirely in the browser**, so difficulty decisions need no server round-trip. Feedback is immediate, and the engine keeps working offline once the initial question batch has loaded.
+#### Engine Components
+```
+┌─────────────────────────────────────────────────┐
+│              Adaptive Engine (client-side)        │
+│                                                   │
+│  ┌─────────────┐  ┌──────────┐  ┌────────────┐  │
+│  │  Elo Rating  │  │   BKT    │  │  Thompson  │  │
+│  │   System     │  │  Engine  │  │  Sampler   │  │
+│  │             │  │          │  │            │  │
+│  │ Updates     │  │ P(know)  │  │ Beta prior │  │
+│  │ student &   │  │ per      │  │ per level, │  │
+│  │ question    │  │ topic    │  │ ZPD window │  │
+│  │ ratings     │  │          │  │            │  │
+│  └──────┬──────┘  └────┬─────┘  └─────┬──────┘  │
+│         │              │              │          │
+│         ▼              ▼              ▼          │
+│  ┌───────────────────────────────────────────┐   │
+│  │         Decision Orchestrator             │   │
+│  │                                           │   │
+│  │  Input: weighted_outcome, features        │   │
+│  │  Output: next_level, decision_type        │   │
+│  │         (increase/maintain/decrease)       │   │
+│  └───────────────────────────────────────────┘   │
+└─────────────────────────────────────────────────┘
+```
+#### Elo Update Formula
+```
+weighted_outcome = {
+    no_hint:  1.00 (solved without any scaffold)
+    L1_only:  0.75 (needed simplified English)
+    L2_used:  0.50 (needed bilingual keywords)
+    L3_used:  0.25 (needed full translation)
+    L4_used:  0.00 (needed answer reveal)
+}
+E_student = 1 / (1 + 10^((R_question - R_student) / 400))
+R_student_new = R_student + K × (weighted_outcome - E_student)
+K = 32 (default), increased to 48 for first 10 interactions (cold-start acceleration)
+```
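The update above can be sketched in TypeScript as follows. This is a minimal illustration, not the project's `adaptive-engine.ts`: the function names are hypothetical, and reading "first 10 interactions" as `totalInteractions < 10` is an assumption. Only the student-side update is shown.

```typescript
// Hypothetical helper names; only the formulas and constants come from the spec.
// Index = deepest hint level used (0 = no hint, 4 = answer reveal).
const HINT_OUTCOMES = [1.0, 0.75, 0.5, 0.25, 0.0];

// Maps the deepest hint level used to weighted_outcome.
function weightedOutcomeFor(maxHintLevel: number): number {
  const outcome = HINT_OUTCOMES[maxHintLevel];
  if (outcome === undefined) {
    throw new RangeError(`invalid hint level: ${maxHintLevel}`);
  }
  return outcome;
}

// E_student = 1 / (1 + 10^((R_question - R_student) / 400))
function expectedScore(studentElo: number, questionElo: number): number {
  return 1 / (1 + Math.pow(10, (questionElo - studentElo) / 400));
}

// K = 48 during cold start (assumed: fewer than 10 prior interactions), else 32.
function kFactor(totalInteractions: number): number {
  return totalInteractions < 10 ? 48 : 32;
}

// R_student_new = R_student + K * (weighted_outcome - E_student)
function updateStudentElo(
  studentElo: number,
  questionElo: number,
  weightedOutcome: number,
  totalInteractions: number
): number {
  const expected = expectedScore(studentElo, questionElo);
  return studentElo + kFactor(totalInteractions) * (weightedOutcome - expected);
}
```

For example, a student rated 1000 who solves an equally rated question without hints moves to 1016 under the default K, or to 1024 during the cold-start phase.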
+#### BKT Parameters (per topic)
+| Parameter | Symbol | Default | Description |
+|---|---|---|---|
+| Prior knowledge | P(L₀) | 0.10 | Initial probability student knows topic |
+| Learn rate | P(T) | 0.15 | Probability of learning per opportunity |
+| Slip | P(S) | 0.10 | Probability of incorrect despite knowing |
+| Guess | P(G) | 0.25 | Probability of correct despite not knowing |
+Slip is adjusted based on hint usage:
+```
+P(S)_adjusted = P(S) × (1 + 0.5 × hint_depth_normalized)
+```
+Raising the slip probability with hint depth reduces how strongly a correct answer raises P(know): correctness reached with more scaffolding is weaker evidence of mastery.
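A single BKT step with the hint-adjusted slip might look like the sketch below. It uses the standard BKT posterior and learning transition. The definition `hint_depth_normalized = maxHintLevel / 4` is an assumption (the spec does not define it), and the names `adjustedSlip` and `bktUpdate` are hypothetical.

```typescript
interface BktParams {
  pLearn: number; // P(T)
  pSlip: number;  // P(S)
  pGuess: number; // P(G)
}

// Defaults taken from the parameter table above.
const DEFAULT_BKT: BktParams = { pLearn: 0.15, pSlip: 0.10, pGuess: 0.25 };

// Assumption: hint_depth_normalized = maxHintLevel / 4, so it lies in [0, 1].
function adjustedSlip(pSlip: number, maxHintLevel: number): number {
  return pSlip * (1 + 0.5 * (maxHintLevel / 4));
}

// One BKT step: Bayesian posterior from the observation, then the learning transition.
function bktUpdate(
  pKnow: number,
  isCorrect: boolean,
  maxHintLevel: number,
  params: BktParams = DEFAULT_BKT
): number {
  const slip = adjustedSlip(params.pSlip, maxHintLevel);
  const guess = params.pGuess;
  const posterior = isCorrect
    ? (pKnow * (1 - slip)) / (pKnow * (1 - slip) + (1 - pKnow) * guess)
    : (pKnow * slip) / (pKnow * slip + (1 - pKnow) * (1 - guess));
  return posterior + (1 - posterior) * params.pLearn;
}
```

Starting from the default prior of 0.10, an unaided correct answer raises P(know) to roughly 0.393. The same correct answer after an L4 reveal raises it slightly less, because the adjusted slip is 0.15 instead of 0.10.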
+#### Thompson Sampling with ZPD Windowing
+```
+For each candidate level l in ZPD window [current - 2, current + 3]:
+    sample θ_l ~ Beta(α_l, β_l)
+    score_l = θ_l × proximity_bonus(l, target_elo)
+Select level = argmax(score_l)
+proximity_bonus(l, target) = exp(-0.5 × ((elo_l - target) / 100)²)
+```
+The ZPD window is asymmetric (+3 levels upward, -2 downward), which favors upward progression while keeping candidate levels close enough to the current one to avoid large difficulty jumps.
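The selection loop could be sketched as below. JavaScript has no built-in Beta sampler, so this sketch draws Beta variates from two Gamma draws (Marsaglia-Tsang method); that sampling technique is a choice made for this example, not something the spec prescribes. The `LevelArm` shape, the per-level `elo` field, and all function names are assumptions.

```typescript
type UniformRng = () => number; // returns a value in [0, 1)

interface LevelArm {
  level: string; // e.g. "2.1"
  elo: number;   // assumed representative Elo for the level
  alpha: number; // Beta prior successes
  beta: number;  // Beta prior failures
}

// Standard normal via Box-Muller; 1 - rng() keeps the log argument above zero.
function sampleNormal(rng: UniformRng): number {
  const u1 = 1 - rng();
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Gamma(shape, 1) via Marsaglia-Tsang, with the usual boost for shape < 1.
function sampleGamma(shape: number, rng: UniformRng): number {
  if (shape < 1) {
    return sampleGamma(shape + 1, rng) * Math.pow(1 - rng(), 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rng);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d * (1 - v + Math.log(v))) return d * v;
  }
}

// Beta(alpha, beta) as X / (X + Y) with X ~ Gamma(alpha), Y ~ Gamma(beta).
function sampleBeta(alpha: number, beta: number, rng: UniformRng): number {
  const x = sampleGamma(alpha, rng);
  const y = sampleGamma(beta, rng);
  return x / (x + y);
}

// proximity_bonus(l, target) = exp(-0.5 * ((elo_l - target) / 100)^2)
function proximityBonus(levelElo: number, targetElo: number): number {
  return Math.exp(-0.5 * Math.pow((levelElo - targetElo) / 100, 2));
}

// Candidates in [current - 2, current + 3], clipped to the available levels.
function zpdWindow<T>(levels: T[], currentIndex: number): T[] {
  const start = Math.max(0, currentIndex - 2);
  const end = Math.min(levels.length - 1, currentIndex + 3);
  return levels.slice(start, end + 1);
}

// Samples one score per candidate level and returns the argmax.
function selectLevel(
  levels: LevelArm[],
  currentIndex: number,
  targetElo: number,
  rng: UniformRng = Math.random
): LevelArm {
  let best: LevelArm | undefined;
  let bestScore = -Infinity;
  for (const arm of zpdWindow(levels, currentIndex)) {
    const score = sampleBeta(arm.alpha, arm.beta, rng) * proximityBonus(arm.elo, targetElo);
    if (score > bestScore) {
      bestScore = score;
      best = arm;
    }
  }
  if (!best) throw new RangeError("no candidate levels in ZPD window");
  return best;
}
```

Passing the random source as a parameter keeps the sampler testable with a seeded generator; in the browser the default `Math.random` would normally suffice.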
+#### Progression Decision Rules
+| Condition | Decision | Action |
+|---|---|---|
+| weighted_outcome ≥ 0.75 AND P(know) ≥ 0.70 | **Increase** | Move up 1 sub-level |
+| weighted_outcome ≥ 0.85 AND streak ≥ 3 | **Skip** | Move up 2 sub-levels |
+| 0.40 ≤ weighted_outcome < 0.75 | **Maintain** | Stay at current level |
+| weighted_outcome < 0.40 OR streak_wrong ≥ 2 | **Decrease** | Move down 1 sub-level |
+| weighted_outcome < 0.25 AND P(know) < 0.30 | **Rapid Decrease** | Move down 2 sub-levels |
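Several rows in the table overlap (for example, every Skip case with P(know) ≥ 0.70 also satisfies Increase), and an outcome of 0.75 or higher with P(know) below 0.70 matches no row. The sketch below resolves this by evaluating the more specific rule first and defaulting to Maintain. That precedence and default are assumptions, as is the function name `decideProgression`; the decision labels follow the `decision` field in the API contract.

```typescript
type Decision = "increase" | "maintain" | "decrease" | "skip" | "rapid_decrease";

interface DecisionInput {
  weightedOutcome: number; // 0.0-1.0
  pKnow: number;           // BKT P(know) for the topic
  streak: number;          // consecutive successes
  streakWrong: number;     // consecutive failures
}

// Assumed precedence: the stricter rule of each overlapping pair is checked first.
function decideProgression(input: DecisionInput): Decision {
  const { weightedOutcome, pKnow, streak, streakWrong } = input;
  if (weightedOutcome < 0.25 && pKnow < 0.30) return "rapid_decrease";
  if (weightedOutcome < 0.40 || streakWrong >= 2) return "decrease";
  if (weightedOutcome >= 0.85 && streak >= 3) return "skip";
  if (weightedOutcome >= 0.75 && pKnow >= 0.70) return "increase";
  // Covers 0.40 <= outcome < 0.75 and the unmatched high-outcome, low-P(know) case.
  return "maintain";
}
```

Because `weighted_outcome` only takes the values 0, 0.25, 0.5, 0.75, and 1, the Rapid Decrease row is reachable only after an L4 answer reveal.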
+---
+### 2.3 Firebase Backend
+**Services Used**:
+- Firebase Authentication (Google Sign-In, Email/Password)
+- Cloud Firestore (student state, question database, session logs)
+- Cloud Functions (LLM API calls, batch question generation, session reports)
+- Firebase Hosting (static frontend assets)
+#### Firestore Data Model
+```
+firestore/
+├── users/
+│   └── {uid}/
+│       ├── profile: {
+│       │     displayName, email, gradeLevel, preferredLanguage,
+│       │     createdAt, lastActive
+│       │   }
+│       ├── adaptiveState: {
+│       │     currentElo: number,         // e.g., 1050
+│       │     currentLevel: string,       // e.g., "2.1"
+│       │     totalInteractions: number,
+│       │     topicMastery: {             // BKT P(know) per topic
+│       │       "arithmetic": 0.72,
+│       │       "fractions": 0.45,
+│       │       "algebra_basic": 0.31,
+│       │       ...
+│       │     },
+│       │     thompsonPriors: {           // Beta(α,β) per level
+│       │       "1.1": { alpha: 12, beta: 3 },
+│       │       "1.2": { alpha: 8, beta: 5 },
+│       │       ...
+│       │     },
+│       │     featureAverages: {
+│       │       avgLDS: 0.42,
+│       │       avgMCS: 0.61,
+│       │       recentLDS_5: [0.3, 0.4, 0.5, 0.35, 0.45],
+│       │       recentMCS_5: [0.6, 0.65, 0.58, 0.62, 0.7]
+│       │     },
+│       │     streakCount: number,
+│       │     lastUpdated: timestamp
+│       │   }
+│       └── sessions/
+│           └── {sessionId}/
+│               ├── metadata: {
+│               │     startTime, endTime, questionsAttempted,
+│               │     questionsCorrect, avgWeightedOutcome,
+│               │     startElo, endElo, sessionLDS, sessionMCS
+│               │   }
+│               └── interactions/
+│                   └── {interactionId}: {
+│                         questionId, level, topic,
+│                         startTime, endTime, timeSpentMs,
+│                         hintsUsed: [0,1,2,3,4],  // which levels accessed
+│                         hintTimestamps: { L1: ts, L2: ts, ... },
+│                         maxHintLevel: number,
+│                         answer: string,
+│                         isCorrect: boolean,
+│                         attempts: number,
+│                         weightedOutcome: number,
+│                         lds: number,
+│                         mcs: number,
+│                         eloBeforeUpdate: number,
+│                         eloAfterUpdate: number,
+│                         adaptiveDecision: string
+│                       }
+│
+├── questions/
+│   └── {questionId}: {
+│         id, level, topic, subtopic,
+│         problemText, answer, answerNumeric,
+│         solutionSteps: [...],
+│         scaffolds: {
+│           L1_simplified: string,
+│           L2_bilingual: string,
+│           L3_spanish: string,
+│           L4_solution: string
+│         },
+│         readability: {
+│           fleschKincaid: number,
+│           wordCount: number,
+│           difficultWords: number,
+│           avgSyllables: number
+│         },
+│         eloRating: number,
+│         timesServed: number,
+│         avgOutcome: number,
+│         metadata: {
+│           source: "curated" | "generated",
+│           generatedBy: "gemini-2.0" | "qwen2.5-3b" | null,
+│           reviewedBy: string | null,
+│           createdAt: timestamp
+│         }
+│       }
+│
+├── questionIndex/                      // Denormalized for fast queries
+│   └── byLevel/
+│       └── {level}: {
+│             questionIds: [...],
+│             count: number
+│           }
+│
+└── analytics/                          // Aggregated (Cloud Functions)
+    ├── dailyStats/
+    │   └── {date}: { activeUsers, sessionsCompleted, ... }
+    └── cohortProgress/
+        └── {cohortId}: { avgElo, avgLDS, avgMCS, ... }
+```
+#### Firestore Security Rules
+```javascript
+rules_version = '2';
+service cloud.firestore {
+  match /databases/{database}/documents {
+    // Users can only read/write their own data
+    match /users/{uid}/{document=**} {
+      allow read, write: if request.auth != null && request.auth.uid == uid;
+    }
+    // Questions are readable by all authenticated users
+    match /questions/{questionId} {
+      allow read: if request.auth != null;
+      allow write: if false; // Only admin/Cloud Functions
+    }
+    // Question index readable by all authenticated users
+    match /questionIndex/{document=**} {
+      allow read: if request.auth != null;
+      allow write: if false;
+    }
+    // Analytics only accessible by admin
+    match /analytics/{document=**} {
+      allow read, write: if false; // Cloud Functions only
+    }
+  }
+}
+```
+---
+### 2.4 Cloud Functions (Serverless Backend)
+```
+functions/
+├── onUserCreate.ts          # Initialize adaptive state for new user
+├── generateScaffolds.ts     # Call Gemini/SLM to create L1-L4 for a problem
+├── batchGenerateQuestions.ts # Generate next 20 questions for session queue
+├── processCustomProblem.ts  # "Input your question" flow
+├── generateSessionReport.ts # End-of-session analytics
+├── updateQuestionStats.ts   # Update question difficulty from outcomes
+└── scheduledAnalytics.ts    # Daily aggregation (cron-triggered)
+```
+#### Key Cloud Function: `generateScaffolds`
+```typescript
+// Triggered when student submits a custom problem or when
+// pre-generating scaffolds for database questions
+interface ScaffoldRequest {
+  problemText: string;
+  studentGradeLevel: number;
+  currentLDS: number;  // Informs simplification level
+}
+interface ScaffoldResponse {
+  L1_simplified: string;   // Simplified English
+  L2_bilingual: string;    // English with inline Spanish keywords
+  L3_spanish: string;      // Full Spanish translation
+  L4_solution: string;     // Step-by-step solution
+  answer: string;
+  answerNumeric: number;
+}
+// Prompt template for LLM
+const SCAFFOLD_PROMPT = `
+You are a bilingual math tutor helping Spanish-speaking students
+(grades 6-8) learn math in English.
+Given this math word problem:
+"{problemText}"
+Generate 4 scaffold levels:
+**L1 (Simplified English):** Rewrite using shorter sentences,
+simpler vocabulary (grade {adjustedGrade} reading level).
+Keep all math content identical.
+**L2 (Bilingual Keywords):** Take the original problem and add
+Spanish translations in parentheses for key math and context
+vocabulary. Format: "English word (palabra en español)".
+**L3 (Full Spanish Translation):** Translate the complete problem
+to natural, grade-appropriate Spanish. Ensure mathematical
+precision is maintained.
+**L4 (Step-by-Step Solution):** Provide a clear, numbered
+step-by-step solution in English with the final numerical answer.
+Return as JSON with keys: L1_simplified, L2_bilingual, L3_spanish,
+L4_solution, answer, answerNumeric.
+`;
+```
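The template uses `{problemText}` and `{adjustedGrade}` as plain-text placeholders, so something has to substitute them before the prompt is sent. A minimal sketch follows. `fillTemplate` and `adjustedGrade` are hypothetical helpers, and the mapping from LDS to reading grade (up to two grades lower, floor of grade 3) is purely an illustrative assumption; the spec only says that `currentLDS` informs the simplification level.

```typescript
// Replaces {name} placeholders; unknown placeholders are left untouched.
function fillTemplate(template: string, values: Record<string, string | number>): string {
  return template.replace(/\{(\w+)\}/g, (match: string, key: string) =>
    Object.prototype.hasOwnProperty.call(values, key) ? String(values[key]) : match
  );
}

// Assumption: higher LDS (heavier reliance on language scaffolds) lowers the
// target reading grade by up to two levels, never below grade 3.
function adjustedGrade(studentGradeLevel: number, currentLDS: number): number {
  return Math.max(3, studentGradeLevel - Math.round(2 * currentLDS));
}

// Example: builds the text for a short stand-in template.
const examplePrompt = fillTemplate(
  'Problem: "{problemText}" (grade {adjustedGrade} reading level)',
  { problemText: "A store has 24 apples...", adjustedGrade: adjustedGrade(7, 0.5) }
);
```

Leaving unknown placeholders untouched makes a missing value visible in the final prompt instead of silently inserting `undefined`.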
+#### Key Cloud Function: `batchGenerateQuestions`
+```typescript
+// Called when student reaches question 17 of 20 (prefetch trigger)
+// Selects next 20 questions from database based on adaptive state
+import { onCall, HttpsError } from "firebase-functions/v2/https";
+// getAdaptiveState, thompsonSampleBatch, getRecentQuestionIds, selectQuestions,
+// ensureScaffoldsGenerated, and generateId are project helpers defined elsewhere.
+export const batchGenerateQuestions = onCall(async (request) => {
+  // request.auth is undefined when the caller is not signed in
+  const uid = request.auth?.uid;
+  if (!uid) {
+    throw new HttpsError("unauthenticated", "Sign-in required");
+  }
+  const state = await getAdaptiveState(uid);
+  // Thompson Sampling selects level distribution for next batch
+  const levelDistribution = thompsonSampleBatch(
+    state.thompsonPriors,
+    state.currentLevel,
+    { batchSize: 20 }
+  );
+  // e.g., { "2.1": 5, "2.2": 8, "2.3": 5, "2.4": 2 }
+  // Select questions avoiding recently served ones
+  const recentIds = await getRecentQuestionIds(uid, { lookback: 100 });
+  const questions = await selectQuestions(levelDistribution, {
+    excludeIds: recentIds,
+    topicBalance: state.topicMastery, // Favor weaker topics
+  });
+  // Ensure all questions have scaffolds generated
+  const withScaffolds = await ensureScaffoldsGenerated(questions);
+  return { questions: withScaffolds, sessionBatchId: generateId() };
+});
+```
+---
+### 2.5 LLM Service Layer
+#### V1: Gemini API (Current)
+```
+┌────────────┐     HTTPS/REST     ┌──────────────────┐
+│  Cloud      │ ──────────────────►│  Google Gemini    │
+│  Function   │ ◄──────────────────│  2.0 Flash API    │
+└────────────┘                    └──────────────────┘
+Cost: ~$0.075 per 1M input tokens, ~$0.30 per 1M output tokens
+Latency: 200-800ms per scaffold generation
+Rate limit: 60 RPM (free tier), 1000 RPM (paid)
+```
+#### V2: Qwen2.5-3B SLM (Planned)
+```
+┌────────────┐     HTTPS/REST     ┌──────────────────────────┐
+│  Cloud      │ ──────────────────►│  HF Inference Endpoint   │
+│  Function   │ ◄──────────────────│  Qwen2.5-3B-Instruct     │
+└────────────┘                    │  (QLoRA fine-tuned)       │
+                                  │  GPU: T4 or L4            │
+                                  └──────────────────────────┘
+Cost: ~$0.60/hr (T4) or ~$1.04/hr (L4)
+Latency: 100-400ms per scaffold generation
+Rate limit: Unlimited (dedicated endpoint)
+```
+#### LLM Client Abstraction
+```typescript
+// lib/llm-client.ts — Provider-agnostic interface
+interface LLMProvider {
+  generateScaffolds(problem: string, context: ScaffoldContext): Promise<ScaffoldResponse>;
+  generateQuestion(level: string, topic: string): Promise<QuestionWithScaffolds>;
+  validateAnswer(problem: string, studentAnswer: string, correctAnswer: string): Promise<AnswerValidation>;
+}
+class GeminiProvider implements LLMProvider { ... }
+class QwenSLMProvider implements LLMProvider { ... }
+// Factory with fallback
+function createLLMClient(): LLMProvider {
+  if (config.useSLM && config.slmEndpointAvailable) {
+    return new QwenSLMProvider(config.slmEndpoint);
+  }
+  return new GeminiProvider(config.geminiApiKey);
+}
+```
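The factory above chooses a provider once, at construction time. If a per-request fallback is also wanted (for instance when the SLM endpoint is cold), one option is a wrapper that tries the primary provider and switches to the secondary on error or timeout. The sketch below is an assumption about how that could look, reduced to a single method; `ScaffoldGenerator`, `FallbackGenerator`, and the 2000 ms default timeout are all hypothetical.

```typescript
interface ScaffoldResult {
  L1_simplified: string;
  L2_bilingual: string;
  L3_spanish: string;
  L4_solution: string;
}

// Reduced stand-in for the LLMProvider interface (one method only).
interface ScaffoldGenerator {
  generateScaffolds(problem: string): Promise<ScaffoldResult>;
}

// Tries the primary provider; falls back to the secondary on error or timeout.
class FallbackGenerator implements ScaffoldGenerator {
  private readonly primary: ScaffoldGenerator;
  private readonly secondary: ScaffoldGenerator;
  private readonly timeoutMs: number;

  constructor(primary: ScaffoldGenerator, secondary: ScaffoldGenerator, timeoutMs = 2000) {
    this.primary = primary;
    this.secondary = secondary;
    this.timeoutMs = timeoutMs;
  }

  async generateScaffolds(problem: string): Promise<ScaffoldResult> {
    let timer: ReturnType<typeof setTimeout> | undefined;
    const timeout = new Promise<never>((_, reject) => {
      timer = setTimeout(() => reject(new Error("primary provider timed out")), this.timeoutMs);
    });
    try {
      return await Promise.race([this.primary.generateScaffolds(problem), timeout]);
    } catch {
      return this.secondary.generateScaffolds(problem);
    } finally {
      // Always clear the timer so it cannot fire after the call has settled.
      if (timer !== undefined) clearTimeout(timer);
    }
  }
}
```

A wrapper like this would typically be built as `new FallbackGenerator(slmProvider, geminiProvider)`, so the calling code does not need to know which backend answered.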
+---
+### 2.6 SLM Fine-Tuning Pipeline
+```
+┌──────────────┐    ┌──────────────┐    ┌──────────────────┐
+│  Training     │    │  Fine-Tune   │    │  Deploy          │
+│  Data Prep    │───►│  QLoRA SFT   │───►│  HF Inference EP │
+└──────────────┘    └──────────────┘    └──────────────────┘
+Step 1: Collect 2,000-5,000 scaffold examples from Gemini V1 usage
+Step 2: Human review + quality filter → ~1,500 gold examples
+Step 3: QLoRA fine-tune Qwen2.5-3B-Instruct
+Step 4: Evaluate on held-out test set (BLEU, math accuracy, readability)
+Step 5: Deploy to HF Inference Endpoint
+Step 6: Shadow-test alongside Gemini (serve both, compare quality)
+Step 7: Full cutover when SLM matches Gemini quality
+```
+**Fine-tuning Configuration:**
+| Parameter | Value | Rationale |
+|---|---|---|
+| Base model | Qwen2.5-3B-Instruct | Best math+Spanish at 3B scale |
+| Method | QLoRA (4-bit NF4) | Fits single 16GB GPU |
+| LoRA rank (r) | 32 | Balance quality/efficiency for small dataset |
+| LoRA alpha | 64 | Standard 2× rank |
+| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | Full attention + MLP |
+| Learning rate | 2e-4 | Standard for QLoRA |
+| Epochs | 3-5 | Small dataset, monitor val loss |
+| Batch size | 4 (effective 16 with grad accum) | Memory constraint |
+| Max sequence length | 1024 | Sufficient for problem + all 4 scaffolds |
+| Warmup ratio | 0.05 | Short warmup for small dataset |
+---
+## 3. Data Flow Diagrams
+### 3.1 Flow A: "Practice Problems" Mode
+```
+Student clicks "Start Practice"
+         │
+         ▼
+┌─────────────────────────────────┐
+│ 1. Load adaptive state from     │
+│    Firestore (Elo, BKT, priors) │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 2. Thompson Sampling selects    │
+│    next question level          │
+│    (ZPD window: current -2/+3)  │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 3. Fetch question from Firestore│
+│    by level + topic balancing   │
+│    (avoid recently served)      │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 4. Display problem, start timer │
+│    Student reads and attempts   │
+└────────────┬────────────────────┘
+             │
+    ┌────────┴────────┐
+    │  Needs hints?   │
+    ▼ No              ▼ Yes
+┌─────────┐   ┌───────────────────┐
+│ Submit  │   │ L1 → L2 → L3 → L4│
+│ answer  │   │ (each click logged │
+└────┬────┘   │  with timestamp)   │
+     │        └────────┬───────────┘
+     │                 │
+     ▼                 ▼
+┌─────────────────────────────────┐
+│ 5. Compute weighted_outcome     │
+│    based on correctness + hints │
+│    Compute LDS and MCS          │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 6. Update Elo (student + Q)     │
+│    Update BKT P(know) for topic │
+│    Update Thompson Beta priors  │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 7. Progression decision:        │
+│    increase / maintain / decrease│
+│    Select next level            │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 8. Save interaction to Firestore│
+│    Display "Next Problem"       │
+└────────────┬────────────────────┘
+             │
+             ▼
+    ┌────────┴────────┐
+    │ Q17 of 20?      │
+    ▼ Yes             ▼ No
+┌─────────────┐   ┌──────────┐
+│ Prefetch    │   │ Loop to  │
+│ next batch  │   │ step 2   │
+│ (Cloud Fn)  │   └──────────┘
+└─────────────┘
+             │
+    At Q20:  ▼
+┌─────────────────────────────────┐
+│ 9. Generate session report      │
+│    (Cloud Function)             │
+│    Show summary to student      │
+└─────────────────────────────────┘
+```
+### 3.2 Flow B: "Input Your Question" Mode
+```
+Student types/pastes a math word problem
+         │
+         ▼
+┌─────────────────────────────────┐
+│ 1. Cloud Function:              │
+│    processCustomProblem         │
+│    - Validate it's a math       │
+│      word problem               │
+│    - Extract answer/solution    │
+│    - Call Gemini/SLM to generate│
+│      L1, L2, L3, L4 scaffolds  │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 2. Estimate difficulty level    │
+│    using readability metrics    │
+│    (FK grade, word count, etc.) │
+│    Map to nearest Elo rating    │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 3. Display problem with         │
+│    scaffold buttons active      │
+│    (same UI as Practice mode)   │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 4. Student interacts, solves    │
+│    Same hint tracking as        │
+│    Practice mode                │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 5. Update adaptive state        │
+│    (Elo, BKT, Thompson)         │
+│    Log interaction               │
+└────────────┬────────────────────┘
+             │
+             ▼
+┌─────────────────────────────────┐
+│ 6. Offer: "Try another?" or     │
+│    "Switch to Practice Mode"    │
+│    (where engine auto-selects)  │
+└─────────────────────────────────┘
+```
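Step 2 of Flow B relies on readability metrics. The Flesch-Kincaid grade-level formula used below is the standard one; how the result is mapped to an Elo rating is not specified, so the nearest-anchor lookup against a caller-supplied table is an assumption, and the names `fleschKincaidGrade`, `EloAnchor`, and `nearestElo` are hypothetical.

```typescript
// Flesch-Kincaid grade level computed from raw counts.
function fleschKincaidGrade(words: number, sentences: number, syllables: number): number {
  if (words <= 0 || sentences <= 0) {
    throw new RangeError("words and sentences must be positive");
  }
  return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59;
}

// One calibration point: a reading grade paired with a question Elo.
interface EloAnchor {
  fkGrade: number;
  elo: number;
}

// Returns the Elo of the anchor whose reading grade is closest to the estimate.
function nearestElo(grade: number, anchors: EloAnchor[]): number {
  let best: EloAnchor | undefined;
  for (const anchor of anchors) {
    if (!best || Math.abs(anchor.fkGrade - grade) < Math.abs(best.fkGrade - grade)) {
      best = anchor;
    }
  }
  if (!best) throw new RangeError("anchors must not be empty");
  return best.elo;
}
```

A 100-word problem with 10 sentences and 150 syllables comes out at about grade 6.0. The anchor table would have to come from the curated question database, since readability alone says nothing about mathematical difficulty.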
+---
+## 4. API Contracts
+### 4.1 Client → Cloud Functions
+```typescript
+// POST /generateScaffolds
+interface GenerateScaffoldsRequest {
+  problemText: string;
+  gradeLevel: number;          // 6, 7, or 8
+  currentLDS: number;          // 0.0-1.0, informs simplification
+}
+interface GenerateScaffoldsResponse {
+  scaffolds: {
+    L1_simplified: string;
+    L2_bilingual: string;
+    L3_spanish: string;
+    L4_solution: string;
+  };
+  answer: string;
+  answerNumeric: number;
+  estimatedLevel: string;      // e.g., "2.3"
+  estimatedElo: number;        // e.g., 1100
+  processingTimeMs: number;
+}
+// POST /batchGenerateQuestions
+interface BatchRequest {
+  batchSize: number;           // default 20
+  // Auth token provides uid → adaptive state looked up server-side
+}
+interface BatchResponse {
+  questions: QuestionWithScaffolds[];
+  sessionBatchId: string;
+}
+// POST /submitInteraction
+interface InteractionSubmission {
+  sessionId: string;
+  questionId: string;
+  answer: string;
+  isCorrect: boolean;
+  timeSpentMs: number;
+  hintsUsed: number[];         // [0], [0,1], [0,1,2], etc.
+  hintTimestamps: Record<string, number>;
+  attempts: number;
+}
+interface InteractionResponse {
+  weightedOutcome: number;
+  lds: number;
+  mcs: number;
+  newElo: number;
+  newLevel: string;
+  decision: "increase" | "maintain" | "decrease" | "skip" | "rapid_decrease";
+  nextQuestion: QuestionWithScaffolds;  // Pre-selected
+}
+// POST /generateSessionReport
+interface SessionReportRequest {
+  sessionId: string;
+}
+interface SessionReportResponse {
+  summary: {
+    questionsAttempted: number;
+    questionsCorrect: number;
+    avgWeightedOutcome: number;
+    eloChange: number;
+    topicsStrong: string[];
+    topicsWeak: string[];
+    avgLDS: number;
+    avgMCS: number;
+    languageProgressNote: string;  // Generated text about L2 progress
+  };
+  recommendations: string[];      // e.g., "Focus on fractions vocabulary"
+}
+```
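Since `/submitInteraction` accepts client-computed fields, a server-side sanity check before any state update seems prudent. The sketch below validates a reduced version of `InteractionSubmission`. The specific rules (non-negative time, at least one attempt, `hintsUsed` being a prefix of `[0, 1, 2, 3, 4]`) are inferred from the field comments above and should be read as assumptions; `validateInteraction` is a hypothetical name.

```typescript
// Subset of InteractionSubmission that can be checked without other context.
interface InteractionLike {
  timeSpentMs: number;
  hintsUsed: number[]; // [0], [0,1], [0,1,2], ...
  attempts: number;
}

// Returns a list of problems; an empty list means the payload looks consistent.
function validateInteraction(s: InteractionLike): string[] {
  const errors: string[] = [];
  if (!Number.isFinite(s.timeSpentMs) || s.timeSpentMs < 0) {
    errors.push("timeSpentMs must be a non-negative number");
  }
  if (!Number.isInteger(s.attempts) || s.attempts < 1) {
    errors.push("attempts must be a positive integer");
  }
  const isPrefix =
    s.hintsUsed.length >= 1 &&
    s.hintsUsed.length <= 5 &&
    s.hintsUsed.every((level, i) => level === i);
  if (!isPrefix) {
    errors.push("hintsUsed must be a prefix of [0, 1, 2, 3, 4] starting at 0");
  }
  return errors;
}
```

Returning all problems at once, instead of throwing on the first, makes it easier to log malformed submissions in one place.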
+---
+## 5. Deployment Architecture
+### 5.1 V1 Deployment (MVP)
+```
+┌──────────────────────────────────────────────────────────┐
+│                    Firebase Project                        │
+│                                                            │
+│  ┌─────────────┐  ┌─────────────┐  ┌──────────────────┐  │
+│  │  Firebase    │  │  Cloud       │  │  Cloud           │  │
+│  │  Hosting     │  │  Firestore   │  │  Functions       │  │
+│  │  (Next.js)   │  │  (Database)  │  │  (Node.js 20)   │  │
+│  │              │  │              │  │                  │  │
+│  │  Static +    │  │  Student     │  │  LLM calls       │  │
+│  │  SSR pages   │  │  state,      │  │  Batch gen       │  │
+│  │              │  │  questions,  │  │  Reports         │  │
+│  │              │  │  sessions    │  │                  │  │
+│  └──────────────┘  └──────────────┘  └───────┬──────────┘  │
+│                                              │             │
+└──────────────────────────────────────────────┼─────────────┘
+                                               │
+                                    HTTPS      │
+                                               ▼
+                                  ┌──────────────────┐
+                                  │  Google Gemini    │
+                                  │  2.0 Flash API    │
+                                  └──────────────────┘
+Estimated monthly cost (100 students, 5 sessions/week):
+- Firebase Hosting: Free tier (~$0)
+- Firestore: ~$5/mo (reads/writes within free tier mostly)
+- Cloud Functions: ~$10/mo (invocations + compute)
+- Gemini API: ~$15-25/mo (scaffold generation)
+- Total: ~$30-40/mo
+```
+### 5.2 V2 Deployment (SLM)
+```
+┌──────────────────────────────────────────────────────────┐
+│                    Firebase Project                        │
+│                                                            │
+│  ┌─────────────┐  ┌─────────────┐  ┌──────────────────┐  │
+│  │  Firebase    │  │  Cloud       │  │  Cloud           │  │
+│  │  Hosting     │  │  Firestore   │  │  Functions       │  │
+│  └──────────────┘  └──────────────┘  └───────┬──────────┘  │
+│                                              │             │
+└──────────────────────────────────────────────┼─────────────┘
+                                               │
+                              ┌─────────────────┼──────────────┐
+                              │                 │              │
+                              ▼                 ▼              │
+                   ┌──────────────────┐  ┌──────────────┐     │
+                   │  HF Inference    │  │  Gemini API  │     │
+                   │  Endpoint        │  │  (fallback)  │     │
+                   │  Qwen2.5-3B     │  └──────────────┘     │
+                   │  QLoRA FT       │                        │
+                   │  (T4 GPU)       │  Shadow testing:       │
+                   └──────────────────┘  Both called, SLM     │
+                                         response served,     │
+                                         Gemini response      │
+                                         logged for QA        │
+                                         ─────────────────────┘
+Estimated monthly cost (100 students):
+- Firebase: ~$15/mo (same as V1)
+- HF Inference Endpoint (T4, scale-to-zero): ~$50-100/mo
+  (active only during school hours, ~8hrs/day × 20 days)
+- Gemini fallback: ~$5/mo (only when SLM is cold)
+- Total: ~$70-120/mo (but no per-token costs at scale)
+```
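The endpoint estimate can be checked with simple arithmetic using the hourly rates from section 2.5 and the stated schedule of about 8 hours a day for 20 days. The helper name is hypothetical, and the calculation ignores spin-up time and any idle period before scale-to-zero takes effect.

```typescript
// Monthly cost of a dedicated endpoint that is billed only while running.
function monthlyEndpointCost(hourlyRateUsd: number, hoursPerDay: number, daysPerMonth: number): number {
  return hourlyRateUsd * hoursPerDay * daysPerMonth;
}

const t4Monthly = monthlyEndpointCost(0.60, 8, 20); // about 96 USD
const l4Monthly = monthlyEndpointCost(1.04, 8, 20); // about 166.40 USD
```

At the quoted T4 rate the full school-hours schedule lands near the top of the $50-100 range, while an L4 on the same schedule would exceed it.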
+### 5.3 V3 Deployment (Scale)
+```
+When the student count exceeds 500, migrate to:
+┌─────────────────────────────────────────────────────────────┐
+│                                                             │
+│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
+│  │  Vercel       │  │  Firebase    │  │  Cloud Run       │  │
+│  │  (Next.js)    │  │  Firestore   │  │  (API server)    │  │
+│  └──────────────┘  └──────────────┘  └───────┬──────────┘  │
+│                                              │             │
+│                    ┌─────────────────────────┼──────┐      │
+│                    │                         │      │      │
+│                    ▼                         ▼      │      │
+│         ┌──────────────────┐    ┌─────────────────┐ │      │
+│         │  HF Inference EP │    │  IRT/DKT Model  │ │      │
+│         │  Qwen2.5-3B     │    │  Server          │ │      │
+│         │  (Auto-scaling)  │    │  (Python/FastAPI)│ │      │
+│         └──────────────────┘    └─────────────────┘ │      │
+│                                                     │      │
+│  + Deep Knowledge Tracing (DKT) replaces BKT        │      │
+│  + IRT item calibration from pooled student data     │      │
+│  + A/B testing framework for algorithm improvements  │      │
+└─────────────────────────────────────────────────────────────┘
+```
+---
+## 6. Technology Stack Summary
+| Layer | Technology | Justification |
+|---|---|---|
+| Frontend Framework | Next.js 14+ (App Router) | SSR for SEO, React ecosystem, TypeScript |
+| UI Styling | Tailwind CSS + shadcn/ui | Rapid prototyping, consistent design |
+| Math Rendering | KaTeX | Fast client-side LaTeX rendering |
+| Charts | Recharts | React charting library for dashboards |
+| Authentication | Firebase Auth | Google Sign-In, simple integration |
+| Database | Cloud Firestore | Real-time sync, offline support, serverless |
+| Serverless Functions | Firebase Cloud Functions (Node.js 20) | Low latency, Firebase integration |
+| LLM (V1) | Google Gemini 2.0 Flash | Low cost, fast, good multilingual |
+| SLM (V2) | Qwen2.5-3B-Instruct (QLoRA fine-tuned) | Best math+Spanish at 3B, Apache 2.0 |
+| SLM Hosting | HF Inference Endpoints (T4, scale-to-zero) | Cost-effective, no infra management |
+| Adaptive Engine | Client-side TypeScript | Zero-latency decisions, works offline |
+| State Management | Zustand + Firestore sync | Lightweight, persists across sessions |
+| Testing | Vitest + Playwright | Unit + E2E testing |
+| CI/CD | GitHub Actions | Automated testing + Firebase deploy |
+| Monitoring | Firebase Analytics + Crashlytics | User behavior + error tracking |
+---
+## 7. Security & Privacy Considerations
+### 7.1 Data Protection
+- **COPPA Compliance**: Students are minors (ages 11-14), and COPPA applies to those under 13. Stored personally identifiable information is limited to email and display name. No third-party tracking.
+- **FERPA Alignment**: Performance data (Elo, LDS, MCS) is associated with uid only. Teachers/admins see aggregate data, never individual student identifiers.
+- **Data Encryption**: Firestore encrypts at rest (AES-256). All API calls over HTTPS/TLS 1.3.
+### 7.2 API Security
+- Firebase Auth tokens required for all Cloud Function calls
+- Gemini/SLM API keys stored in Firebase environment secrets (never client-side)
+- Rate limiting on Cloud Functions to prevent abuse (max 10 scaffold generations per minute per user)
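The per-user limit of 10 scaffold generations per minute could be enforced with a sliding-window counter along these lines. This is an in-memory sketch under the assumption of a single process. Cloud Functions may run several instances at once, so a real deployment would need a shared store for a global limit. The class name and its defaults are hypothetical.

```typescript
// In-memory sliding-window limiter keyed by user id (single-process assumption).
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests = 10, windowMs = 60_000) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the user is under the limit.
  allow(userId: string, nowMs: number = Date.now()): boolean {
    // Keep only the timestamps that still fall inside the window.
    const recent = (this.hits.get(userId) ?? []).filter((t) => nowMs - t < this.windowMs);
    const allowed = recent.length < this.maxRequests;
    if (allowed) recent.push(nowMs);
    this.hits.set(userId, recent);
    return allowed;
  }
}
```

Passing the current time as a parameter keeps the limiter deterministic in tests; production code would rely on the `Date.now()` default.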
+### 7.3 Content Safety
+- All LLM-generated scaffolds pass through a validation function checking:
+  - Mathematical accuracy (answer matches expected)
+  - Appropriate content (no adult/violent themes)
+  - Language accuracy (Spanish translation verified against expected pattern)
+- Questions from the curated database are pre-reviewed; generated questions flagged for human review
+---
+## 8. Performance Targets
+| Metric | Target | Measurement |
+|---|---|---|
+| Time to first problem display | < 2 seconds | Lighthouse / Firebase Performance |
+| Adaptive decision latency | < 50ms | Client-side (no network) |
+| Scaffold generation (Gemini) | < 1.5 seconds | Cloud Function logs |
+| Scaffold generation (SLM) | < 800ms | HF Inference EP metrics |
+| Batch prefetch trigger → ready | < 5 seconds | 20 questions fetched at Q17 |
+| Offline capability | Full session | After initial batch load |
+| Concurrent users (V1) | 50 | Firebase free/Blaze tier |
+| Concurrent users (V2) | 500+ | HF auto-scaling endpoint |