# MathLingua: System Architecture Document

## 1. System Overview

MathLingua is a bilingual adaptive math tutoring application for Spanish-speaking students (grades 6-8) transitioning to English-medium mathematics education. The system presents math word problems with four scaffolded hint levels (L1-L4) and uses a hybrid adaptive algorithm (Elo rating, Bayesian Knowledge Tracing, and Thompson Sampling) to personalize difficulty progression.

```
┌─────────────────────────────────────────────────────────────────────┐
│                          MathLingua System                          │
│                                                                     │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │   Frontend   │    │   Backend    │    │  External Services   │   │
│  │  (Next.js)   │◄──►│  (Firebase)  │◄──►│     (LLM / SLM)      │   │
│  └──────┬───────┘    └──────┬───────┘    └──────────┬───────────┘   │
│         │                   │                       │               │
│         ▼                   ▼                       ▼               │
│  ┌──────────────┐    ┌──────────────┐    ┌──────────────────────┐   │
│  │   Adaptive   │    │  Firestore   │    │ V1: Gemini API       │   │
│  │   Engine     │    │  Database    │    │ V2: Qwen2.5-3B SLM   │   │
│  │ (Client JS)  │    │              │    │ (HF Inference EP)    │   │
│  └──────────────┘    └──────────────┘    └──────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘
```

---

## 2. Component Architecture

### 2.1 Frontend: React / Next.js Application

**Technology**: Next.js 14+ (App Router), TypeScript, Tailwind CSS

**Hosting**: Firebase Hosting or Vercel

#### Key Pages/Routes

| Route | Component | Purpose |
|---|---|---|
| `/` | `LandingPage` | Login/signup, language preference |
| `/dashboard` | `StudentDashboard` | Progress overview, session history, MCS/LDS charts |
| `/practice` | `PracticeSession` | Adaptive practice from question database |
| `/solve` | `CustomProblem` | "Input your question": Gemini/SLM processes user-submitted problems |
| `/session-report` | `SessionReport` | End-of-session summary with performance analytics |

#### Core Frontend Components

```
src/
├── components/
│   ├── ProblemDisplay/
│   │   ├── MathProblem.tsx          # Renders word problem text
│   │   ├── HintScaffold.tsx         # L1/L2/L3/L4 progressive hint UI
│   │   ├── AnswerInput.tsx          # Numeric/expression answer entry
│   │   └── SolutionReveal.tsx       # L4 step-by-step solution display
│   ├── Adaptive/
│   │   ├── DifficultyIndicator.tsx  # Visual current-level indicator
│   │   ├── ProgressBar.tsx          # Session progress (e.g., 7/20)
│   │   └── SessionTimer.tsx         # Time tracking per problem
│   ├── Dashboard/
│   │   ├── EloChart.tsx             # Elo rating over time (Recharts)
│   │   ├── TopicHeatmap.tsx         # Performance by math topic
│   │   ├── LDSMCSPanel.tsx          # Language Dependency & Math Confidence
│   │   └── StreakBadge.tsx          # Gamification elements
│   └── Shared/
│       ├── BilingualToggle.tsx      # EN/ES interface language switch
│       ├── MathRenderer.tsx         # KaTeX for math expressions
│       └── LoadingSkeleton.tsx
├── lib/
│   ├── adaptive-engine.ts           # Elo + BKT + Thompson Sampling (client-side)
│   ├── feature-engineer.ts          # LDS & MCS computation
│   ├── firebase.ts                  # Firebase SDK initialization
│   └── llm-client.ts                # Gemini/SLM API abstraction
├── hooks/
│   ├── useAdaptiveSession.ts        # Manages session state + engine calls
│   ├── useStudentProfile.ts         # Reads/writes Firestore student state
│   └── useQuestionQueue.ts          # Pre-fetches next batch of questions
└── types/
    └── index.ts                     # TypeScript interfaces for all data structures
```

#### Hint Scaffold UI Flow

```
┌─────────────────────────────────────┐
│ Problem displayed in original       │
│ English at student's current level  │
│                                     │
│ [Try to solve]  [I need a hint →]   │
└──────────────────────┬──────────────┘
                       │ click
                       ▼
┌─────────────────────────────────────┐
│ L1: Simplified English              │
│ "A store has 24 apples..."          │
│                                     │
│ [Got it!]  [Still stuck →]          │
└──────────────────────┬──────────────┘
                       │ click
                       ▼
┌─────────────────────────────────────┐
│ L2: Bilingual Keywords Inline       │
│ "A store has 24 apples (manzanas)"  │
│ "divided equally (dividido          │
│  igualmente) among 6 boxes"         │
│                                     │
│ [Got it!]  [Still stuck →]          │
└──────────────────────┬──────────────┘
                       │ click
                       ▼
┌─────────────────────────────────────┐
│ L3: Full Spanish Translation        │
│ "Una tienda tiene 24 manzanas       │
│  divididas igualmente entre 6       │
│  cajas. ¿Cuántas manzanas hay       │
│  en cada caja?"                     │
│                                     │
│ [Got it!]  [Show me the answer →]   │
└──────────────────────┬──────────────┘
                       │ click
                       ▼
┌─────────────────────────────────────┐
│ L4: Step-by-Step Solution           │
│ Step 1: Identify → 24 ÷ 6           │
│ Step 2: Calculate → 24 ÷ 6 = 4      │
│ Step 3: Answer → 4 apples per box   │
│                                     │
│ [Next Problem →]                    │
└─────────────────────────────────────┘
```

Each hint interaction is logged with a timestamp so that `escalation_speed` and `scaffold_time_ratio` can be computed for the LDS formula.
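
This document does not define `escalation_speed` or `scaffold_time_ratio`, so the sketch below only illustrates one plausible reading of how they could be derived from the logged timestamps. The interface names, the function name, and both feature definitions are assumptions made for illustration; only `hintTimestamps`, `startTime`, and `endTime` correspond to fields named elsewhere in this document.

```typescript
// Hypothetical input shape built from the logged interaction fields.
interface HintTimingInput {
  startTime: number;                      // ms epoch when the problem was shown
  endTime: number;                        // ms epoch when the answer was submitted
  hintTimestamps: Record<string, number>; // e.g. { L1: ..., L2: ... }
}

interface HintTimingFeatures {
  // Assumed definition: hint escalations per second between first and last hint.
  escalationSpeed: number;
  // Assumed definition: share of total problem time spent after the first hint.
  scaffoldTimeRatio: number;
}

function computeHintTimingFeatures(input: HintTimingInput): HintTimingFeatures {
  const totalMs = Math.max(input.endTime - input.startTime, 1);
  const times = Object.values(input.hintTimestamps).sort((a, b) => a - b);
  if (times.length === 0) {
    // No hints opened: both features are zero under these definitions.
    return { escalationSpeed: 0, scaffoldTimeRatio: 0 };
  }
  const first = times[0];
  const last = times[times.length - 1];
  const spanSeconds = (last - first) / 1000;
  const escalationSpeed =
    times.length > 1 && spanSeconds > 0 ? (times.length - 1) / spanSeconds : 0;
  const rawRatio = (input.endTime - first) / totalMs;
  const scaffoldTimeRatio = Math.min(Math.max(rawRatio, 0), 1);
  return { escalationSpeed, scaffoldTimeRatio };
}
```

Under these assumed definitions, a student who opens L1 to L3 ten seconds apart moves at 0.1 escalations per second, and a problem where the first hint arrives one third of the way through has a scaffold time ratio of about 0.67.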

---

### 2.2 Adaptive Engine (Client-Side JavaScript)

The adaptive engine runs **entirely in the browser**, so difficulty decisions need no server round-trip. Feedback is immediate, and a session keeps working offline once the initial question batch has loaded.

#### Engine Components

```
┌─────────────────────────────────────────────────┐
│          Adaptive Engine (client-side)          │
│                                                 │
│  ┌─────────────┐  ┌──────────┐  ┌────────────┐  │
│  │ Elo Rating  │  │   BKT    │  │  Thompson  │  │
│  │ System      │  │  Engine  │  │  Sampler   │  │
│  │             │  │          │  │            │  │
│  │ Updates     │  │ P(know)  │  │ Beta prior │  │
│  │ student &   │  │ per      │  │ per level, │  │
│  │ question    │  │ topic    │  │ ZPD window │  │
│  │ ratings     │  │          │  │            │  │
│  └──────┬──────┘  └────┬─────┘  └─────┬──────┘  │
│         │              │              │         │
│         ▼              ▼              ▼         │
│  ┌───────────────────────────────────────────┐  │
│  │           Decision Orchestrator           │  │
│  │                                           │  │
│  │  Input:  weighted_outcome, features       │  │
│  │  Output: next_level, decision_type        │  │
│  │          (increase/maintain/decrease)     │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
```

#### Elo Update Formula

```
weighted_outcome = {
  no_hint: 1.00   (solved without any scaffold)
  L1_only: 0.75   (needed simplified English)
  L2_used: 0.50   (needed bilingual keywords)
  L3_used: 0.25   (needed full translation)
  L4_used: 0.00   (needed answer reveal)
}

E_student     = 1 / (1 + 10^((R_question - R_student) / 400))
R_student_new = R_student + K × (weighted_outcome - E_student)

K = 32 (default), increased to 48 for first 10 interactions (cold-start acceleration)
```
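
The update above can be sketched in TypeScript as follows. This is a minimal illustration, not the engine's actual code: the type and function names are assumptions, the sketch assumes the student eventually answered correctly (the formula block does not say how an incorrect answer is scored), and "first 10 interactions" is read as `totalInteractions < 10` counted before the current one.

```typescript
// Outcome keys mirror the weighted_outcome table above.
type HintOutcome = "no_hint" | "L1_only" | "L2_used" | "L3_used" | "L4_used";

const WEIGHTED_OUTCOME: Record<HintOutcome, number> = {
  no_hint: 1.0,
  L1_only: 0.75,
  L2_used: 0.5,
  L3_used: 0.25,
  L4_used: 0.0,
};

// Expected score of the student against a question (standard Elo logistic curve).
function expectedScore(rStudent: number, rQuestion: number): number {
  return 1 / (1 + Math.pow(10, (rQuestion - rStudent) / 400));
}

// K = 48 during cold start, 32 afterwards (assumed cutoff: fewer than 10 prior interactions).
function kFactor(totalInteractions: number): number {
  return totalInteractions < 10 ? 48 : 32;
}

function updateStudentElo(
  rStudent: number,
  rQuestion: number,
  outcome: HintOutcome,
  totalInteractions: number
): number {
  const expected = expectedScore(rStudent, rQuestion);
  return rStudent + kFactor(totalInteractions) * (WEIGHTED_OUTCOME[outcome] - expected);
}
```

For a student and question that are both rated 1000, the expected score is 0.5, so solving without hints after the cold-start phase moves the student to 1016, while needing L2 leaves the rating unchanged.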

#### BKT Parameters (per topic)

| Parameter | Symbol | Default | Description |
|---|---|---|---|
| Prior knowledge | P(L₀) | 0.10 | Initial probability student knows topic |
| Learn rate | P(T) | 0.15 | Probability of learning per opportunity |
| Slip | P(S) | 0.10 | Probability of incorrect despite knowing |
| Guess | P(G) | 0.25 | Probability of correct despite not knowing |

Slip is adjusted based on hint usage:
```
P(S)_adjusted = P(S) × (1 + 0.5 × hint_depth_normalized)
```
This models the intuition that the more scaffolds a student uses, the less certain an apparently "correct" answer is.
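
A sketch of how the adjusted slip could feed a standard BKT update is shown below. The posterior and learning-transition steps are the textbook BKT equations; the function names and the choice `hint_depth_normalized = maxHintLevel / 4` are assumptions, since this document does not spell out the normalization.

```typescript
interface BktParams {
  learn: number; // P(T)
  slip: number;  // P(S)
  guess: number; // P(G)
}

const DEFAULT_BKT: BktParams = { learn: 0.15, slip: 0.10, guess: 0.25 };

// Assumption: hint depth is the deepest hint level used (0-4), scaled to [0, 1].
function adjustedSlip(slip: number, maxHintLevel: number): number {
  const hintDepthNormalized = Math.min(Math.max(maxHintLevel, 0), 4) / 4;
  return slip * (1 + 0.5 * hintDepthNormalized);
}

// One BKT step: Bayesian posterior on the observation, then the learning transition.
function updatePKnow(
  pKnow: number,
  isCorrect: boolean,
  maxHintLevel: number,
  params: BktParams = DEFAULT_BKT
): number {
  const s = adjustedSlip(params.slip, maxHintLevel);
  const g = params.guess;
  const posterior = isCorrect
    ? (pKnow * (1 - s)) / (pKnow * (1 - s) + (1 - pKnow) * g)
    : (pKnow * s) / (pKnow * s + (1 - pKnow) * (1 - g));
  return posterior + (1 - posterior) * params.learn;
}
```

Starting from the default prior of 0.10, one correct answer with no hints gives a posterior of 2/7 and, after the learning step, P(know) of about 0.393.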

#### Thompson Sampling with ZPD Windowing

```
For each candidate level l in ZPD window [current - 2, current + 3]:
    sample θ_l ~ Beta(α_l, β_l)
    score_l = θ_l × proximity_bonus(l, target_elo)

Select level = argmax(score_l)

proximity_bonus(l, target) = exp(-0.5 × ((elo_l - target) / 100)²)
```

The ZPD (zone of proximal development) window is asymmetric (+3 upward, -2 downward) to encourage upward progression while preventing catastrophic failure.
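
The selection rule can be sketched as below. JavaScript has no built-in Beta sampler, so this illustration draws Beta variates from two Gamma draws using the Marsaglia-Tsang method; that sampler choice, the `LevelArm` shape, the per-level `elo` field, and the injectable `rng` parameter are all assumptions made for the example rather than details taken from the engine.

```typescript
type Rng = () => number; // uniform in [0, 1)

// Box-Muller transform; 1 - rng() keeps the argument of log strictly positive.
function sampleStandardNormal(rng: Rng): number {
  const u1 = 1 - rng();
  const u2 = rng();
  return Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
}

// Marsaglia-Tsang Gamma(shape, 1) sampler, with the usual boost for shape < 1.
function sampleGamma(shape: number, rng: Rng): number {
  if (shape < 1) {
    return sampleGamma(shape + 1, rng) * Math.pow(1 - rng(), 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleStandardNormal(rng);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(alpha, beta) as a ratio of two Gamma draws.
function sampleBeta(alpha: number, beta: number, rng: Rng): number {
  const x = sampleGamma(alpha, rng);
  const y = sampleGamma(beta, rng);
  return x / (x + y);
}

// Hypothetical arm: one difficulty level with its Elo and Beta prior.
interface LevelArm {
  level: string;
  elo: number;
  alpha: number;
  beta: number;
}

function proximityBonus(elo: number, targetElo: number): number {
  return Math.exp(-0.5 * Math.pow((elo - targetElo) / 100, 2));
}

// arms must be ordered from easiest to hardest; currentIndex is the student's level.
function selectLevel(
  arms: LevelArm[],
  currentIndex: number,
  targetElo: number,
  rng: Rng = Math.random
): LevelArm {
  const lo = Math.max(0, currentIndex - 2);
  const hi = Math.min(arms.length - 1, currentIndex + 3);
  let best = arms[lo];
  let bestScore = -Infinity;
  for (let i = lo; i <= hi; i++) {
    const theta = sampleBeta(arms[i].alpha, arms[i].beta, rng);
    const score = theta * proximityBonus(arms[i].elo, targetElo);
    if (score > bestScore) {
      bestScore = score;
      best = arms[i];
    }
  }
  return best;
}
```

Passing the random source as a parameter keeps the sampler testable with a seeded generator; in the browser the default `Math.random` would be used.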

#### Progression Decision Rules

| Condition | Decision | Action |
|---|---|---|
| weighted_outcome ≥ 0.75 AND P(know) ≥ 0.70 | **Increase** | Move up 1 sub-level |
| weighted_outcome ≥ 0.85 AND streak ≥ 3 | **Skip** | Move up 2 sub-levels |
| 0.40 ≤ weighted_outcome < 0.75 | **Maintain** | Stay at current level |
| weighted_outcome < 0.40 OR streak_wrong ≥ 2 | **Decrease** | Move down 1 sub-level |
| weighted_outcome < 0.25 AND P(know) < 0.30 | **Rapid Decrease** | Move down 2 sub-levels |
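
Several rows of the table overlap (for example, an outcome of 0.9 with a streak of 3 satisfies both Increase and Skip), and the table does not state a precedence. The sketch below assumes that the larger move in each direction is checked first, that downward rules are checked before upward ones, and that any case not covered by a row falls back to Maintain. The function and field names are illustrative; the decision strings match the `decision` field of `InteractionResponse`.

```typescript
type Decision = "increase" | "maintain" | "decrease" | "skip" | "rapid_decrease";

interface DecisionInput {
  weightedOutcome: number; // 0.0-1.0
  pKnow: number;           // BKT P(know) for the topic
  streak: number;          // consecutive strong outcomes
  streakWrong: number;     // consecutive wrong answers
}

// Assumed precedence: rapid_decrease > decrease > skip > increase > maintain.
function decideProgression(input: DecisionInput): Decision {
  const { weightedOutcome, pKnow, streak, streakWrong } = input;
  if (weightedOutcome < 0.25 && pKnow < 0.3) return "rapid_decrease";
  if (weightedOutcome < 0.4 || streakWrong >= 2) return "decrease";
  if (weightedOutcome >= 0.85 && streak >= 3) return "skip";
  if (weightedOutcome >= 0.75 && pKnow >= 0.7) return "increase";
  return "maintain";
}

// Sub-level movement for each decision, as listed in the Action column.
const LEVEL_DELTA: Record<Decision, number> = {
  rapid_decrease: -2,
  decrease: -1,
  maintain: 0,
  increase: 1,
  skip: 2,
};
```

One consequence of this ordering is that a strong outcome with low P(know) and no streak maintains the current level, which the table leaves unspecified.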

---

### 2.3 Firebase Backend

**Services Used**:
- Firebase Authentication (Google Sign-In, Email/Password)
- Cloud Firestore (student state, question database, session logs)
- Cloud Functions (LLM API calls, batch question generation, session reports)
- Firebase Hosting (static frontend assets)

#### Firestore Data Model

```
firestore/
├── users/
│   └── {uid}/
│       ├── profile: {
│       │     displayName, email, gradeLevel, preferredLanguage,
│       │     createdAt, lastActive
│       │   }
│       ├── adaptiveState: {
│       │     currentElo: number,          // e.g., 1050
│       │     currentLevel: string,        // e.g., "2.1"
│       │     totalInteractions: number,
│       │     topicMastery: {              // BKT P(know) per topic
│       │       "arithmetic": 0.72,
│       │       "fractions": 0.45,
│       │       "algebra_basic": 0.31,
│       │       ...
│       │     },
│       │     thompsonPriors: {            // Beta(α,β) per level
│       │       "1.1": { alpha: 12, beta: 3 },
│       │       "1.2": { alpha: 8, beta: 5 },
│       │       ...
│       │     },
│       │     featureAverages: {
│       │       avgLDS: 0.42,
│       │       avgMCS: 0.61,
│       │       recentLDS_5: [0.3, 0.4, 0.5, 0.35, 0.45],
│       │       recentMCS_5: [0.6, 0.65, 0.58, 0.62, 0.7]
│       │     },
│       │     streakCount: number,
│       │     lastUpdated: timestamp
│       │   }
│       └── sessions/
│           └── {sessionId}/
│               ├── metadata: {
│               │     startTime, endTime, questionsAttempted,
│               │     questionsCorrect, avgWeightedOutcome,
│               │     startElo, endElo, sessionLDS, sessionMCS
│               │   }
│               └── interactions/
│                   └── {interactionId}: {
│                         questionId, level, topic,
│                         startTime, endTime, timeSpentMs,
│                         hintsUsed: [0,1,2,3,4],     // which levels accessed
│                         hintTimestamps: { L1: ts, L2: ts, ... },
│                         maxHintLevel: number,
│                         answer: string,
│                         isCorrect: boolean,
│                         attempts: number,
│                         weightedOutcome: number,
│                         lds: number,
│                         mcs: number,
│                         eloBeforeUpdate: number,
│                         eloAfterUpdate: number,
│                         adaptiveDecision: string
│                       }
│
├── questions/
│   └── {questionId}: {
│         id, level, topic, subtopic,
│         problemText, answer, answerNumeric,
│         solutionSteps: [...],
│         scaffolds: {
│           L1_simplified: string,
│           L2_bilingual: string,
│           L3_spanish: string,
│           L4_solution: string
│         },
│         readability: {
│           fleschKincaid: number,
│           wordCount: number,
│           difficultWords: number,
│           avgSyllables: number
│         },
│         eloRating: number,
│         timesServed: number,
│         avgOutcome: number,
│         metadata: {
│           source: "curated" | "generated",
│           generatedBy: "gemini-2.0" | "qwen2.5-3b" | null,
│           reviewedBy: string | null,
│           createdAt: timestamp
│         }
│       }
│
├── questionIndex/                         // Denormalized for fast queries
│   └── byLevel/
│       └── {level}: {
│             questionIds: [...],
│             count: number
│           }
│
└── analytics/                             // Aggregated (Cloud Functions)
    ├── dailyStats/
    │   └── {date}: { activeUsers, sessionsCompleted, ... }
    └── cohortProgress/
        └── {cohortId}: { avgElo, avgLDS, avgMCS, ... }
```

#### Firestore Security Rules

```javascript
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    // Users can only read/write their own data
    match /users/{uid}/{document=**} {
      allow read, write: if request.auth != null && request.auth.uid == uid;
    }
    // Questions are readable by all authenticated users
    match /questions/{questionId} {
      allow read: if request.auth != null;
      allow write: if false;  // Only admin/Cloud Functions
    }
    // Question index readable by all authenticated users
    match /questionIndex/{document=**} {
      allow read: if request.auth != null;
      allow write: if false;
    }
    // Analytics only accessible by admin
    match /analytics/{document=**} {
      allow read, write: if false;  // Cloud Functions only
    }
  }
}
```

---

### 2.4 Cloud Functions (Serverless Backend)

```
functions/
├── onUserCreate.ts            # Initialize adaptive state for new user
├── generateScaffolds.ts       # Call Gemini/SLM to create L1-L4 for a problem
├── batchGenerateQuestions.ts  # Generate next 20 questions for session queue
├── processCustomProblem.ts    # "Input your question" flow
├── generateSessionReport.ts   # End-of-session analytics
├── updateQuestionStats.ts     # Update question difficulty from outcomes
└── scheduledAnalytics.ts      # Daily aggregation (cron-triggered)
```

#### Key Cloud Function: `generateScaffolds`

```typescript
// Triggered when student submits a custom problem or when
// pre-generating scaffolds for database questions

interface ScaffoldRequest {
  problemText: string;
  studentGradeLevel: number;
  currentLDS: number;        // Informs simplification level
}

interface ScaffoldResponse {
  L1_simplified: string;     // Simplified English
  L2_bilingual: string;      // English with inline Spanish keywords
  L3_spanish: string;        // Full Spanish translation
  L4_solution: string;       // Step-by-step solution
  answer: string;
  answerNumeric: number;
}

// Prompt template for LLM
const SCAFFOLD_PROMPT = `
You are a bilingual math tutor helping Spanish-speaking students
(grades 6-8) learn math in English.

Given this math word problem:
"{problemText}"

Generate 4 scaffold levels:

**L1 (Simplified English):** Rewrite using shorter sentences,
simpler vocabulary (grade {adjustedGrade} reading level).
Keep all math content identical.

**L2 (Bilingual Keywords):** Take the original problem and add
Spanish translations in parentheses for key math and context
vocabulary. Format: "English word (palabra en español)".

**L3 (Full Spanish Translation):** Translate the complete problem
to natural, grade-appropriate Spanish. Ensure mathematical
precision is maintained.

**L4 (Step-by-Step Solution):** Provide a clear, numbered
step-by-step solution in English with the final numerical answer.

Return as JSON with keys: L1_simplified, L2_bilingual, L3_spanish,
L4_solution, answer, answerNumeric.
`;
```

#### Key Cloud Function: `batchGenerateQuestions`

```typescript
import { onCall, HttpsError } from "firebase-functions/v2/https";

// Called when the student reaches question 17 of 20 (prefetch trigger).
// Selects the next 20 questions from the database based on adaptive state.
// The helpers used below (getAdaptiveState, thompsonSampleBatch, and so on)
// are project functions defined elsewhere; options are passed as objects
// because TypeScript has no named arguments.

export const batchGenerateQuestions = onCall(async (request) => {
  // request.auth is undefined for unauthenticated callers
  if (!request.auth) {
    throw new HttpsError("unauthenticated", "Sign-in required.");
  }
  const { uid } = request.auth;
  const state = await getAdaptiveState(uid);

  // Thompson Sampling selects level distribution for next batch
  const levelDistribution = thompsonSampleBatch(
    state.thompsonPriors,
    state.currentLevel,
    { batchSize: 20 }
  );
  // e.g., { "2.1": 5, "2.2": 8, "2.3": 5, "2.4": 2 }

  // Select questions avoiding recently served ones
  const recentIds = await getRecentQuestionIds(uid, { lookback: 100 });
  const questions = await selectQuestions(levelDistribution, {
    excludeIds: recentIds,
    topicBalance: state.topicMastery, // Favor weaker topics
  });

  // Ensure all questions have scaffolds generated
  const withScaffolds = await ensureScaffoldsGenerated(questions);

  return { questions: withScaffolds, sessionBatchId: generateId() };
});
```

---

### 2.5 LLM Service Layer

#### V1: Gemini API (Current)

```
┌────────────┐    HTTPS/REST     ┌──────────────────┐
│   Cloud    │ ─────────────────►│  Google Gemini   │
│  Function  │◄───────────────── │  2.0 Flash API   │
└────────────┘                   └──────────────────┘

Cost: ~$0.075 per 1M input tokens, ~$0.30 per 1M output tokens
Latency: 200-800ms per scaffold generation
Rate limit: 60 RPM (free tier), 1000 RPM (paid)
```

#### V2: Qwen2.5-3B SLM (Planned)

```
┌────────────┐    HTTPS/REST     ┌──────────────────────────┐
│   Cloud    │ ─────────────────►│  HF Inference Endpoint   │
│  Function  │◄───────────────── │  Qwen2.5-3B-Instruct     │
└────────────┘                   │  (QLoRA fine-tuned)      │
                                 │  GPU: T4 or L4           │
                                 └──────────────────────────┘

Cost: ~$0.60/hr (T4) or ~$1.04/hr (L4)
Latency: 100-400ms per scaffold generation
Rate limit: No per-request quota (dedicated endpoint); throughput is bounded by the GPU
```

#### LLM Client Abstraction

```typescript
// lib/llm-client.ts: provider-agnostic interface

interface LLMProvider {
  generateScaffolds(problem: string, context: ScaffoldContext): Promise<ScaffoldResponse>;
  generateQuestion(level: string, topic: string): Promise<QuestionWithScaffolds>;
  validateAnswer(problem: string, studentAnswer: string, correctAnswer: string): Promise<AnswerValidation>;
}

class GeminiProvider implements LLMProvider { ... }
class QwenSLMProvider implements LLMProvider { ... }

// Factory with fallback
function createLLMClient(): LLMProvider {
  if (config.useSLM && config.slmEndpointAvailable) {
    return new QwenSLMProvider(config.slmEndpoint);
  }
  return new GeminiProvider(config.geminiApiKey);
}
```
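
The factory above picks a provider once, at creation time. If the SLM endpoint can also fail or go cold between requests, a per-call fallback is one option. The sketch below is an illustration only: `ScaffoldProvider`, `FallbackProvider`, `withTimeout`, and the 3-second default are hypothetical names and values, and the interface is trimmed to a single method to keep the example self-contained.

```typescript
interface ScaffoldResult {
  L1_simplified: string;
  L2_bilingual: string;
  L3_spanish: string;
  L4_solution: string;
}

// Trimmed-down stand-in for LLMProvider (single method, no context argument).
interface ScaffoldProvider {
  generateScaffolds(problem: string): Promise<ScaffoldResult>;
}

// Rejects if the wrapped promise does not settle within `ms` milliseconds.
function withTimeout<T>(promise: Promise<T>, ms: number): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`timed out after ${ms} ms`)), ms);
    promise.then(
      (value) => { clearTimeout(timer); resolve(value); },
      (error) => { clearTimeout(timer); reject(error); }
    );
  });
}

// Tries the primary provider first; on error or timeout, uses the fallback.
class FallbackProvider implements ScaffoldProvider {
  private readonly primary: ScaffoldProvider;
  private readonly fallback: ScaffoldProvider;
  private readonly timeoutMs: number;

  constructor(primary: ScaffoldProvider, fallback: ScaffoldProvider, timeoutMs = 3000) {
    this.primary = primary;
    this.fallback = fallback;
    this.timeoutMs = timeoutMs;
  }

  async generateScaffolds(problem: string): Promise<ScaffoldResult> {
    try {
      return await withTimeout(this.primary.generateScaffolds(problem), this.timeoutMs);
    } catch {
      return this.fallback.generateScaffolds(problem);
    }
  }
}
```

With this shape, the factory could return `new FallbackProvider(slmProvider, geminiProvider)` when the SLM is enabled, so callers never need to know which backend answered.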

---

### 2.6 SLM Fine-Tuning Pipeline

```
┌──────────────┐    ┌──────────────┐    ┌──────────────────┐
│   Training   │    │  Fine-Tune   │    │      Deploy      │
│  Data Prep   │───►│  QLoRA SFT   │───►│ HF Inference EP  │
└──────────────┘    └──────────────┘    └──────────────────┘

Step 1: Collect 2,000-5,000 scaffold examples from Gemini V1 usage
Step 2: Human review + quality filter → ~1,500 gold examples
Step 3: QLoRA fine-tune Qwen2.5-3B-Instruct
Step 4: Evaluate on held-out test set (BLEU, math accuracy, readability)
Step 5: Deploy to HF Inference Endpoint
Step 6: Shadow-test alongside Gemini (serve both, compare quality)
Step 7: Full cutover when SLM matches Gemini quality
```

**Fine-tuning Configuration:**

| Parameter | Value | Rationale |
|---|---|---|
| Base model | Qwen2.5-3B-Instruct | Best math+Spanish at 3B scale |
| Method | QLoRA (4-bit NF4) | Fits single 16GB GPU |
| LoRA rank (r) | 32 | Balance quality/efficiency for small dataset |
| LoRA alpha | 64 | Standard 2× rank |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | Full attention + MLP |
| Learning rate | 2e-4 | Standard for QLoRA |
| Epochs | 3-5 | Small dataset, monitor val loss |
| Batch size | 4 (effective 16 with grad accum) | Memory constraint |
| Max sequence length | 1024 | Sufficient for problem + all 4 scaffolds |
| Warmup ratio | 0.05 | Short warmup for small dataset |

---

## 3. Data Flow Diagrams

### 3.1 Flow A: "Practice Problems" Mode

```
Student clicks "Start Practice"
             │
             ▼
┌──────────────────────────────────┐
│ 1. Load adaptive state from      │
│    Firestore (Elo, BKT, priors)  │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 2. Thompson Sampling selects     │
│    next question level           │
│    (ZPD window: current -2/+3)   │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 3. Fetch question from Firestore │
│    by level + topic balancing    │
│    (avoid recently served)       │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 4. Display problem, start timer  │
│    Student reads and attempts    │
└────────────┬─────────────────────┘
             │
     ┌───────┴─────────┐
     │  Needs hints?   │
     ▼ No              ▼ Yes
┌─────────┐   ┌────────────────────┐
│ Submit  │   │ L1 → L2 → L3 → L4  │
│ answer  │   │ (each click logged │
└────┬────┘   │  with timestamp)   │
     │        └────────┬───────────┘
     │                 │
     ▼                 ▼
┌──────────────────────────────────┐
│ 5. Compute weighted_outcome      │
│    based on correctness + hints  │
│    Compute LDS and MCS           │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 6. Update Elo (student + Q)      │
│    Update BKT P(know) for topic  │
│    Update Thompson Beta priors   │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 7. Progression decision:         │
│    increase / maintain / decrease│
│    Select next level             │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 8. Save interaction to Firestore │
│    Display "Next Problem"        │
└────────────┬─────────────────────┘
             │
       ┌─────┴──────────┐
       │   Q17 of 20?   │
       ▼ Yes            ▼ No
┌─────────────┐   ┌──────────┐
│ Prefetch    │   │ Loop to  │
│ next batch  │   │ step 2   │
│ (Cloud Fn)  │   └──────────┘
└──────┬──────┘
       │
At Q20 ▼
┌──────────────────────────────────┐
│ 9. Generate session report       │
│    (Cloud Function)              │
│    Show summary to student       │
└──────────────────────────────────┘
```

### 3.2 Flow B: "Input Your Question" Mode

```
Student types/pastes a math word problem
             │
             ▼
┌──────────────────────────────────┐
│ 1. Cloud Function:               │
│    processCustomProblem          │
│    - Validate it's a math        │
│      word problem                │
│    - Extract answer/solution     │
│    - Call Gemini/SLM to generate │
│      L1, L2, L3, L4 scaffolds    │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 2. Estimate difficulty level     │
│    using readability metrics     │
│    (FK grade, word count, etc.)  │
│    Map to nearest Elo rating     │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 3. Display problem with          │
│    scaffold buttons active       │
│    (same UI as Practice mode)    │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 4. Student interacts, solves     │
│    Same hint tracking as         │
│    Practice mode                 │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 5. Update adaptive state         │
│    (Elo, BKT, Thompson)          │
│    Log interaction               │
└────────────┬─────────────────────┘
             │
             ▼
┌──────────────────────────────────┐
│ 6. Offer: "Try another?" or      │
│    "Switch to Practice Mode"     │
│    (where engine auto-selects)   │
└──────────────────────────────────┘
```

---

## 4. API Contracts

### 4.1 Client → Cloud Functions

```typescript
// POST /generateScaffolds
interface GenerateScaffoldsRequest {
  problemText: string;
  gradeLevel: number;          // 6, 7, or 8
  currentLDS: number;          // 0.0-1.0, informs simplification
}
interface GenerateScaffoldsResponse {
  scaffolds: {
    L1_simplified: string;
    L2_bilingual: string;
    L3_spanish: string;
    L4_solution: string;
  };
  answer: string;
  answerNumeric: number;
  estimatedLevel: string;      // e.g., "2.3"
  estimatedElo: number;        // e.g., 1100
  processingTimeMs: number;
}

// POST /batchGenerateQuestions
interface BatchRequest {
  batchSize: number;           // default 20
  // Auth token provides uid; adaptive state is looked up server-side
}
interface BatchResponse {
  questions: QuestionWithScaffolds[];
  sessionBatchId: string;
}

// POST /submitInteraction
interface InteractionSubmission {
  sessionId: string;
  questionId: string;
  answer: string;
  isCorrect: boolean;
  timeSpentMs: number;
  hintsUsed: number[];         // [0], [0,1], [0,1,2], etc.
  hintTimestamps: Record<string, number>;
  attempts: number;
}
interface InteractionResponse {
  weightedOutcome: number;
  lds: number;
  mcs: number;
  newElo: number;
  newLevel: string;
  decision: "increase" | "maintain" | "decrease" | "skip" | "rapid_decrease";
  nextQuestion: QuestionWithScaffolds;  // Pre-selected
}

// POST /generateSessionReport
interface SessionReportRequest {
  sessionId: string;
}
interface SessionReportResponse {
  summary: {
    questionsAttempted: number;
    questionsCorrect: number;
    avgWeightedOutcome: number;
    eloChange: number;
    topicsStrong: string[];
    topicsWeak: string[];
    avgLDS: number;
    avgMCS: number;
    languageProgressNote: string;  // Generated text about L2 progress
  };
  recommendations: string[];       // e.g., "Focus on fractions vocabulary"
}
```

---

## 5. Deployment Architecture

### 5.1 V1 Deployment (MVP)

```
┌──────────────────────────────────────────────────────────┐
│                     Firebase Project                     │
│                                                          │
│  ┌─────────────┐  ┌─────────────┐  ┌──────────────────┐  │
│  │  Firebase   │  │  Cloud      │  │  Cloud           │  │
│  │  Hosting    │  │  Firestore  │  │  Functions       │  │
│  │  (Next.js)  │  │  (Database) │  │  (Node.js 20)    │  │
│  │             │  │             │  │                  │  │
│  │  Static +   │  │  Student    │  │  LLM calls       │  │
│  │  SSR pages  │  │  state,     │  │  Batch gen       │  │
│  │             │  │  questions, │  │  Reports         │  │
│  │             │  │  sessions   │  │                  │  │
│  └─────────────┘  └─────────────┘  └────────┬─────────┘  │
│                                             │            │
└─────────────────────────────────────────────┼────────────┘
                                              │
                                        HTTPS │
                                              ▼
                                    ┌──────────────────┐
                                    │  Google Gemini   │
                                    │  2.0 Flash API   │
                                    └──────────────────┘

Estimated monthly cost (100 students, 5 sessions/week):
- Firebase Hosting: Free tier (~$0)
- Firestore: ~$5/mo (reads/writes within free tier mostly)
- Cloud Functions: ~$10/mo (invocations + compute)
- Gemini API: ~$15-25/mo (scaffold generation)
- Total: ~$30-40/mo
```

### 5.2 V2 Deployment (SLM)

```
┌──────────────────────────────────────────────────────────┐
│                     Firebase Project                     │
│                                                          │
│  ┌─────────────┐  ┌─────────────┐  ┌──────────────────┐  │
│  │  Firebase   │  │  Cloud      │  │  Cloud           │  │
│  │  Hosting    │  │  Firestore  │  │  Functions       │  │
│  └─────────────┘  └─────────────┘  └────────┬─────────┘  │
│                                             │            │
└─────────────────────────────────────────────┼────────────┘
                                              │
                         ┌────────────────────┴─────────┐
                         ▼                              ▼
                ┌──────────────────┐             ┌──────────────┐
                │  HF Inference    │             │  Gemini API  │
                │  Endpoint        │             │  (fallback)  │
                │  Qwen2.5-3B      │             └──────────────┘
                │  QLoRA FT        │
                │  (T4 GPU)        │
                └──────────────────┘

Shadow testing: both are called, the SLM response is served,
and the Gemini response is logged for QA.

Estimated monthly cost (100 students):
- Firebase: ~$15/mo (same as V1)
- HF Inference Endpoint (T4, scale-to-zero): ~$50-100/mo
  (active only during school hours, ~8hrs/day × 20 days)
- Gemini fallback: ~$5/mo (only when SLM is cold)
- Total: ~$70-120/mo (but no per-token costs at scale)
```
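
The endpoint estimate can be checked with simple arithmetic from the hourly rates quoted in section 2.5. The helper below is only a worked example; the function name is hypothetical, and it assumes the endpoint is billed for every hour of the 8 hrs/day × 20 days window, which is the upper bound for a scale-to-zero deployment.

```typescript
// Monthly cost of a dedicated endpoint billed by the active hour.
function monthlyEndpointCost(hourlyUsd: number, hoursPerDay: number, daysPerMonth: number): number {
  return hourlyUsd * hoursPerDay * daysPerMonth;
}

// T4 at ~$0.60/hr, active 8 hrs/day for 20 school days: about $96/mo.
const t4Monthly = monthlyEndpointCost(0.6, 8, 20);

// L4 at ~$1.04/hr over the same window: about $166/mo.
const l4Monthly = monthlyEndpointCost(1.04, 8, 20);

console.log(t4Monthly.toFixed(2), l4Monthly.toFixed(2));
```

The T4 figure sits at the top of the ~$50-100/mo range, so the lower end of that range depends on the endpoint actually scaling to zero during idle periods within school hours. An L4 over the same window would exceed the stated range.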

### 5.3 V3 Deployment (Scale)

```
When student count exceeds 500, migrate to:

┌─────────────────────────────────────────────────────────────┐
│                                                             │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐   │
│  │   Vercel     │  │  Firebase    │  │  Cloud Run       │   │
│  │  (Next.js)   │  │  Firestore   │  │  (API server)    │   │
│  └──────────────┘  └──────────────┘  └────────┬─────────┘   │
│                                               │             │
│            ┌────────────────────┬─────────────┘             │
│            ▼                    ▼                           │
│  ┌──────────────────┐  ┌─────────────────┐                  │
│  │ HF Inference EP  │  │ IRT/DKT Model   │                  │
│  │ Qwen2.5-3B       │  │ Server          │                  │
│  │ (Auto-scaling)   │  │ (Python/FastAPI)│                  │
│  └──────────────────┘  └─────────────────┘                  │
│                                                             │
│  + Deep Knowledge Tracing (DKT) replaces BKT                │
│  + IRT item calibration from pooled student data            │
│  + A/B testing framework for algorithm improvements         │
└─────────────────────────────────────────────────────────────┘
```

---

## 6. Technology Stack Summary

| Layer | Technology | Justification |
|---|---|---|
| Frontend Framework | Next.js 14+ (App Router) | SSR for SEO, React ecosystem, TypeScript |
| UI Styling | Tailwind CSS + shadcn/ui | Rapid prototyping, consistent design |
| Math Rendering | KaTeX | Fast client-side LaTeX rendering |
| Charts | Recharts | React-based charting for dashboards |
| Authentication | Firebase Auth | Google Sign-In, simple integration |
| Database | Cloud Firestore | Real-time sync, offline support, serverless |
| Serverless Functions | Firebase Cloud Functions (Node.js 20) | Low latency, Firebase integration |
| LLM (V1) | Google Gemini 2.0 Flash | Low cost, fast, good multilingual |
| SLM (V2) | Qwen2.5-3B-Instruct (QLoRA fine-tuned) | Best math+Spanish at 3B, Apache 2.0 |
| SLM Hosting | HF Inference Endpoints (T4, scale-to-zero) | Cost-effective, no infra management |
| Adaptive Engine | Client-side TypeScript | Zero-latency decisions, works offline |
| State Management | Zustand + Firestore sync | Lightweight, persists across sessions |
| Testing | Vitest + Playwright | Unit + E2E testing |
| CI/CD | GitHub Actions | Automated testing + Firebase deploy |
| Monitoring | Firebase Analytics + Crashlytics | User behavior + error tracking |

---

## 7. Security & Privacy Considerations

### 7.1 Data Protection
- **COPPA Compliance**: Students are minors (ages 11-14). No personally identifiable information stored beyond email/display name. No third-party tracking.
- **FERPA Alignment**: Performance data (Elo, LDS, MCS) is associated with uid only. Teachers/admins see aggregate data, never individual student identifiers.
- **Data Encryption**: Firestore encrypts at rest (AES-256). All API calls over HTTPS/TLS 1.3.

### 7.2 API Security
- Firebase Auth tokens required for all Cloud Function calls
- Gemini/SLM API keys stored in Firebase environment secrets (never client-side)
- Rate limiting on Cloud Functions to prevent abuse (max 10 scaffold generations per minute per user)
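
The per-user limit in the last bullet could be enforced with a sliding-window counter along the following lines. This is a sketch under stated assumptions: the class name is hypothetical, and the counter lives in process memory, whereas Cloud Functions instances do not share memory, so a real deployment would need to keep the counts in a shared store such as Firestore.

```typescript
// Allows at most `limit` calls per user within any `windowMs` interval.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly limit: number;
  private readonly windowMs: number;

  constructor(limit = 10, windowMs = 60_000) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the user is under the limit.
  allow(uid: string, now: number = Date.now()): boolean {
    // Keep only the timestamps that are still inside the window.
    const recent = (this.hits.get(uid) ?? []).filter((t) => now - t < this.windowMs);
    if (recent.length >= this.limit) {
      this.hits.set(uid, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(uid, recent);
    return true;
  }
}
```

A scaffold-generation handler would call `allow(uid)` before invoking the LLM and reject the request when it returns `false`.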

### 7.3 Content Safety
- All LLM-generated scaffolds pass through a validation function checking:
  - Mathematical accuracy (answer matches expected)
  - Appropriate content (no adult/violent themes)
  - Language accuracy (Spanish translation verified against expected pattern)
- Questions from the curated database are pre-reviewed; generated questions flagged for human review

---

## 8. Performance Targets

| Metric | Target | Measurement |
|---|---|---|
| Time to first problem display | < 2 seconds | Lighthouse / Firebase Performance |
| Adaptive decision latency | < 50ms | Client-side (no network) |
| Scaffold generation (Gemini) | < 1.5 seconds | Cloud Function logs |
| Scaffold generation (SLM) | < 800ms | HF Inference EP metrics |
| Batch prefetch trigger → ready | < 5 seconds | 20 questions fetched at Q17 |
| Offline capability | Full session | After initial batch load |
| Concurrent users (V1) | 50 | Firebase free/Blaze tier |
| Concurrent users (V2) | 500+ | HF auto-scaling endpoint |