# MathLingua: System Architecture Document
## 1. System Overview
MathLingua is a bilingual adaptive math tutoring application for Spanish-speaking students (grades 6–8) transitioning to English-medium mathematics education. The system presents math word problems with four scaffolded hint levels (L1 to L4) and personalizes difficulty progression with a hybrid adaptive algorithm that combines Elo rating, Bayesian Knowledge Tracing (BKT), and Thompson Sampling.
```
┌──────────────────────────────────────────────────────────────────┐
│                        MathLingua System                         │
│                                                                  │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┐  │
│  │   Frontend   │   │   Backend    │   │  External Services   │  │
│  │  (Next.js)   │◄─►│  (Firebase)  │◄─►│     (LLM / SLM)      │  │
│  └──────┬───────┘   └──────┬───────┘   └──────────┬───────────┘  │
│         │                  │                      │              │
│         ▼                  ▼                      ▼              │
│  ┌──────────────┐   ┌──────────────┐   ┌──────────────────────┐  │
│  │   Adaptive   │   │  Firestore   │   │ V1: Gemini API       │  │
│  │    Engine    │   │   Database   │   │ V2: Qwen2.5-3B SLM   │  │
│  │ (Client JS)  │   │              │   │ (HF Inference EP)    │  │
│  └──────────────┘   └──────────────┘   └──────────────────────┘  │
└──────────────────────────────────────────────────────────────────┘
```
---
## 2. Component Architecture
### 2.1 Frontend: React / Next.js Application
**Technology**: Next.js 14+ (App Router), TypeScript, Tailwind CSS
**Hosting**: Firebase Hosting or Vercel
#### Key Pages/Routes
| Route | Component | Purpose |
|---|---|---|
| `/` | `LandingPage` | Login/signup, language preference |
| `/dashboard` | `StudentDashboard` | Progress overview, session history, MCS/LDS charts |
| `/practice` | `PracticeSession` | Adaptive practice from question database |
| `/solve` | `CustomProblem` | "Input your question": Gemini/SLM processes user-submitted problems |
| `/session-report` | `SessionReport` | End-of-session summary with performance analytics |
#### Core Frontend Components
```
src/
├── components/
│   ├── ProblemDisplay/
│   │   ├── MathProblem.tsx          # Renders word problem text
│   │   ├── HintScaffold.tsx         # L1/L2/L3/L4 progressive hint UI
│   │   ├── AnswerInput.tsx          # Numeric/expression answer entry
│   │   └── SolutionReveal.tsx       # L4 step-by-step solution display
│   ├── Adaptive/
│   │   ├── DifficultyIndicator.tsx  # Visual current-level indicator
│   │   ├── ProgressBar.tsx          # Session progress (e.g., 7/20)
│   │   └── SessionTimer.tsx         # Time tracking per problem
│   ├── Dashboard/
│   │   ├── EloChart.tsx             # Elo rating over time (Recharts)
│   │   ├── TopicHeatmap.tsx         # Performance by math topic
│   │   ├── LDSMCSPanel.tsx          # Language Dependency & Math Confidence
│   │   └── StreakBadge.tsx          # Gamification elements
│   └── Shared/
│       ├── BilingualToggle.tsx      # EN/ES interface language switch
│       ├── MathRenderer.tsx         # KaTeX for math expressions
│       └── LoadingSkeleton.tsx
├── lib/
│   ├── adaptive-engine.ts           # Elo + BKT + Thompson Sampling (client-side)
│   ├── feature-engineer.ts          # LDS & MCS computation
│   ├── firebase.ts                  # Firebase SDK initialization
│   └── llm-client.ts                # Gemini/SLM API abstraction
├── hooks/
│   ├── useAdaptiveSession.ts        # Manages session state + engine calls
│   ├── useStudentProfile.ts         # Reads/writes Firestore student state
│   └── useQuestionQueue.ts          # Pre-fetches next batch of questions
└── types/
    └── index.ts                     # TypeScript interfaces for all data structures
```
#### Hint Scaffold UI Flow
```
┌──────────────────────────────────────┐
│  Problem displayed in original       │
│  English at student's current level  │
│                                      │
│  [Try to solve]   [I need a hint →]  │
└───────────────────┬──────────────────┘
                    │ click
                    ▼
┌──────────────────────────────────────┐
│  L1: Simplified English              │
│  "A store has 24 apples..."          │
│                                      │
│  [Got it!]   [Still stuck →]         │
└───────────────────┬──────────────────┘
                    │ click
                    ▼
┌──────────────────────────────────────┐
│  L2: Bilingual Keywords Inline       │
│  "A store has 24 apples (manzanas)"  │
│  "divided equally (dividido          │
│  igualmente) among 6 boxes"          │
│                                      │
│  [Got it!]   [Still stuck →]         │
└───────────────────┬──────────────────┘
                    │ click
                    ▼
┌──────────────────────────────────────┐
│  L3: Full Spanish Translation        │
│  "Una tienda tiene 24 manzanas       │
│  divididas igualmente entre 6        │
│  cajas. ¿Cuántas manzanas hay        │
│  en cada caja?"                      │
│                                      │
│  [Got it!]  [Show me the answer →]   │
└───────────────────┬──────────────────┘
                    │ click
                    ▼
┌──────────────────────────────────────┐
│  L4: Step-by-Step Solution           │
│  Step 1 (Identify): 24 ÷ 6           │
│  Step 2 (Calculate): 24 ÷ 6 = 4      │
│  Step 3 (Answer): 4 apples per box   │
│                                      │
│  [Next Problem →]                    │
└──────────────────────────────────────┘
```
Each hint interaction is logged with a timestamp so that `escalation_speed` and `scaffold_time_ratio` can be computed as inputs to the LDS formula.
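This document does not define the two timing features, so the following is only a rough sketch of how they could be derived from the logged timestamps. The definitions used here (hints requested per minute up to the last hint; share of problem time spent after the first hint) and the names `HintTiming`, `escalationSpeed`, and `scaffoldTimeRatio` are assumptions for illustration, not the project's actual feature definitions.

```typescript
// Hypothetical view of one logged interaction (timestamps in ms).
interface HintTiming {
  startTime: number;                      // problem first displayed
  endTime: number;                        // answer submitted
  hintTimestamps: Record<string, number>; // e.g. { L1: ts, L2: ts }
}

// Hint click times in chronological order.
function sortedHintTimes(log: HintTiming): number[] {
  return Object.keys(log.hintTimestamps)
    .map((k) => log.hintTimestamps[k])
    .sort((a, b) => a - b);
}

// Assumed definition: hints requested per minute between problem start
// and the last hint click. Returns 0 when no hint was used.
function escalationSpeed(log: HintTiming): number {
  const times = sortedHintTimes(log);
  if (times.length === 0) return 0;
  const elapsedMs = times[times.length - 1] - log.startTime;
  return elapsedMs > 0 ? times.length / (elapsedMs / 60000) : 0;
}

// Assumed definition: fraction of total problem time spent after the
// first hint was opened. Returns 0 when no hint was used.
function scaffoldTimeRatio(log: HintTiming): number {
  const times = sortedHintTimes(log);
  const total = log.endTime - log.startTime;
  if (times.length === 0 || total <= 0) return 0;
  return (log.endTime - times[0]) / total;
}

// Two hints within the first minute of a two-minute attempt.
const exampleLog: HintTiming = {
  startTime: 0,
  endTime: 120000,
  hintTimestamps: { L1: 30000, L2: 60000 },
};
console.log(escalationSpeed(exampleLog));   // 2 hints per minute
console.log(scaffoldTimeRatio(exampleLog)); // 0.75
```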
---
### 2.2 Adaptive Engine (Client-Side JavaScript)
The adaptive engine runs **entirely in the browser**, so difficulty decisions need no server round-trip. Feedback is immediate, and the engine keeps working offline once the initial question batch has loaded.
#### Engine Components
```
┌─────────────────────────────────────────────────┐
│          Adaptive Engine (client-side)          │
│                                                 │
│  ┌─────────────┐  ┌──────────┐  ┌────────────┐  │
│  │ Elo Rating  │  │ BKT      │  │ Thompson   │  │
│  │ System      │  │ Engine   │  │ Sampler    │  │
│  │             │  │          │  │            │  │
│  │ Updates     │  │ P(know)  │  │ Beta prior │  │
│  │ student &   │  │ per      │  │ per level, │  │
│  │ question    │  │ topic    │  │ ZPD window │  │
│  │ ratings     │  │          │  │            │  │
│  └──────┬──────┘  └────┬─────┘  └─────┬──────┘  │
│         │              │              │         │
│         ▼              ▼              ▼         │
│  ┌───────────────────────────────────────────┐  │
│  │           Decision Orchestrator           │  │
│  │                                           │  │
│  │  Input:  weighted_outcome, features       │  │
│  │  Output: next_level, decision_type        │  │
│  │          (increase/maintain/decrease)     │  │
│  └───────────────────────────────────────────┘  │
└─────────────────────────────────────────────────┘
```
#### Elo Update Formula
```
weighted_outcome = {
    no_hint: 1.00   (solved without any scaffold)
    L1_only: 0.75   (needed simplified English)
    L2_used: 0.50   (needed bilingual keywords)
    L3_used: 0.25   (needed full translation)
    L4_used: 0.00   (needed answer reveal)
}

E_student = 1 / (1 + 10^((R_question - R_student) / 400))
R_student_new = R_student + K × (weighted_outcome - E_student)

K = 32 (default), increased to 48 for first 10 interactions (cold-start acceleration)
```
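The update above can be sketched in a few lines of TypeScript. The function names are illustrative, and two points are assumptions rather than something this document states: an incorrect final answer is scored 0 regardless of hints, and "first 10 interactions" is read as `totalInteractions < 10`.

```typescript
// Outcome weights from the table above, indexed by the deepest hint
// level used (0 = no hint, 4 = answer reveal).
const OUTCOME_BY_MAX_HINT: number[] = [1.0, 0.75, 0.5, 0.25, 0.0];

// Assumption: an incorrect final answer scores 0 regardless of hints.
function weightedOutcome(maxHintLevel: number, isCorrect: boolean): number {
  if (!isCorrect) return 0;
  return OUTCOME_BY_MAX_HINT[maxHintLevel];
}

// Expected score of the student against the question (Elo logistic curve).
function expectedScore(rStudent: number, rQuestion: number): number {
  return 1 / (1 + Math.pow(10, (rQuestion - rStudent) / 400));
}

// K = 48 during cold start (assumed: fewer than 10 interactions), else 32.
function kFactor(totalInteractions: number): number {
  return totalInteractions < 10 ? 48 : 32;
}

// New student rating after one interaction.
function updateStudentElo(
  rStudent: number,
  rQuestion: number,
  outcome: number,
  totalInteractions: number
): number {
  const k = kFactor(totalInteractions);
  return rStudent + k * (outcome - expectedScore(rStudent, rQuestion));
}

// Evenly matched student and question, solved without hints:
// E_student = 0.5, so the gain is K × 0.5.
console.log(updateStudentElo(1000, 1000, weightedOutcome(0, true), 20)); // 1016
console.log(updateStudentElo(1000, 1000, weightedOutcome(0, true), 0));  // 1024
```

Because `weighted_outcome` is graded rather than binary, a student who needs the L2 scaffold on an evenly matched question scores exactly the expected 0.5 and keeps the same rating.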
#### BKT Parameters (per topic)
| Parameter | Symbol | Default | Description |
|---|---|---|---|
| Prior knowledge | P(L₀) | 0.10 | Initial probability student knows topic |
| Learn rate | P(T) | 0.15 | Probability of learning per opportunity |
| Slip | P(S) | 0.10 | Probability of incorrect despite knowing |
| Guess | P(G) | 0.25 | Probability of correct despite not knowing |
Slip is adjusted based on hint usage:
```
P(S)_adjusted = P(S) × (1 + 0.5 × hint_depth_normalized)
```
This models the intuition that using more scaffolds means apparent "correctness" is less certain.
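As a minimal sketch, the standard BKT posterior-then-learn update with the adjusted slip might look like the following. The default values come from the parameter table; treating `hint_depth_normalized` as `maxHintLevel / 4` is an assumption, as are the function and type names.

```typescript
interface BktParams {
  pLearn: number; // P(T)
  pSlip: number;  // P(S)
  pGuess: number; // P(G)
}

const DEFAULT_BKT: BktParams = { pLearn: 0.15, pSlip: 0.10, pGuess: 0.25 };
const P_L0 = 0.10; // prior P(L0) for a topic the student has not seen

// Assumption: hint_depth_normalized = maxHintLevel / 4, so it lies in [0, 1].
function adjustedSlip(pSlip: number, maxHintLevel: number): number {
  return pSlip * (1 + 0.5 * (maxHintLevel / 4));
}

// One BKT step: Bayesian posterior given the observation, then the
// learning transition P(L_new) = posterior + (1 - posterior) * P(T).
function bktUpdate(
  pKnow: number,
  isCorrect: boolean,
  maxHintLevel: number,
  params: BktParams = DEFAULT_BKT
): number {
  const s = adjustedSlip(params.pSlip, maxHintLevel);
  const g = params.pGuess;
  const posterior = isCorrect
    ? (pKnow * (1 - s)) / (pKnow * (1 - s) + (1 - pKnow) * g)
    : (pKnow * s) / (pKnow * s + (1 - pKnow) * (1 - g));
  return posterior + (1 - posterior) * params.pLearn;
}

// A correct answer moves P(know) up less when more hints were used.
console.log(bktUpdate(P_L0, true, 0));
console.log(bktUpdate(P_L0, true, 4));
```

With the defaults, a correct no-hint answer from the prior gives a posterior of 0.09 / 0.315 ≈ 0.286 and, after the learning step, P(know) ≈ 0.393.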
#### Thompson Sampling with ZPD Windowing
```
For each candidate level l in ZPD window [current - 2, current + 3]:
    sample θ_l ~ Beta(α_l, β_l)
    score_l = θ_l × proximity_bonus(l, target_elo)

Select level = argmax(score_l)

proximity_bonus(l, target) = exp(-0.5 × ((elo_l - target) / 100)²)
```
The ZPD window is asymmetric (+3 upward, -2 downward) to encourage upward progression while bounding how far the difficulty can move in a single step.
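The sampling rule above could be implemented roughly as follows. JavaScript has no built-in Beta sampler, so this sketch draws Beta variates from two Gamma draws (Marsaglia-Tsang method); that choice, the `LevelArm` shape, and the idea that each level carries a representative Elo value are assumptions made for illustration.

```typescript
type Rng = () => number; // uniform in [0, 1)

// Standard normal via Box-Muller.
function sampleNormal(rng: Rng): number {
  const u = 1 - rng(); // in (0, 1], avoids log(0)
  return Math.sqrt(-2 * Math.log(u)) * Math.cos(2 * Math.PI * rng());
}

// Gamma(shape, 1) via Marsaglia-Tsang; shapes below 1 use the
// identity Gamma(a) = Gamma(a + 1) * U^(1/a).
function sampleGamma(shape: number, rng: Rng): number {
  if (shape < 1) {
    return sampleGamma(shape + 1, rng) * Math.pow(1 - rng(), 1 / shape);
  }
  const d = shape - 1 / 3;
  const c = 1 / Math.sqrt(9 * d);
  for (;;) {
    const x = sampleNormal(rng);
    const v = Math.pow(1 + c * x, 3);
    if (v <= 0) continue;
    const u = 1 - rng();
    if (Math.log(u) < 0.5 * x * x + d - d * v + d * Math.log(v)) return d * v;
  }
}

// Beta(alpha, beta) as X / (X + Y) with X ~ Gamma(alpha), Y ~ Gamma(beta).
function sampleBeta(alpha: number, beta: number, rng: Rng): number {
  const x = sampleGamma(alpha, rng);
  const y = sampleGamma(beta, rng);
  return x / (x + y);
}

// Gaussian bonus with a 100-point width, as in the formula above.
function proximityBonus(eloLevel: number, targetElo: number): number {
  return Math.exp(-0.5 * Math.pow((eloLevel - targetElo) / 100, 2));
}

// Hypothetical per-level record: priors plus a representative Elo.
interface LevelArm {
  level: string;
  elo: number;
  alpha: number;
  beta: number;
}

// levels must be ordered easiest to hardest; the window is
// [currentIndex - 2, currentIndex + 3], clamped to the available levels.
function selectLevel(
  levels: LevelArm[],
  currentIndex: number,
  targetElo: number,
  rng: Rng = Math.random
): string {
  const lo = Math.max(0, currentIndex - 2);
  const hi = Math.min(levels.length - 1, currentIndex + 3);
  let best = levels[lo].level;
  let bestScore = -Infinity;
  for (let i = lo; i <= hi; i++) {
    const arm = levels[i];
    const score = sampleBeta(arm.alpha, arm.beta, rng) * proximityBonus(arm.elo, targetElo);
    if (score > bestScore) {
      bestScore = score;
      best = arm.level;
    }
  }
  return best;
}
```

Passing the random source as a parameter keeps the selector testable: a seeded generator can replace `Math.random` when reproducible behavior is needed.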
#### Progression Decision Rules
| Condition | Decision | Action |
|---|---|---|
| weighted_outcome ≥ 0.75 AND P(know) ≥ 0.70 | **Increase** | Move up 1 sub-level |
| weighted_outcome ≥ 0.85 AND streak ≥ 3 | **Skip** | Move up 2 sub-levels |
| 0.40 ≤ weighted_outcome < 0.75 | **Maintain** | Stay at current level |
| weighted_outcome < 0.40 OR streak_wrong ≥ 2 | **Decrease** | Move down 1 sub-level |
| weighted_outcome < 0.25 AND P(know) < 0.30 | **Rapid Decrease** | Move down 2 sub-levels |
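Several rows in the table can match at once (for example, Skip and Increase), and one case is not listed (a high outcome with P(know) below 0.70). The sketch below resolves this by checking the stronger rule first and falling back to "maintain"; that precedence and fallback are assumptions, not something the table specifies. The decision strings match the `decision` field in the API contract.

```typescript
type Decision = "increase" | "maintain" | "decrease" | "skip" | "rapid_decrease";

interface DecisionInput {
  weightedOutcome: number; // 0.0 to 1.0
  pKnow: number;           // BKT P(know) for the topic
  streak: number;          // consecutive strong outcomes
  streakWrong: number;     // consecutive wrong answers
}

// Assumed precedence: the more extreme rule wins when two rows match.
function decide(input: DecisionInput): Decision {
  const { weightedOutcome, pKnow, streak, streakWrong } = input;
  if (weightedOutcome < 0.25 && pKnow < 0.30) return "rapid_decrease";
  if (weightedOutcome < 0.40 || streakWrong >= 2) return "decrease";
  if (weightedOutcome >= 0.85 && streak >= 3) return "skip";
  if (weightedOutcome >= 0.75 && pKnow >= 0.70) return "increase";
  return "maintain"; // includes the unlisted high-outcome, low-P(know) case
}

// Sub-level movement for each decision, from the Action column.
const LEVEL_STEP: Record<Decision, number> = {
  skip: 2,
  increase: 1,
  maintain: 0,
  decrease: -1,
  rapid_decrease: -2,
};

const decision = decide({ weightedOutcome: 1.0, pKnow: 0.8, streak: 3, streakWrong: 0 });
console.log(decision, LEVEL_STEP[decision]); // skip 2
```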
---
### 2.3 Firebase Backend
**Services Used**:
- Firebase Authentication (Google Sign-In, Email/Password)
- Cloud Firestore (student state, question database, session logs)
- Cloud Functions (LLM API calls, batch question generation, session reports)
- Firebase Hosting (static frontend assets)
#### Firestore Data Model
```
firestore/
├── users/
│   └── {uid}/
│       ├── profile: {
│       │     displayName, email, gradeLevel, preferredLanguage,
│       │     createdAt, lastActive
│       │   }
│       ├── adaptiveState: {
│       │     currentElo: number,          // e.g., 1050
│       │     currentLevel: string,        // e.g., "2.1"
│       │     totalInteractions: number,
│       │     topicMastery: {              // BKT P(know) per topic
│       │       "arithmetic": 0.72,
│       │       "fractions": 0.45,
│       │       "algebra_basic": 0.31,
│       │       ...
│       │     },
│       │     thompsonPriors: {            // Beta(α,β) per level
│       │       "1.1": { alpha: 12, beta: 3 },
│       │       "1.2": { alpha: 8, beta: 5 },
│       │       ...
│       │     },
│       │     featureAverages: {
│       │       avgLDS: 0.42,
│       │       avgMCS: 0.61,
│       │       recentLDS_5: [0.3, 0.4, 0.5, 0.35, 0.45],
│       │       recentMCS_5: [0.6, 0.65, 0.58, 0.62, 0.7]
│       │     },
│       │     streakCount: number,
│       │     lastUpdated: timestamp
│       │   }
│       └── sessions/
│           └── {sessionId}/
│               ├── metadata: {
│               │     startTime, endTime, questionsAttempted,
│               │     questionsCorrect, avgWeightedOutcome,
│               │     startElo, endElo, sessionLDS, sessionMCS
│               │   }
│               └── interactions/
│                   └── {interactionId}: {
│                         questionId, level, topic,
│                         startTime, endTime, timeSpentMs,
│                         hintsUsed: [0,1,2,3,4],   // which levels accessed
│                         hintTimestamps: { L1: ts, L2: ts, ... },
│                         maxHintLevel: number,
│                         answer: string,
│                         isCorrect: boolean,
│                         attempts: number,
│                         weightedOutcome: number,
│                         lds: number,
│                         mcs: number,
│                         eloBeforeUpdate: number,
│                         eloAfterUpdate: number,
│                         adaptiveDecision: string
│                       }
│
├── questions/
│   └── {questionId}: {
│         id, level, topic, subtopic,
│         problemText, answer, answerNumeric,
│         solutionSteps: [...],
│         scaffolds: {
│           L1_simplified: string,
│           L2_bilingual: string,
│           L3_spanish: string,
│           L4_solution: string
│         },
│         readability: {
│           fleschKincaid: number,
│           wordCount: number,
│           difficultWords: number,
│           avgSyllables: number
│         },
│         eloRating: number,
│         timesServed: number,
│         avgOutcome: number,
│         metadata: {
│           source: "curated" | "generated",
│           generatedBy: "gemini-2.0" | "qwen2.5-3b" | null,
│           reviewedBy: string | null,
│           createdAt: timestamp
│         }
│       }
│
├── questionIndex/                  // Denormalized for fast queries
│   └── byLevel/
│       └── {level}: {
│             questionIds: [...],
│             count: number
│           }
│
└── analytics/                      // Aggregated (Cloud Functions)
    ├── dailyStats/
    │   └── {date}: { activeUsers, sessionsCompleted, ... }
    └── cohortProgress/
        └── {cohortId}: { avgElo, avgLDS, avgMCS, ... }
```
#### Firestore Security Rules
```javascript
rules_version = '2';
service cloud.firestore {
  match /databases/{database}/documents {
    // Users can only read/write their own data
    match /users/{uid}/{document=**} {
      allow read, write: if request.auth != null && request.auth.uid == uid;
    }

    // Questions are readable by all authenticated users
    match /questions/{questionId} {
      allow read: if request.auth != null;
      allow write: if false; // No client writes; Cloud Functions use the Admin SDK, which bypasses rules
    }

    // Question index readable by all authenticated users
    match /questionIndex/{document=**} {
      allow read: if request.auth != null;
      allow write: if false;
    }

    // Analytics are never accessible from clients
    match /analytics/{document=**} {
      allow read, write: if false; // Cloud Functions only (Admin SDK bypasses rules)
    }
  }
}
```
---
### 2.4 Cloud Functions (Serverless Backend)
```
functions/
├── onUserCreate.ts             # Initialize adaptive state for new user
├── generateScaffolds.ts        # Call Gemini/SLM to create L1-L4 for a problem
├── batchGenerateQuestions.ts   # Generate next 20 questions for session queue
├── processCustomProblem.ts     # "Input your question" flow
├── generateSessionReport.ts    # End-of-session analytics
├── updateQuestionStats.ts      # Update question difficulty from outcomes
└── scheduledAnalytics.ts       # Daily aggregation (cron-triggered)
```
#### Key Cloud Function: `generateScaffolds`
```typescript
// Triggered when student submits a custom problem or when
// pre-generating scaffolds for database questions
interface ScaffoldRequest {
  problemText: string;
  studentGradeLevel: number;
  currentLDS: number; // Informs simplification level
}

interface ScaffoldResponse {
  L1_simplified: string; // Simplified English
  L2_bilingual: string;  // English with inline Spanish keywords
  L3_spanish: string;    // Full Spanish translation
  L4_solution: string;   // Step-by-step solution
  answer: string;
  answerNumeric: number;
}

// Prompt template for LLM
const SCAFFOLD_PROMPT = `
You are a bilingual math tutor helping Spanish-speaking students
(grades 6-8) learn math in English.

Given this math word problem:
"{problemText}"

Generate 4 scaffold levels:

**L1 (Simplified English):** Rewrite using shorter sentences,
simpler vocabulary (grade {adjustedGrade} reading level).
Keep all math content identical.

**L2 (Bilingual Keywords):** Take the original problem and add
Spanish translations in parentheses for key math and context
vocabulary. Format: "English word (palabra en español)".

**L3 (Full Spanish Translation):** Translate the complete problem
to natural, grade-appropriate Spanish. Ensure mathematical
precision is maintained.

**L4 (Step-by-Step Solution):** Provide a clear, numbered
step-by-step solution in English with the final numerical answer.

Return as JSON with keys: L1_simplified, L2_bilingual, L3_spanish,
L4_solution, answer, answerNumeric.
`;
```
#### Key Cloud Function: `batchGenerateQuestions`
```typescript
import { onCall, HttpsError } from "firebase-functions/v2/https";

// Called when student reaches question 17 of 20 (prefetch trigger).
// Selects next 20 questions from database based on adaptive state.
// getAdaptiveState, thompsonSampleBatch, getRecentQuestionIds, selectQuestions,
// ensureScaffoldsGenerated, and generateId are project helpers defined elsewhere.
export const batchGenerateQuestions = onCall(async (request) => {
  // request.auth is undefined for unauthenticated callers
  if (!request.auth) {
    throw new HttpsError("unauthenticated", "Sign-in required.");
  }
  const { uid } = request.auth;
  const state = await getAdaptiveState(uid);

  // Thompson Sampling selects level distribution for next batch
  const levelDistribution = thompsonSampleBatch(
    state.thompsonPriors,
    state.currentLevel,
    { batchSize: 20 }
  );
  // e.g., { "2.1": 5, "2.2": 8, "2.3": 5, "2.4": 2 }

  // Select questions avoiding recently served ones
  const recentIds = await getRecentQuestionIds(uid, { lookback: 100 });
  const questions = await selectQuestions(levelDistribution, {
    excludeIds: recentIds,
    topicBalance: state.topicMastery, // Favor weaker topics
  });

  // Ensure all questions have scaffolds generated
  const withScaffolds = await ensureScaffoldsGenerated(questions);
  return { questions: withScaffolds, sessionBatchId: generateId() };
});
```
---
### 2.5 LLM Service Layer
#### V1: Gemini API (Current)
```
┌────────────┐     HTTPS/REST     ┌──────────────────┐
│   Cloud    │ ──────────────────►│  Google Gemini   │
│  Function  │ ◄──────────────────│  2.0 Flash API   │
└────────────┘                    └──────────────────┘

Cost: ~$0.075 per 1M input tokens, ~$0.30 per 1M output tokens
Latency: 200-800ms per scaffold generation
Rate limit: 60 RPM (free tier), 1000 RPM (paid)
```
#### V2: Qwen2.5-3B SLM (Planned)
```
┌────────────┐     HTTPS/REST     ┌──────────────────────────┐
│   Cloud    │ ──────────────────►│  HF Inference Endpoint   │
│  Function  │ ◄──────────────────│  Qwen2.5-3B-Instruct     │
└────────────┘                    │  (QLoRA fine-tuned)      │
                                  │  GPU: T4 or L4           │
                                  └──────────────────────────┘

Cost: ~$0.60/hr (T4) or ~$1.04/hr (L4)
Latency: 100-400ms per scaffold generation
Rate limit: No provider quota (dedicated endpoint); throughput is bounded by the GPU
```
#### LLM Client Abstraction
```typescript
// lib/llm-client.ts: provider-agnostic interface
interface LLMProvider {
  generateScaffolds(problem: string, context: ScaffoldContext): Promise<ScaffoldResponse>;
  generateQuestion(level: string, topic: string): Promise<QuestionWithScaffolds>;
  validateAnswer(problem: string, studentAnswer: string, correctAnswer: string): Promise<AnswerValidation>;
}

class GeminiProvider implements LLMProvider { ... }
class QwenSLMProvider implements LLMProvider { ... }

// Factory with fallback
function createLLMClient(): LLMProvider {
  if (config.useSLM && config.slmEndpointAvailable) {
    return new QwenSLMProvider(config.slmEndpoint);
  }
  return new GeminiProvider(config.geminiApiKey);
}
```
---
### 2.6 SLM Fine-Tuning Pipeline
```
┌──────────────┐    ┌──────────────┐    ┌──────────────────┐
│   Training   │    │  Fine-Tune   │    │      Deploy      │
│  Data Prep   │───►│  QLoRA SFT   │───►│ HF Inference EP  │
└──────────────┘    └──────────────┘    └──────────────────┘

Step 1: Collect 2,000-5,000 scaffold examples from Gemini V1 usage
Step 2: Human review + quality filter → ~1,500 gold examples
Step 3: QLoRA fine-tune Qwen2.5-3B-Instruct
Step 4: Evaluate on held-out test set (BLEU, math accuracy, readability)
Step 5: Deploy to HF Inference Endpoint
Step 6: Shadow-test alongside Gemini (serve both, compare quality)
Step 7: Full cutover when SLM matches Gemini quality
```
**Fine-tuning Configuration:**
| Parameter | Value | Rationale |
|---|---|---|
| Base model | Qwen2.5-3B-Instruct | Best math+Spanish at 3B scale |
| Method | QLoRA (4-bit NF4) | Fits single 16GB GPU |
| LoRA rank (r) | 32 | Balance quality/efficiency for small dataset |
| LoRA alpha | 64 | Standard 2× rank |
| Target modules | q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj | Full attention + MLP |
| Learning rate | 2e-4 | Standard for QLoRA |
| Epochs | 3-5 | Small dataset, monitor val loss |
| Batch size | 4 (effective 16 with grad accum) | Memory constraint |
| Max sequence length | 1024 | Sufficient for problem + all 4 scaffolds |
| Warmup ratio | 0.05 | Short warmup for small dataset |
---
## 3. Data Flow Diagrams
### 3.1 Flow A: "Practice Problems" Mode
```
Student clicks "Start Practice"
             │
             ▼
┌─────────────────────────────────┐
│ 1. Load adaptive state from     │
│    Firestore (Elo, BKT, priors) │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 2. Thompson Sampling selects    │
│    next question level          │
│    (ZPD window: current -2/+3)  │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 3. Fetch question from Firestore│
│    by level + topic balancing   │
│    (avoid recently served)      │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 4. Display problem, start timer │
│    Student reads and attempts   │
└────────────┬────────────────────┘
             │
     ┌───────┴──────────┐
     │   Needs hints?   │
     ▼ No               ▼ Yes
┌─────────┐   ┌───────────────────┐
│ Submit  │   │ L1 → L2 → L3 → L4 │
│ answer  │   │ (each click logged│
└────┬────┘   │  with timestamp)  │
     │        └─────────┬─────────┘
     │                  │
     ▼                  ▼
┌─────────────────────────────────┐
│ 5. Compute weighted_outcome     │
│    based on correctness + hints │
│    Compute LDS and MCS          │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 6. Update Elo (student + Q)     │
│    Update BKT P(know) for topic │
│    Update Thompson Beta priors  │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 7. Progression decision:        │
│   increase / maintain / decrease│
│    Select next level            │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 8. Save interaction to Firestore│
│    Display "Next Problem"       │
└────────────┬────────────────────┘
             │
       ┌─────┴──────────┐
       │   Q17 of 20?   │
       ▼ Yes            ▼ No
┌─────────────┐    ┌──────────┐
│ Prefetch    │    │ Loop to  │
│ next batch  │    │ step 2   │
│ (Cloud Fn)  │    └──────────┘
└──────┬──────┘
       │ At Q20:
       ▼
┌─────────────────────────────────┐
│ 9. Generate session report      │
│    (Cloud Function)             │
│    Show summary to student      │
└─────────────────────────────────┘
```
### 3.2 Flow B: "Input Your Question" Mode
```
Student types/pastes a math word problem
             │
             ▼
┌─────────────────────────────────┐
│ 1. Cloud Function:              │
│    processCustomProblem         │
│    - Validate it's a math       │
│      word problem               │
│    - Extract answer/solution    │
│    - Call Gemini/SLM to generate│
│      L1, L2, L3, L4 scaffolds   │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 2. Estimate difficulty level    │
│    using readability metrics    │
│    (FK grade, word count, etc.) │
│    Map to nearest Elo rating    │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 3. Display problem with         │
│    scaffold buttons active      │
│    (same UI as Practice mode)   │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 4. Student interacts, solves    │
│    Same hint tracking as        │
│    Practice mode                │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 5. Update adaptive state        │
│    (Elo, BKT, Thompson)         │
│    Log interaction              │
└────────────┬────────────────────┘
             │
             ▼
┌─────────────────────────────────┐
│ 6. Offer: "Try another?" or     │
│    "Switch to Practice Mode"    │
│    (where engine auto-selects)  │
└─────────────────────────────────┘
```
---
## 4. API Contracts
### 4.1 Client → Cloud Functions
```typescript
// POST /generateScaffolds
interface GenerateScaffoldsRequest {
  problemText: string;
  gradeLevel: number;       // 6, 7, or 8
  currentLDS: number;       // 0.0-1.0, informs simplification
}

interface GenerateScaffoldsResponse {
  scaffolds: {
    L1_simplified: string;
    L2_bilingual: string;
    L3_spanish: string;
    L4_solution: string;
  };
  answer: string;
  answerNumeric: number;
  estimatedLevel: string;   // e.g., "2.3"
  estimatedElo: number;     // e.g., 1100
  processingTimeMs: number;
}

// POST /batchGenerateQuestions
interface BatchRequest {
  batchSize: number;        // default 20
  // Auth token provides uid → adaptive state looked up server-side
}

interface BatchResponse {
  questions: QuestionWithScaffolds[];
  sessionBatchId: string;
}

// POST /submitInteraction
interface InteractionSubmission {
  sessionId: string;
  questionId: string;
  answer: string;
  isCorrect: boolean;
  timeSpentMs: number;
  hintsUsed: number[];      // [0], [0,1], [0,1,2], etc.
  hintTimestamps: Record<string, number>;
  attempts: number;
}

interface InteractionResponse {
  weightedOutcome: number;
  lds: number;
  mcs: number;
  newElo: number;
  newLevel: string;
  decision: "increase" | "maintain" | "decrease" | "skip" | "rapid_decrease";
  nextQuestion: QuestionWithScaffolds; // Pre-selected
}

// POST /generateSessionReport
interface SessionReportRequest {
  sessionId: string;
}

interface SessionReportResponse {
  summary: {
    questionsAttempted: number;
    questionsCorrect: number;
    avgWeightedOutcome: number;
    eloChange: number;
    topicsStrong: string[];
    topicsWeak: string[];
    avgLDS: number;
    avgMCS: number;
    languageProgressNote: string; // Generated text about L2 progress
  };
  recommendations: string[];      // e.g., "Focus on fractions vocabulary"
}
```
---
## 5. Deployment Architecture
### 5.1 V1 Deployment (MVP)
```
┌────────────────────────────────────────────────────────────┐
│                      Firebase Project                      │
│                                                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │  Firebase    │  │  Cloud       │  │  Cloud           │  │
│  │  Hosting     │  │  Firestore   │  │  Functions       │  │
│  │  (Next.js)   │  │  (Database)  │  │  (Node.js 20)    │  │
│  │              │  │              │  │                  │  │
│  │  Static +    │  │  Student     │  │  LLM calls       │  │
│  │  SSR pages   │  │  state,      │  │  Batch gen       │  │
│  │              │  │  questions,  │  │  Reports         │  │
│  │              │  │  sessions    │  │                  │  │
│  └──────────────┘  └──────────────┘  └────────┬─────────┘  │
│                                               │            │
└───────────────────────────────────────────────┼────────────┘
                                                │
                                          HTTPS │
                                                ▼
                                       ┌──────────────────┐
                                       │  Google Gemini   │
                                       │  2.0 Flash API   │
                                       └──────────────────┘

Estimated monthly cost (100 students, 5 sessions/week):
- Firebase Hosting: Free tier (~$0)
- Firestore: ~$5/mo (reads/writes within free tier mostly)
- Cloud Functions: ~$10/mo (invocations + compute)
- Gemini API: ~$15-25/mo (scaffold generation)
- Total: ~$30-40/mo
```
### 5.2 V2 Deployment (SLM)
```
┌────────────────────────────────────────────────────────────┐
│                      Firebase Project                      │
│                                                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │  Firebase    │  │  Cloud       │  │  Cloud           │  │
│  │  Hosting     │  │  Firestore   │  │  Functions       │  │
│  └──────────────┘  └──────────────┘  └────────┬─────────┘  │
│                                               │            │
└───────────────────────────────────────────────┼────────────┘
                                                │
                  ┌─────────────────────────────┤
                  │                             │
                  ▼                             ▼
         ┌──────────────────┐           ┌──────────────┐
         │  HF Inference    │           │  Gemini API  │
         │  Endpoint        │           │  (fallback)  │
         │  Qwen2.5-3B      │           └──────────────┘
         │  QLoRA FT        │
         │  (T4 GPU)        │
         └──────────────────┘

Shadow testing: both are called, the SLM response is served,
and the Gemini response is logged for QA.

Estimated monthly cost (100 students):
- Firebase: ~$15/mo (same as V1)
- HF Inference Endpoint (T4, scale-to-zero): ~$50-100/mo
  (active only during school hours, ~8hrs/day × 20 days)
- Gemini fallback: ~$5/mo (only when SLM is cold)
- Total: ~$70-120/mo (but no per-token costs at scale)
```
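The endpoint figure can be sanity-checked with simple arithmetic. This sketch uses the hourly rates quoted in section 2.5 and the 8 hours × 20 days schedule from the estimate above; the function name is illustrative, and actual billing depends on how often the endpoint scales to zero.

```typescript
// Hourly GPU rates quoted in section 2.5 (USD).
const HOURLY_RATE_USD = { T4: 0.60, L4: 1.04 };

// Cost of keeping a dedicated endpoint up for a fixed daily schedule.
function monthlyEndpointCost(
  gpu: "T4" | "L4",
  hoursPerDay: number,
  daysPerMonth: number
): number {
  return HOURLY_RATE_USD[gpu] * hoursPerDay * daysPerMonth;
}

console.log(monthlyEndpointCost("T4", 8, 20).toFixed(2)); // "96.00"
console.log(monthlyEndpointCost("L4", 8, 20).toFixed(2)); // "166.40"
```

The T4 result sits at the top of the ~$50-100 range, which is consistent with the endpoint being fully active for the whole school day; idle periods with scale-to-zero would pull the cost toward the lower end.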
### 5.3 V3 Deployment (Scale)
```
When student count exceeds 500+, migrate to:
┌────────────────────────────────────────────────────────────┐
│                                                            │
│  ┌──────────────┐  ┌──────────────┐  ┌──────────────────┐  │
│  │    Vercel    │  │   Firebase   │  │    Cloud Run     │  │
│  │  (Next.js)   │  │  Firestore   │  │   (API server)   │  │
│  └──────────────┘  └──────────────┘  └───────┬──────────┘  │
│                                              │             │
│                    ┌─────────────────────────┼──────┐      │
│                    │                         │      │      │
│                    ▼                         ▼      │      │
│           ┌──────────────────┐  ┌─────────────────┐ │      │
│           │ HF Inference EP  │  │  IRT/DKT Model  │ │      │
│           │    Qwen2.5-3B    │  │     Server      │ │      │
│           │  (Auto-scaling)  │  │ (Python/FastAPI)│ │      │
│           └──────────────────┘  └─────────────────┘ │      │
│                                                     │      │
│  + Deep Knowledge Tracing (DKT) replaces BKT        │      │
│  + IRT item calibration from pooled student data    │      │
│  + A/B testing framework for algorithm improvements │      │
└────────────────────────────────────────────────────────────┘
```
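To illustrate what "IRT item calibration from pooled student data" might involve, here is a minimal sketch using the standard two-parameter logistic (2PL) model, where the probability of a correct answer is `1 / (1 + exp(-a * (theta - b)))`. The document does not say which IRT model is planned, so the 2PL choice, the names `ItemParams`, `PooledResponse`, `pCorrect`, and `calibrationStep`, and the assumption that student abilities are already known are all illustrative.

```typescript
// Hypothetical 2PL item parameters: a = discrimination, b = difficulty.
interface ItemParams { a: number; b: number }

// One pooled observation: a student's ability estimate and whether they answered correctly.
interface PooledResponse { theta: number; correct: boolean }

// 2PL probability that a student with ability theta answers the item correctly.
function pCorrect(theta: number, item: ItemParams): number {
  return 1 / (1 + Math.exp(-item.a * (theta - item.b)));
}

// One gradient-ascent step on the mean log-likelihood of the pooled responses.
// Assumes abilities are fixed; a full calibration would estimate them jointly.
function calibrationStep(
  item: ItemParams,
  responses: PooledResponse[],
  learningRate = 0.05,
): ItemParams {
  if (responses.length === 0) return item;
  let gradA = 0;
  let gradB = 0;
  for (const r of responses) {
    const err = (r.correct ? 1 : 0) - pCorrect(r.theta, item);
    gradA += err * (r.theta - item.b); // d(logL)/da
    gradB += -err * item.a;            // d(logL)/db
  }
  const n = responses.length;
  return {
    a: item.a + (learningRate * gradA) / n,
    b: item.b + (learningRate * gradB) / n,
  };
}
```

Repeating `calibrationStep` until the parameters stop changing gives a rough per-item fit; an item that students of average ability keep missing drifts toward a higher difficulty `b`.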
---
## 6. Technology Stack Summary
| Layer | Technology | Justification |
|---|---|---|
| Frontend Framework | Next.js 14+ (App Router) | SSR for SEO, React ecosystem, TypeScript |
| UI Styling | Tailwind CSS + shadcn/ui | Rapid prototyping, consistent design |
| Math Rendering | KaTeX | Fast client-side LaTeX rendering |
| Charts | Recharts | React-based charting for dashboards |
| Authentication | Firebase Auth | Google Sign-In, simple integration |
| Database | Cloud Firestore | Real-time sync, offline support, serverless |
| Serverless Functions | Firebase Cloud Functions (Node.js 20) | Low latency, Firebase integration |
| LLM (V1) | Google Gemini 2.0 Flash | Low cost, fast, good multilingual support |
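| SLM (V2) | Qwen2.5-3B-Instruct (QLoRA fine-tuned) | Strong math and Spanish performance for a 3B model; note that the 3B checkpoint is released under the Qwen Research License rather than Apache 2.0, so confirm license terms before deployment |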
| SLM Hosting | HF Inference Endpoints (T4, scale-to-zero) | Cost-effective, no infra management |
| Adaptive Engine | Client-side TypeScript | Zero-latency decisions, works offline |
| State Management | Zustand + Firestore sync | Lightweight, persists across sessions |
| Testing | Vitest + Playwright | Unit + E2E testing |
| CI/CD | GitHub Actions | Automated testing + Firebase deploy |
| Monitoring | Firebase Analytics + Crashlytics | User behavior + error tracking |
---
## 7. Security & Privacy Considerations
### 7.1 Data Protection
- **COPPA Compliance**: Students are minors (ages 11–14). No personally identifiable information is stored beyond email and display name. No third-party tracking.
- **FERPA Alignment**: Performance data (Elo, LDS, MCS) is associated with uid only. Teachers/admins see aggregate data, never individual student identifiers.
- **Data Encryption**: Firestore encrypts data at rest (AES-256). All API calls go over HTTPS/TLS 1.3.
### 7.2 API Security
- Firebase Auth tokens required for all Cloud Function calls
- Gemini/SLM API keys stored in Firebase environment secrets (never client-side)
- Rate limiting on Cloud Functions to prevent abuse (max 10 scaffold generations per minute per user)
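
The per-user limit above (10 scaffold generations per minute) could be enforced with a sliding-window counter. The sketch below is an assumption about one possible approach, not the project's implementation: `SlidingWindowRateLimiter` and `allow` are hypothetical names, and the in-memory `Map` only works within a single function instance, so a real Cloud Functions deployment would need a shared store.

```typescript
// Hypothetical sliding-window limiter keyed by Firebase uid.
class SlidingWindowRateLimiter {
  private readonly hits = new Map<string, number[]>();
  private readonly maxCalls: number;
  private readonly windowMs: number;

  constructor(maxCalls = 10, windowMs = 60000) {
    this.maxCalls = maxCalls;
    this.windowMs = windowMs;
  }

  // Returns true and records the call if the user is under the limit.
  allow(uid: string, nowMs: number = Date.now()): boolean {
    const cutoff = nowMs - this.windowMs;
    // Keep only timestamps that are still inside the window.
    const recent = (this.hits.get(uid) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.maxCalls) {
      this.hits.set(uid, recent);
      return false;
    }
    recent.push(nowMs);
    this.hits.set(uid, recent);
    return true;
  }
}
```

A function handler would call `limiter.allow(uid)` after verifying the Firebase Auth token and reject the request when it returns `false`.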
### 7.3 Content Safety
- All LLM-generated scaffolds pass through a validation function checking:
  - Mathematical accuracy (answer matches expected)
  - Appropriate content (no adult/violent themes)
  - Language accuracy (Spanish translation verified against the expected pattern)
- Questions from the curated database are pre-reviewed; generated questions flagged for human review
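
The three validation checks listed above might look roughly like the following. This is a sketch under stated assumptions: `GeneratedScaffold`, `validateScaffold`, the `BLOCKED_TERMS` list, and the `SPANISH_PATTERN` heuristic are placeholders invented for illustration, and the document does not specify what the "expected pattern" for Spanish actually is.

```typescript
// Hypothetical scaffold shape; field names are assumptions.
interface GeneratedScaffold {
  answer: number;
  hintEn: string;
  hintEs: string;
}

interface ValidationResult {
  ok: boolean;
  failures: string[];
}

// Placeholder blocklist; a real deployment would use a reviewed list or a moderation service.
const BLOCKED_TERMS = ["weapon", "alcohol", "violence"];

// Crude heuristic: Spanish text should contain an accented character,
// inverted punctuation, or a common Spanish function word.
const SPANISH_PATTERN = /[áéíóúñ¿¡]|\b(el|la|los|las|de|que|es|un|una|por)\b/i;

function validateScaffold(
  scaffold: GeneratedScaffold,
  expectedAnswer: number,
  tolerance = 1e-9,
): ValidationResult {
  const failures: string[] = [];

  // 1. Mathematical accuracy: the generated answer must match the expected one.
  if (Math.abs(scaffold.answer - expectedAnswer) > tolerance) {
    failures.push("math");
  }

  // 2. Appropriate content: reject hints containing any blocked term.
  const text = `${scaffold.hintEn} ${scaffold.hintEs}`.toLowerCase();
  if (BLOCKED_TERMS.some((term) => text.includes(term))) {
    failures.push("content");
  }

  // 3. Language accuracy: the Spanish hint must look like Spanish.
  if (!SPANISH_PATTERN.test(scaffold.hintEs)) {
    failures.push("language");
  }

  return { ok: failures.length === 0, failures };
}
```

A scaffold that returns `ok: false` could be discarded and regenerated, or queued for the human review already described for generated questions.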
---
## 8. Performance Targets
| Metric | Target | Measurement |
|---|---|---|
| Time to first problem display | < 2 seconds | Lighthouse / Firebase Performance |
| Adaptive decision latency | < 50ms | Client-side (no network) |
| Scaffold generation (Gemini) | < 1.5 seconds | Cloud Function logs |
| Scaffold generation (SLM) | < 800ms | HF Inference EP metrics |
| Batch prefetch trigger → ready | < 5 seconds | 20 questions fetched at Q17 |
| Offline capability | Full session | After initial batch load |
| Concurrent users (V1) | 50 | Firebase Spark (free) or Blaze tier |
| Concurrent users (V2) | 500+ | HF auto-scaling endpoint |
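
The batch prefetch row can be read as "when the student reaches question 17 of a 20-question batch, start fetching the next 20 in the background". The sketch below encodes that reading; it is an interpretation of the table, and `shouldPrefetch`, `BatchPrefetcher`, and the injected `fetchBatch` function are hypothetical names.

```typescript
const BATCH_SIZE = 20;  // questions per batch (from the table)
const PREFETCH_AT = 17; // position within the batch that triggers the prefetch

// questionNumber is 1-based across the whole session.
function shouldPrefetch(questionNumber: number): boolean {
  const positionInBatch = ((questionNumber - 1) % BATCH_SIZE) + 1;
  return positionInBatch === PREFETCH_AT;
}

class BatchPrefetcher<Q> {
  private pending: Promise<Q[]> | null = null;
  private readonly fetchBatch: (size: number) => Promise<Q[]>;

  constructor(fetchBatch: (size: number) => Promise<Q[]>) {
    this.fetchBatch = fetchBatch;
  }

  // Call whenever a question is displayed; starts at most one fetch per batch.
  onQuestionShown(questionNumber: number): void {
    if (shouldPrefetch(questionNumber) && this.pending === null) {
      this.pending = this.fetchBatch(BATCH_SIZE);
    }
  }

  // Call when the current batch is exhausted; falls back to a direct fetch.
  async takeNextBatch(): Promise<Q[]> {
    const batch = await (this.pending ?? this.fetchBatch(BATCH_SIZE));
    this.pending = null;
    return batch;
  }
}
```

Starting the fetch three questions early is what gives the network call time to finish within the 5-second target before the student reaches the end of the batch.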