---
title: Exam Simulator
emoji: 
colorFrom: indigo
colorTo: gray
sdk: gradio
sdk_version: 5.45.0
app_file: app.py
pinned: false
---

# SPM Exam Simulator (local PDF → JSON → SQLite → Simulator)

This repo provides:
- `ocr_agent.py` — PDF → JSON (extracts text + images).
- `merge_questions.py` — merge processed JSONs into `questions.json` and populate `exam.db`.
- `app.py` — Gradio-based Paper 2 (MCQ) simulator that shows questions + diagrams, lets students select A/B/C/D, and reveals score + AI advice after submission.
- `agents.py` — Analyzer & Coach (ZhipuAI) wrappers.
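
The JSON → SQLite step can be pictured with a minimal sketch. The table name, column names, and the `load_questions` helper below are hypothetical, chosen only for illustration; the actual schema written by `merge_questions.py` is not described here and may differ.

```python
import json
import sqlite3


def load_questions(db_path, questions):
    """Insert question dicts into a hypothetical `questions` table; return row count."""
    conn = sqlite3.connect(db_path)
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "CREATE TABLE IF NOT EXISTS questions ("
            "id INTEGER PRIMARY KEY, subject TEXT, year INTEGER, "
            "number INTEGER, text TEXT, answer TEXT)"
        )
        # Named placeholders are filled from the keys of each dict.
        conn.executemany(
            "INSERT INTO questions (subject, year, number, text, answer) "
            "VALUES (:subject, :year, :number, :text, :answer)",
            questions,
        )
    count = conn.execute("SELECT COUNT(*) FROM questions").fetchone()[0]
    conn.close()
    return count


if __name__ == "__main__":
    # Assumed record shape, for illustration only.
    sample = json.loads(
        '[{"subject": "physics", "year": 2021, "number": 1,'
        ' "text": "Which quantity is a vector?", "answer": "B"}]'
    )
    print(load_questions(":memory:", sample))
```

Using `:memory:` keeps the example from touching a real `exam.db`; pass a file path to persist the rows.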

## Quick overview
1. Put your PDFs under a local folder (recommended: `data/raw/<subject>/<year>/paper2.pdf`).
2. Run OCR (`ocr_agent.py`) to produce processed JSON files.
3. Run `merge_questions.py` to create `questions.json` and populate `exam.db`.
4. Start the simulator: `python app.py`.
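
Before running the steps above, it can help to check which PDFs the recommended folder layout actually contains. The following is a small sketch, not part of the repo: the `find_papers` helper is hypothetical, and it assumes the `data/raw/<subject>/<year>/` layout suggested in step 1.

```python
from pathlib import Path


def find_papers(raw_root):
    """Return (subject, year, pdf_path) for each PDF exactly two folders below raw_root."""
    root = Path(raw_root)
    papers = []
    # The pattern mirrors the recommended layout: <subject>/<year>/<file>.pdf
    for pdf in sorted(root.glob("*/*/*.pdf")):
        year_dir = pdf.parent
        papers.append((year_dir.parent.name, year_dir.name, pdf))
    return papers


if __name__ == "__main__":
    for subject, year, pdf in find_papers("data/raw"):
        print(f"{subject} {year}: {pdf}")
```

PDFs placed at any other depth are skipped, which makes misplaced files easy to spot before OCR is run.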

## Requirements
- Python 3.10+ (the Docker image used Python 3.10)
- Install Tesseract OCR (system binary).
  - **Windows**: download and install Tesseract for Windows (e.g. the UB Mannheim build)
  - **Linux (Debian/Ubuntu)**: `sudo apt-get install tesseract-ocr`
  - **macOS**: `brew install tesseract`
- Install Python packages:
  ```bash
  python -m pip install -r requirements.txt