| title: MCQ Generator | |
| emoji: π | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: streamlit | |
| sdk_version: 1.33.0 | |
| app_file: app/main.py | |
| pinned: false | |
| # MCQ Generator: Automatic Multiple Choice Question Generator | |
| > **An end-to-end NLP pipeline that reads any text passage and automatically generates a complete multiple-choice quiz with scoring and explanations.** | |
| Built as a course project for an NLP curriculum covering Modules I–IV: tokenization, word embeddings, transformers, and natural language generation. | |
| --- | |
| ## Table of Contents | |
| 1. [What This Project Does](#what-this-project-does) | |
| 2. [Live Demo](#live-demo) | |
| 3. [How It Works – The Full Pipeline](#how-it-works--the-full-pipeline) | |
| 4. [NLP Techniques Used](#nlp-techniques-used) | |
| 5. [Project Structure](#project-structure) | |
| 6. [Each File Explained](#each-file-explained) | |
| 7. [Tech Stack](#tech-stack) | |
| 8. [Setup & Installation](#setup--installation) | |
| 9. [Running the App](#running-the-app) | |
| 10. [Testing Each Module](#testing-each-module) | |
| 11. [Sample Output](#sample-output) | |
| 12. [What Makes a Good Passage](#what-makes-a-good-passage) | |
| 13. [Known Limitations](#known-limitations) | |
| 14. [Future Work](#future-work) | |
| 15. [Related Research](#related-research) | |
| 16. [Course Outcomes Covered](#course-outcomes-covered) | |
| --- | |
| ## What This Project Does | |
| Given any factual text passage, this system: | |
| 1. **Extracts** the most important sentences using TF-IDF ranking | |
| 2. **Identifies** answer candidates using Named Entity Recognition (NER) | |
| 3. **Generates** natural language questions using a T5 transformer model | |
| 4. **Creates** plausible wrong options (distractors) using WordNet and NER | |
| 5. **Presents** an interactive quiz with scoring and per-question explanations | |
| **Example:** | |
| Input passage: | |
| ``` | |
| Albert Einstein was born on March 14, 1879, in Ulm, Germany. | |
| He was awarded the Nobel Prize in Physics in 1921 for his | |
| discovery of the photoelectric effect. | |
| ``` | |
| Generated MCQ: | |
| ``` | |
| Q: Where was Albert Einstein born? | |
| A. France | |
| B. Germany ✓ | |
| C. United States | |
| D. Switzerland | |
| ``` | |
| --- | |
| ## Live Demo | |
| ```bash | |
| streamlit run app/main.py | |
| ``` | |
| Opens at `http://localhost:8501` in your browser. | |
| --- | |
| ## How It Works – The Full Pipeline | |
| ``` | |
| Raw Text Passage | |
|                   │ | |
|                   ▼ | |
| ┌─────────────────────────────────────────────┐ | |
| │  STEP 1: PREPROCESSING (preprocessor.py)    │ | |
| │                                             │ | |
| │  • Split into sentences (spaCy)             │ | |
| │  • Rank by TF-IDF score (scikit-learn)      │ | |
| │  • Extract Named Entities (spaCy NER)       │ | |
| │  • Filter answer candidates (blacklist)     │ | |
| └─────────────────┬───────────────────────────┘ | |
|                   │ top sentences + answer candidates | |
|                   ▼ | |
| ┌─────────────────────────────────────────────┐ | |
| │  STEP 2: QUESTION GENERATION                │ | |
| │  (question_generator.py)                    │ | |
| │                                             │ | |
| │  • Highlight answer in sentence with <hl>   │ | |
| │  • Feed to T5 transformer model             │ | |
| │  • Generate 3 candidate questions           │ | |
| │  • Validate: reject circular/vague Qs       │ | |
| └─────────────────┬───────────────────────────┘ | |
|                   │ (question, answer) pairs | |
|                   ▼ | |
| ┌─────────────────────────────────────────────┐ | |
| │  STEP 3: DISTRACTOR GENERATION              │ | |
| │  (distractor_generator.py)                  │ | |
| │                                             │ | |
| │  Strategy 1: Same-type NER entities         │ | |
| │              from the passage               │ | |
| │  Strategy 2: WordNet hyponym siblings       │ | |
| │  Strategy 3: Cross-label fallback           │ | |
| └─────────────────┬───────────────────────────┘ | |
|                   │ 3 wrong options per question | |
|                   ▼ | |
| ┌─────────────────────────────────────────────┐ | |
| │  STEP 4: MCQ ASSEMBLY + VALIDATION          │ | |
| │  (mcq_builder.py)                           │ | |
| │                                             │ | |
| │  • Combine answer + distractors             │ | |
| │  • Shuffle options randomly                 │ | |
| │  • Quality gate: dedup, similarity check    │ | |
| │  • Return list of MCQ objects               │ | |
| └─────────────────┬───────────────────────────┘ | |
|                   │ validated MCQ list | |
|                   ▼ | |
| ┌─────────────────────────────────────────────┐ | |
| │  STEP 5: QUIZ UI + SCORING                  │ | |
| │  (app/main.py + evaluator.py)               │ | |
| │                                             │ | |
| │  • Streamlit 3-screen app                   │ | |
| │  • Input → Quiz → Results                   │ | |
| │  • Score, feedback, explanations            │ | |
| └─────────────────────────────────────────────┘ | |
| ``` | |
| --- | |
| ## NLP Techniques Used | |
| ### Module I: Foundational NLP | |
| | Technique | Where Used | Purpose | | |
| |---|---|---| | |
| | Tokenization | `preprocessor.py` | Split text into sentences and tokens using spaCy | | |
| | Lemmatization | `preprocessor.py` | Normalize word forms for TF-IDF | | |
| | Stop word removal | `preprocessor.py` | Filter noise before TF-IDF scoring | | |
| | Named Entity Recognition (NER) | `preprocessor.py` | Find PERSON, ORG, DATE, GPE as answer candidates | | |
| | POS Tagging | `preprocessor.py` | Identify nouns and proper nouns | | |
| | WordNet | `distractor_generator.py` | Find semantically related words as distractors | | |
| | Synsets / Hyponyms | `distractor_generator.py` | Navigate WordNet hierarchy for same-category words | | |
| ### Module II: Word Representation | |
| | Technique | Where Used | Purpose | | |
| |---|---|---| | |
| | TF-IDF | `preprocessor.py` | Rank sentences by information density | | |
| | Word Embeddings (GloVe) | `distractor_generator.py` | Optional cosine-similarity based distractor finding | | |
| **TF-IDF explained:** | |
| - **TF (Term Frequency)** = how often a word appears in *this* sentence | |
| - **IDF (Inverse Document Frequency)** = how rare the word is across *all* sentences | |
| - High TF-IDF score = sentence contains rare, informative words → good question source | |
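The scoring idea above can be sketched without any dependencies. This is an illustration only: the project itself uses scikit-learn's vectorizer, and the function names `tokenize` and `rank_sentences_tfidf`, the smoothing term, and the length normalization are assumptions made for this example.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    """Lowercase a sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def rank_sentences_tfidf(sentences, top_n=3):
    """Return the top_n sentences ordered by their summed TF-IDF weight."""
    tokenized = [tokenize(s) for s in sentences]
    n_docs = len(tokenized)

    # Document frequency: in how many sentences does each term occur?
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scored = []
    for sentence, tokens in zip(sentences, tokenized):
        term_freq = Counter(tokens)
        score = 0.0
        for term, count in term_freq.items():
            tf = count / len(tokens)                        # frequency in this sentence
            idf = math.log(n_docs / doc_freq[term]) + 1.0   # rarity across all sentences
            score += tf * idf
        scored.append((score, sentence))

    # Highest score first; each sentence is treated as its own "document".
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [sentence for _, sentence in scored[:top_n]]
```

A sentence made entirely of words that occur nowhere else in the passage ranks above sentences that share common words with their neighbours.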
| ### Module III: Deep Learning for NLP | |
| | Technique | Where Used | Purpose | | |
| |---|---|---| | |
| | Transformers | `question_generator.py` | T5 model for question generation | | |
| | Transfer Learning | `question_generator.py` | Using pre-trained T5 fine-tuned on SQuAD | | |
| | Seq2Seq | `question_generator.py` | Encoder-decoder architecture of T5 | | |
| | Beam Search | `question_generator.py` | Generate multiple question candidates, pick best | | |
| ### Module IV: Advanced NLP | |
| | Technique | Where Used | Purpose | | |
| |---|---|---| | |
| | T5 (Text-to-Text Transfer Transformer) | `question_generator.py` | Text-to-text model fine-tuned for question generation | | |
| | Natural Language Generation (NLG) | `question_generator.py` | Generating grammatical questions | | |
| | Subword Tokenization (SentencePiece) | `question_generator.py` | T5's tokenizer handles rare/unknown words | | |
| | Pre-trained Models | `question_generator.py` | `valhalla/t5-small-qg-hl` from HuggingFace | | |
| --- | |
| ## Project Structure | |
| ``` | |
| mcq_generator/ | |
| │ | |
| ├── src/                          # Core NLP pipeline modules | |
| │   ├── __init__.py | |
| │   ├── preprocessor.py           # Text cleaning, TF-IDF, NER, answer extraction | |
| │   ├── question_generator.py     # T5-based question generation | |
| │   ├── distractor_generator.py   # WordNet + NER distractor generation | |
| │   ├── mcq_builder.py            # Pipeline orchestrator + MCQ dataclass | |
| │   └── evaluator.py              # Answer checking and scoring | |
| │ | |
| ├── app/                          # Streamlit web application | |
| │   ├── __init__.py | |
| │   ├── main.py                   # 3-screen app: input → quiz → results | |
| │   └── components.py             # Reusable UI components | |
| │ | |
| ├── data/ | |
| │   └── sample_passages.json      # 5 test passages (ISRO, Gandhi, AI, etc.) | |
| │ | |
| ├── models/                       # (gitignored) Downloaded model files | |
| │   └── README.md | |
| │ | |
| ├── notebooks/                    # Jupyter notebooks for exploration | |
| │ | |
| ├── config.py                     # All settings in one place | |
| ├── requirements.txt              # Python dependencies | |
| └── README.md                     # This file | |
| ``` | |
| --- | |
| ## Each File Explained | |
| ### `config.py` | |
| Central settings file. Every other module imports from here. | |
| - Model name, number of questions, sentence count, file paths | |
| - Change values here to tune the entire system without touching logic files | |
| ### `src/preprocessor.py` | |
| The NLP foundation of the project. | |
| **Key functions:** | |
| - `extract_sentences(text)`: spaCy sentence boundary detection | |
| - `rank_sentences(sentences)`: TF-IDF scoring, returns top N most informative sentences | |
| - `extract_answer_candidates(sentence)`: NER-based extraction with strict quality filters | |
| - `preprocess(text)`: full pipeline, returns structured dict | |
| **Design decisions:** | |
| - Only `PERSON`, `ORG`, `GPE`, `DATE`, `EVENT`, `WORK_OF_ART` NER labels are accepted as answers | |
| - A `BLACKLIST` of 30+ generic words ("annual", "various", "Moon") prevents trivial answers | |
| - Answers are sorted by priority: PERSON > ORG/GPE > DATE > others | |
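The label filter, blacklist, and priority ordering described above could look roughly like the following. It is a sketch under assumptions: entities are passed in as plain `(text, label)` pairs (for example taken from spaCy's `doc.ents`), the blacklist shown is a three-word subset, and `filter_answer_candidates` is a hypothetical name rather than the project's function.

```python
ALLOWED_LABELS = {"PERSON", "ORG", "GPE", "DATE", "EVENT", "WORK_OF_ART"}

# Illustrative subset only; the project's BLACKLIST has 30+ entries.
BLACKLIST = {"annual", "various", "moon"}

# Lower number = higher priority: PERSON > ORG/GPE > DATE > others.
LABEL_PRIORITY = {"PERSON": 0, "ORG": 1, "GPE": 1, "DATE": 2}


def filter_answer_candidates(entities):
    """Keep allowed, non-blacklisted entities and sort them by label priority.

    `entities` is a list of (text, label) pairs.
    """
    kept = []
    for text, label in entities:
        cleaned = text.strip()
        if label not in ALLOWED_LABELS:
            continue
        if cleaned.lower() in BLACKLIST:
            continue
        kept.append((cleaned, label))
    # sorted() is stable, so entities with equal priority keep passage order.
    return sorted(kept, key=lambda ent: LABEL_PRIORITY.get(ent[1], 3))
```

For the ISRO sample passage, a `PERSON` such as "Vikram Sarabhai" would be tried as an answer before "ISRO" (`ORG`) or "1969" (`DATE`).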
| ### `src/question_generator.py` | |
| Uses the `valhalla/t5-small-qg-hl` model, a T5-small fine-tuned on SQuAD for question generation. | |
| **How T5 QG works:** | |
| ``` | |
| Input: "generate question: ISRO was founded in <hl> 1969 <hl> by Vikram Sarabhai." | |
| Output: "In what year was ISRO founded?" | |
| ``` | |
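Building that input string is plain string manipulation. The helper below is a minimal sketch; `build_qg_input` is a hypothetical name, and highlighting only the first occurrence of the answer is an assumption of this example.

```python
def build_qg_input(sentence, answer):
    """Wrap the first occurrence of `answer` in <hl> markers and add the task prefix."""
    start = sentence.find(answer)
    if start == -1:
        raise ValueError(f"answer {answer!r} not found in sentence")
    end = start + len(answer)
    highlighted = f"{sentence[:start]}<hl> {answer} <hl>{sentence[end:]}"
    return f"generate question: {highlighted}"
```

The resulting string is what gets tokenized and passed to the model; with Hugging Face Transformers, beam search with several candidates is typically requested through `generate(..., num_beams=5, num_return_sequences=3)`.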
| **Key functions:** | |
| - `highlight_answer(sentence, answer)`: wraps answer in `<hl>` tags | |
| - `generate_question(sentence, answer)`: beam search with 5 beams, 3 candidates | |
| - `answer_is_addressable(question, answer)`: rejects circular, vague, or short questions | |
| **Quality filters applied:** | |
| - Must start with a question word (what/who/when/where/which/how) | |
| - Answer must NOT appear in the question | |
| - Abbreviation trap detection (e.g. rejects Q: "What does ISRO stand for?" when A is the full name) | |
| - Minimum 5 words | |
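Applied literally, the four filters above might be sketched as follows. The function name `passes_question_filters` and the initials-based abbreviation check are assumptions for illustration; the project's own heuristics may differ.

```python
import re

QUESTION_WORDS = ("what", "who", "when", "where", "which", "how")
MIN_WORDS = 5


def passes_question_filters(question, answer):
    """Return True if the question satisfies all four quality filters."""
    words = re.findall(r"[A-Za-z0-9'-]+", question.lower())

    # Filter: minimum length.
    if len(words) < MIN_WORDS:
        return False
    # Filter: must start with a question word.
    if words[0] not in QUESTION_WORDS:
        return False
    # Filter: circularity, the answer must not appear in the question.
    if answer.lower() in question.lower():
        return False
    # Filter: abbreviation trap, the question must not contain the
    # answer's initials (e.g. "ISRO" for "Indian Space Research Organisation").
    initials = "".join(w[0] for w in answer.split() if w[0].isupper())
    if len(initials) >= 2 and re.search(rf"\b{re.escape(initials)}\b", question):
        return False
    return True
```

Each check returns early, so the cheapest tests (length, first word) run before the regular-expression search.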
| ### `src/distractor_generator.py` | |
| Generates 3 plausible wrong answer options. Uses a priority-based strategy chain. | |
| **Strategy 1 (same-label NER, best):** | |
| Finds other entities of the same NER type from the passage. | |
| ``` | |
| Answer: "1969" (DATE) → Distractors: ["1975", "2008", "2023"] (other DATEs in passage) | |
| Answer: "Vikram Sarabhai" (PERSON) → Distractors: ["Kalam", "Dhawan", "Nehru"] | |
| ``` | |
| **Strategy 2 (WordNet hyponyms):** | |
| Navigates the WordNet hierarchy to find sibling words in the same semantic category. | |
| ``` | |
| Answer: "India" → hypernym: "country" → hyponyms: ["China", "Brazil", "Pakistan"] | |
| ``` | |
| **Strategy 3 (cross-label fallback):** | |
| Uses any other named entity from the passage if strategies 1 and 2 fail. | |
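The priority chain can be sketched for the two passage-based strategies. This example deliberately leaves out the WordNet step (Strategy 2), and `pick_distractors` with its `(text, label)` input format is a hypothetical interface, not the project's.

```python
def pick_distractors(answer, answer_label, entities, k=3):
    """Choose up to k distractors from (text, label) entity pairs.

    Same-label entities are preferred (Strategy 1); entities with other
    labels only fill the remaining slots (Strategy 3).
    """
    same_label, other_label = [], []
    seen = {answer.lower()}  # never offer the answer itself as a distractor
    for text, label in entities:
        key = text.lower()
        if key in seen:
            continue  # skip duplicates and the answer
        seen.add(key)
        if label == answer_label:
            same_label.append(text)
        else:
            other_label.append(text)
    return (same_label + other_label)[:k]
```

When the passage holds at least three other entities of the answer's type, the fallback list is never consulted.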
| ### `src/mcq_builder.py` | |
| The single entry point that the UI calls. Orchestrates the entire pipeline. | |
| **MCQ dataclass:** | |
| ```python | |
| from dataclasses import dataclass | |
| | |
| @dataclass | |
| class MCQ: | |
|     question: str | |
|     options: list        # 4 shuffled options | |
|     correct_index: int   # index of correct answer (0-3) | |
|     correct_answer: str | |
|     explanation: str     # original sentence | |
| ``` | |
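Assembling one MCQ from an answer and its distractors could then look like this. It is a sketch only: `build_mcq` and its optional `rng` parameter are hypothetical and exist here to make the shuffle reproducible in tests.

```python
import random
from dataclasses import dataclass


@dataclass
class MCQ:
    question: str
    options: list        # 4 shuffled options
    correct_index: int   # index of correct answer (0-3)
    correct_answer: str
    explanation: str     # original sentence


def build_mcq(question, answer, distractors, explanation, rng=None):
    """Shuffle the answer in with its distractors and record where it landed."""
    rng = rng or random.Random()
    options = [answer, *distractors]
    rng.shuffle(options)
    return MCQ(
        question=question,
        options=options,
        correct_index=options.index(answer),
        correct_answer=answer,
        explanation=explanation,
    )
```

Looking the index up after the shuffle keeps `correct_index` and `options` consistent regardless of the random order.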
| **Quality gate `is_valid_mcq()`:** | |
| - No two options can be too similar (catches "WWE" vs "World Wrestling Entertainment") | |
| - Answer must appear exactly once in options | |
| - Maximum 1 generic placeholder option allowed | |
| - Answer must not appear in question text | |
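A rough version of these four checks is shown below. It assumes character-level similarity from the standard library's `difflib`, a threshold of 0.8, and a made-up set of placeholder options; `passes_quality_gate` is a hypothetical name, and the real `is_valid_mcq()` may use different rules.

```python
from difflib import SequenceMatcher

# Hypothetical placeholder options; the project's generic options are not listed in this README.
GENERIC_OPTIONS = {"none of the above", "all of the above"}


def passes_quality_gate(question, options, answer, max_similarity=0.8):
    """Return True if the option set passes all four quality checks."""
    lowered = [opt.strip().lower() for opt in options]

    # The answer must appear exactly once among the options.
    if lowered.count(answer.strip().lower()) != 1:
        return False
    # The answer must not leak into the question text.
    if answer.lower() in question.lower():
        return False
    # At most one generic placeholder option is allowed.
    if sum(opt in GENERIC_OPTIONS for opt in lowered) > 1:
        return False
    # No two options may be near-duplicates of each other.
    for i in range(len(lowered)):
        for j in range(i + 1, len(lowered)):
            ratio = SequenceMatcher(None, lowered[i], lowered[j]).ratio()
            if ratio > max_similarity:
                return False
    return True
```

Character-level similarity is crude: it does not recognise an acronym and its expansion as the same entity, and it also rejects names that differ by a single character, so the threshold is something to tune per dataset.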
| ### `src/evaluator.py` | |
| Checks answers and computes scores. | |
| **Returns:** | |
| ```python | |
| { | |
|     "score": 7, | |
|     "total": 10, | |
|     "percentage": 70.0, | |
|     "feedback": "Good effort! Review the explanations...", | |
|     "results": [...],  # one entry per question | |
| } | |
| ``` | |
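A scoring function that produces a dictionary of this shape might be sketched as follows. The name `evaluate_quiz`, the per-question fields, and the feedback thresholds and messages are assumptions for illustration.

```python
def evaluate_quiz(correct_indices, user_choices):
    """Compare chosen option indices with the correct ones and summarize."""
    results = [
        {
            "question": number,
            "chosen": chosen,          # None can stand for "not answered"
            "correct": correct,
            "is_correct": chosen == correct,
        }
        for number, (correct, chosen)
        in enumerate(zip(correct_indices, user_choices), start=1)
    ]
    score = sum(r["is_correct"] for r in results)
    total = len(results)
    percentage = round(100.0 * score / total, 1) if total else 0.0

    # Hypothetical thresholds and messages.
    if percentage >= 80:
        feedback = "Excellent work!"
    elif percentage >= 50:
        feedback = "Good effort! Review the explanations for the questions you missed."
    else:
        feedback = "Keep practicing. Re-read the passage and try again."

    return {
        "score": score,
        "total": total,
        "percentage": percentage,
        "feedback": feedback,
        "results": results,
    }
```

Guarding the division keeps an empty quiz from raising `ZeroDivisionError`.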
| ### `app/main.py` | |
| Streamlit app with 3 screens managed via `st.session_state`: | |
| - **Screen 1 (input):** Text area + question count slider + Generate button | |
| - **Screen 2 (quiz):** One question at a time, radio buttons, Previous/Next/Submit | |
| - **Screen 3 (results):** Score banner + per-question feedback with explanations | |
| ### `app/components.py` | |
| Reusable display functions: | |
| - `render_question_card()`: A/B/C/D labelled radio buttons | |
| - `render_result_card()`: green (correct) / red (wrong) with explanation | |
| - `render_score_summary()`: score banner + metric cards | |
| --- | |
| ## Tech Stack | |
| | Library | Version | Purpose | | |
| |---|---|---| | |
| | `spaCy` | 3.7.4 | Tokenization, NER, POS tagging, sentence splitting | | |
| | `transformers` | 4.38.2 | T5 model for question generation | | |
| | `torch` | 2.2.1 | PyTorch backend for transformers | | |
| | `nltk` | 3.8.1 | WordNet access for distractor generation | | |
| | `scikit-learn` | 1.4.1.post1 | TF-IDF vectorizer | | |
| | `sentencepiece` | latest | T5's subword tokenizer | | |
| | `streamlit` | 1.33.0 | Web UI framework | | |
| | `gensim` | 4.3.2 | Word2Vec / GloVe loading (optional) | | |
| | `numpy` | 1.26.4 | TF-IDF matrix operations | | |
| **Pre-trained model used:** | |
| - `valhalla/t5-small-qg-hl`: T5-small fine-tuned on SQuAD 1.0 for answer-aware question generation using highlight format. Hosted on HuggingFace Hub, downloaded automatically on first run (~240MB). | |
| --- | |
| ## Setup & Installation | |
| ### Prerequisites | |
| - Python 3.11+ | |
| - pip | |
| - Internet connection (first run downloads the T5 model) | |
| ### Step 1: Clone the repository | |
| ```bash | |
| git clone https://github.com/tanmmayyy/mcq-generator.git | |
| cd mcq-generator | |
| ``` | |
| ### Step 2: Create a virtual environment | |
| ```bash | |
| python -m venv myenv | |
| # Windows | |
| myenv\Scripts\activate | |
| # Mac/Linux | |
| source myenv/bin/activate | |
| ``` | |
| ### Step 3: Install dependencies | |
| ```bash | |
| pip install -r requirements.txt | |
| pip install sentencepiece # required for T5 tokenizer | |
| ``` | |
| ### Step 4: Download spaCy language model | |
| ```bash | |
| python -m spacy download en_core_web_sm | |
| # If the default command fails: | |
| pip install https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.7.1/en_core_web_sm-3.7.1-py3-none-any.whl | |
| ``` | |
| ### Step 5: Verify installation | |
| ```bash | |
| python -c "import spacy; nlp = spacy.load('en_core_web_sm'); print('spaCy OK')" | |
| python -c "from transformers import pipeline; print('Transformers OK')" | |
| ``` | |
| --- | |
| ## Running the App | |
| ```bash | |
| streamlit run app/main.py | |
| ``` | |
| The app opens at `http://localhost:8501`. On first launch, the T5 model downloads (~240MB) and loads into memory, which takes 1–2 minutes. Subsequent launches are fast. | |
| --- | |
| ## Testing Each Module | |
| Run these in order to verify each step of the pipeline works independently: | |
| ```bash | |
| # Step 1: Test preprocessing (NER, TF-IDF, sentence ranking) | |
| python src/preprocessor.py | |
| # Step 2: Test question generation (T5 model) | |
| python src/question_generator.py | |
| # Step 3: Test distractor generation (WordNet + NER) | |
| python src/distractor_generator.py | |
| # Step 4: Test full pipeline end-to-end | |
| python src/mcq_builder.py | |
| # Step 5: Test scoring | |
| python src/evaluator.py | |
| ``` | |
| --- | |
| ## Sample Output | |
| **Input passage (ISRO):** | |
| ``` | |
| The Indian Space Research Organisation (ISRO) was founded in 1969 by Vikram Sarabhai. | |
| ISRO developed India's first satellite, Aryabhata, which was launched in 1975. | |
| The Chandrayaan-1 mission in 2008 discovered water molecules on the Moon. | |
| In 2023, Chandrayaan-3 successfully landed near the lunar south pole. | |
| The Mars Orbiter Mission, also called Mangalyaan, was launched in 2013. | |
| ``` | |
| **Generated questions:** | |
| ``` | |
| Q1: Who founded ISRO? | |
| A. Jawaharlal Nehru | |
| B. APJ Abdul Kalam | |
| C. Vikram Sarabhai ✓ | |
| D. Homi Bhabha | |
| Q2: What was India's first satellite called? | |
| A. Chandrayaan | |
| B. Mangalyaan | |
| C. Rohini | |
| D. Aryabhata ✓ | |
| Q3: When did the Chandrayaan-1 mission take place? | |
| A. 1975 | |
| B. 2013 | |
| C. 2023 | |
| D. 2008 ✓ | |
| Q4: What mission made India the first Asian country to reach Mars orbit? | |
| A. Chandrayaan-3 | |
| B. Aryabhata | |
| C. Mangalyaan ✓ | |
| D. Chandrayaan-1 | |
| ``` | |
| --- | |
| ## What Makes a Good Passage | |
| The system performs best on **factual passages** that contain: | |
| | Works well | Works poorly | | |
| |---|---| | |
| | People names (PERSON entities) | Opinion / descriptive text | | |
| | Specific dates (DATE entities) | Passages with repeated entities | | |
| | Organisation names (ORG entities) | Very short passages (< 5 sentences) | | |
| | Place names (GPE entities) | Abstract/philosophical text | | |
| | One clear fact per sentence | Sentences with multiple facts | | |
| **Best passage types:** History, science, geography, biographies, Wikipedia-style articles | |
| **Avoid:** Opinion pieces, marketing content, descriptive narratives without specific facts | |
| --- | |
| ## Known Limitations | |
| 1. **Passage type dependency:** Works best on factual text. Descriptive or opinion text produces poor questions because it contains few named entities to use as answers. | |
| 2. **T5-small quality ceiling:** The model is based on `t5-small`, which has about 60M parameters. Larger models like `t5-base` or `t5-large` would likely produce better questions but require more memory and time. | |
| 3. **Distractor diversity:** When a passage has few named entities, distractors may fall back to generic options. Fine-tuning a separate T5 model on the RACE dataset for distractor generation could reduce this. | |
| 4. **English only:** The current pipeline only supports English text. Extending to Hindi or other Indic languages would require multilingual spaCy models and a multilingual QG model. | |
| 5. **No semantic deduplication:** Two questions from the same passage can sometimes be semantically similar even if worded differently. | |
| --- | |
| ## Future Work | |
| - [ ] Fine-tune a T5 distractor generation model on the RACE dataset (100k exam questions) | |
| - [ ] Add support for Hindi using IndicNLP + multilingual BERT | |
| - [ ] Add PDF upload support so users can quiz themselves on any document | |
| - [ ] BLEU/METEOR/ROUGE automated evaluation of generated questions | |
| - [ ] Difficulty scoring per question based on distractor plausibility | |
| - [ ] Export quiz as PDF for offline use | |
| --- | |
| ## Related Research | |
| Papers that use similar approaches, cited for comparison: | |
| 1. **Automatic Generation of Multiple-Choice Questions (2023)** | |
| Zhang et al.: T5 with pre/postprocessing pipelines for MCQ generation | |
| https://arxiv.org/abs/2303.14576 | |
| 2. **Deep Learning and Linguistic Feature Based Automatic MCQ Generation (Springer, ICDCIT 2022)** | |
| Agarwal et al.: DL + linguistic features for MCQ generation (same 3-step pipeline) | |
| https://link.springer.com/chapter/10.1007/978-3-030-94876-4_18 | |
| 3. **End-to-End MCQ Generation Using T5 (ScienceDirect 2022)** | |
| Rodriguez-Torrealba et al.: Full T5-based pipeline with Wikipedia passages | |
| https://www.sciencedirect.com/science/article/pii/S0957417422014014 | |
| 4. **Leaf: MCQ Generation System (ECIR 2022)** | |
| Vachev et al.: Two fine-tuned T5 models: one for QG, one for DG on RACE | |
| https://github.com/KristiyanVachev/Leaf-Question-Generation | |
| 5. **Automatic Distractor Generation: Systematic Review (PMC 2024)** | |
| Comprehensive review of distractor generation methods including WordNet and T5 | |
| https://pmc.ncbi.nlm.nih.gov/articles/PMC11623049/ | |
| 6. **Automatic Question Generation: A Review (Springer/PMC 2023)** | |
| Mulla & Gharpure: Survey of methodologies, datasets, and evaluation metrics | |
| https://pmc.ncbi.nlm.nih.gov/articles/PMC9886210/ | |
| **What differentiates this project from the above:** | |
| - End-to-end pipeline with interactive quiz UI (most papers only generate questions) | |
| - NER-type-matching distractor strategy (distractors share the answer's entity type whenever the passage contains enough same-type entities) | |
| - Multi-layer quality filtering at both question and MCQ level | |
| - Answer circularity detection (rejects questions where answer appears in the question) | |
| --- | |
| ## Course Outcomes Covered | |
| | CO | Description | How this project covers it | | |
| |---|---|---| | |
| | CO1 | Articulate NLP and word representation | TF-IDF, NER, WordNet, word embeddings all implemented and explained | | |
| | CO2 | Build deep learning models for NLP problems | T5 transformer for QG (seq2seq), beam search decoding, transfer learning | | |
| | CO3 | Implement ML/DL solutions in real context | End-to-end deployable system with Streamlit UI and interactive demo | | |
| --- | |
| ## Author | |
| **Tanmay Jain** | |
| Bennett University | |
| --- | |
| *Built with spaCy, HuggingFace Transformers, NLTK, scikit-learn, and Streamlit.* |