810proj / README.md
jscmp4's picture
Update README.md
4fd2fc8 verified
metadata
title: Journal Authority Auditor
emoji: 🛡️
colorFrom: blue
colorTo: indigo
sdk: docker
app_port: 7860

Journal Authority Auditor Agent 🛡️

🔗 Live App: https://huggingface.co/spaces/jscmp4/810proj

1. What the code does

This project implements an autonomous AI Agent designed to audit the academic authority of journals and research papers. It serves as a "Glass Box" tool for researchers to verify credibility instantly.

Key capabilities include:

  • Hybrid RAG Architecture: Combines strict database lookups (MongoDB with 31,000+ Scimago records) with the semantic reasoning of Large Language Models (GPT-4o).
  • Intelligent Routing: Automatically determines whether to search by DOI (using OpenAlex API) or by Journal Name.
  • Fail-Safe Reasoning: If a journal is not found in the verified database, the agent falls back to its internal parametric knowledge to assess the publisher's reputation (e.g., IEEE, ACM) and provide a reasoned risk assessment.
  • Real-Time "Thinking" Logs: A dual-pane interface displays the agent's Chain-of-Thought (CoT), showing exactly which tools are being called and what data is retrieved, ensuring transparency.

2. Structure of the code

The project follows a containerized micro-framework structure powered by Flask and Docker.

File Breakdown:

  • app.py: The core application logic containing:
    • Frontend: A responsive HTML/JS/CSS interface rendered via Flask templates. It handles the dual-pane layout (Chat UI + Terminal Log) and Markdown rendering.
    • Backend API (/chat): Handles POST requests and orchestrates the agent loop.
    • Agent Logic (run_agent_with_logs): Implements a while loop that allows the LLM to autonomously call tools multiple times (Reasoning -> Acting -> Observation) before generating a final answer.
    • Tools:
      • fetch_metadata: Connects to OpenAlex API to resolve DOIs and identify publishers.
      • check_ranking: Connects to MongoDB Atlas to retrieve verified metrics (SJR Quartile, H-Index, Citation rates).
  • GenAI.ipynb: [Database Maintenance] A Jupyter Notebook used for backend data engineering. It handles:
    • Fetching the latest SJR rankings CSV.
    • Cleaning data (handling Euro-style formats).
    • Upserting cleaned records into the MongoDB cloud database.
  • Dockerfile: Defines the Python 3.9 environment, installs dependencies, creates a non-root user for security, and exposes port 7860.
  • requirements.txt: Lists dependencies (flask, openai, pymongo, requests, pyngrok).

3. How to prepare to run

The application is containerized and requires specific API keys to function.

Environment Variables (Secrets)

To run this code, the following environment variables must be set (in Hugging Face Settings or a local .env file):

  • OPENAI_API_KEY: Required for the Agent's reasoning capabilities (GPT-4o).
  • MONGO_USER & MONGO_PASS: Credentials for the MongoDB Atlas Cloud Database.
  • MONGO_CLUSTER: The address of the MongoDB cluster.

Dependencies

No local preparation is needed if accessing via the Hugging Face Web Interface. For local development, Python 3.9+ is required.

4. How to run

Method A: Online (Recommended for Grading)

Simply click the "App" tab at the top of this Hugging Face Space or visit: https://huggingface.co/spaces/jscmp4/810proj

The application is pre-deployed and running 24/7.

Method B: Local Execution (Docker)

  1. Clone the repository:
    git clone [https://huggingface.co/spaces/jscmp4/810proj](https://huggingface.co/spaces/jscmp4/810proj)
    cd 810proj
    
  2. Build the Docker image:
    docker build -t journal-auditor .
    
  3. Run the container (Injecting your API keys):
    docker run -p 7860:7860 -e OPENAI_API_KEY="sk-..." -e MONGO_PASS="..." journal-auditor
    
  4. Access: Open http://localhost:7860 in your browser.

Project submitted for CS810.