Update README.md
README.md
CHANGED
|
@@ -1,11 +1,78 @@
|
|
| 1 |
---
|
| 2 |
-
title:
|
| 3 |
-
emoji:
|
| 4 |
-
colorFrom:
|
| 5 |
-
colorTo:
|
| 6 |
sdk: docker
|
| 7 |
-
|
| 8 |
-
short_description: this is for 810proj LLM agent
|
| 9 |
---
|
| 10 |
|
| 11 |
-
|
|
| 1 |
---
|
| 2 |
+
title: Journal Authority Auditor
|
| 3 |
+
emoji: 🛡️
|
| 4 |
+
colorFrom: blue
|
| 5 |
+
colorTo: indigo
|
| 6 |
sdk: docker
|
| 7 |
+
app_port: 7860
|
| 8 |
---
|
| 9 |
|
| 10 |
+
# Journal Authority Auditor Agent 🛡️
|
| 11 |
+
|
| 12 |
+
**🔗 Live App:** [https://huggingface.co/spaces/jscmp4/810proj](https://huggingface.co/spaces/jscmp4/810proj)
|
| 13 |
+
|
| 14 |
+
## 1. What the code does
|
| 15 |
+
This project implements an autonomous **AI agent** that audits the academic authority of journals and research papers. It serves as a "glass box" tool: researchers can check a source's credibility and see the evidence behind the verdict.
|
| 16 |
+
|
| 17 |
+
Key capabilities include:
|
| 18 |
+
* **Hybrid RAG Architecture:** Combines strict database lookups (MongoDB with 31,000+ SCImago records) with the semantic reasoning of a large language model (GPT-4o).
|
| 19 |
+
* **Intelligent Routing:** Automatically determines whether to search by **DOI** (using the OpenAlex API) or by **Journal Name**.
|
| 20 |
+
* **Fail-Safe Reasoning:** If a journal is not found in the verified database, the agent falls back to its internal parametric knowledge to assess the publisher's reputation (e.g., IEEE, ACM) and provide a reasoned risk assessment.
|
| 21 |
+
* **Real-Time "Thinking" Logs:** A dual-pane interface displays the agent's **Chain-of-Thought (CoT)**, showing exactly which tools are being called and what data is retrieved, ensuring transparency.
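The routing decision described above can be illustrated with a minimal sketch. It assumes that a DOI is recognizable by its `10.<registrant>/<suffix>` shape; `route_query` and `DOI_PATTERN` are hypothetical names, and the deployed agent may instead leave this choice to the LLM's tool selection.

```python
import re

# Assumed pattern: "10.", a 4-9 digit registrant code, a slash, then a suffix.
DOI_PATTERN = re.compile(r"10\.\d{4,9}/\S+")


def route_query(user_input: str) -> dict:
    """Classify the input as a DOI or a journal name (hypothetical helper)."""
    text = user_input.strip()
    match = DOI_PATTERN.search(text)
    if match:
        # Drop trailing punctuation that often follows a pasted DOI.
        return {"route": "doi", "value": match.group(0).rstrip(".,;)")}
    # Anything without a DOI-shaped token is treated as a journal name.
    return {"route": "journal_name", "value": text}
```

Because the pattern is searched rather than matched from the start, a full `https://doi.org/...` link is routed the same way as a bare DOI.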
|
| 22 |
+
|
| 23 |
+
## 2. Structure of the code
|
| 24 |
+
The project follows a containerized micro-framework structure powered by **Flask** and **Docker**.
|
| 25 |
+
|
| 26 |
+
### File Breakdown:
|
| 27 |
+
* **`app.py`**: The core application logic containing:
|
| 28 |
+
    * **Frontend**: A responsive HTML/JS/CSS interface rendered via Flask templates. It handles the dual-pane layout (Chat UI + Terminal Log) and Markdown rendering.
|
| 29 |
+
    * **Backend API (`/chat`)**: Handles POST requests and orchestrates the agent loop.
|
| 30 |
+
    * **Agent Logic (`run_agent_with_logs`)**: Implements a `while` loop that allows the LLM to autonomously call tools multiple times (Reasoning -> Acting -> Observation) before generating a final answer.
|
| 31 |
+
    * **Tools**:
|
| 32 |
+
        * `fetch_metadata`: Queries the **OpenAlex API** to resolve DOIs and identify publishers.
|
| 33 |
+
        * `check_ranking`: Queries **MongoDB Atlas** to retrieve verified metrics (SJR quartile, H-index, citation rates).
|
| 34 |
+
* **`GenAI.ipynb`**: **[Database Maintenance]** A Jupyter Notebook used for backend data engineering. It handles:
|
| 35 |
+
    * Fetching the latest SJR rankings CSV.
|
| 36 |
+
    * Cleaning data (handling Euro-style number formats, e.g., decimal commas).
|
| 37 |
+
    * Upserting cleaned records into the MongoDB cloud database.
|
| 38 |
+
* **`Dockerfile`**: Defines the Python 3.9 environment, installs dependencies, creates a non-root user for security, and exposes port 7860.
|
| 39 |
+
* **`requirements.txt`**: Lists dependencies (`flask`, `openai`, `pymongo`, `requests`, `pyngrok`).
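The Reasoning -> Acting -> Observation loop described for `app.py` can be sketched as follows. This is an illustration only, not the project's code: the two tools are stubs that return fixed data, the `decide` callback stands in for the GPT-4o call, and `run_agent_sketch`, `scripted_decide`, and the action dictionary shape are all assumptions.

```python
def fetch_metadata(doi: str) -> dict:
    # Stub: the real tool would query the OpenAlex API.
    return {"doi": doi, "journal": "Example Journal"}


def check_ranking(journal: str) -> dict:
    # Stub: the real tool would query MongoDB Atlas.
    return {"journal": journal, "quartile": "Q1"}


TOOLS = {"fetch_metadata": fetch_metadata, "check_ranking": check_ranking}


def run_agent_sketch(question, decide, max_steps=5):
    """Loop until `decide` returns a final answer or the step limit is hit."""
    logs, observations = [], []
    steps = 0
    while steps < max_steps:
        action = decide(question, observations)        # Reasoning
        if action["type"] == "final":
            logs.append("final answer")
            return action["content"], logs
        result = TOOLS[action["tool"]](action["argument"])  # Acting
        observations.append(result)                    # Observation
        logs.append(f"called {action['tool']}({action['argument']!r})")
        steps += 1
    return "Step limit reached without a final answer.", logs


def scripted_decide(question, observations):
    """Stand-in for the LLM: resolve the DOI, look up the ranking, then answer."""
    if len(observations) == 0:
        return {"type": "tool", "tool": "fetch_metadata", "argument": question}
    if len(observations) == 1:
        journal = observations[0]["journal"]
        return {"type": "tool", "tool": "check_ranking", "argument": journal}
    ranking = observations[1]
    return {"type": "final",
            "content": f"{ranking['journal']} is ranked {ranking['quartile']}."}


answer, logs = run_agent_sketch("10.1000/xyz", scripted_decide)
```

The `logs` list plays the role of the terminal pane: each tool call is recorded before the final answer is returned, and `max_steps` guards against a loop that never terminates.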
|
| 40 |
+
|
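The data-cleaning step in `GenAI.ipynb` is not shown in this README. As a rough sketch, Euro-style numbers could be normalized as below, assuming `.` is a thousands separator and `,` is the decimal mark; `parse_euro_number` is a hypothetical helper, not a function from the notebook.

```python
def parse_euro_number(raw: str) -> float:
    """Convert a string such as '1.234,56' to a float (assumed input format)."""
    # Remove thousands separators first, then turn the decimal comma into a dot.
    cleaned = raw.strip().replace(".", "").replace(",", ".")
    return float(cleaned)
```

A value that already uses a decimal point would be misread under this assumption, so the conversion should only be applied to columns known to use the Euro-style format.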
| 41 |
+
## 3. How to prepare to run
|
| 42 |
+
The application is containerized and requires an OpenAI API key and MongoDB Atlas credentials to function.
|
| 43 |
+
|
| 44 |
+
### Environment Variables (Secrets)
|
| 45 |
+
To run this code, the following environment variables must be set (in Hugging Face Settings or a local `.env` file):
|
| 46 |
+
* `OPENAI_API_KEY`: Required for the Agent's reasoning capabilities (GPT-4o).
|
| 47 |
+
* `MONGO_USER` & `MONGO_PASS`: Credentials for the MongoDB Atlas Cloud Database.
|
| 48 |
+
* `MONGO_CLUSTER`: The address of the MongoDB cluster.
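One way these variables might be read at startup is sketched below. It assumes that `MONGO_CLUSTER` holds an Atlas host name and that the connection uses a `mongodb+srv://` URI; `load_settings` and `REQUIRED_VARS` are hypothetical names, and the actual `app.py` may assemble its connection string differently.

```python
import os
from urllib.parse import quote_plus

REQUIRED_VARS = ("OPENAI_API_KEY", "MONGO_USER", "MONGO_PASS", "MONGO_CLUSTER")


def load_settings(env=os.environ) -> dict:
    """Validate the required variables and build a MongoDB URI (assumed format)."""
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Percent-encode credentials so characters like '@' or '/' do not break the URI.
    user = quote_plus(env["MONGO_USER"])
    password = quote_plus(env["MONGO_PASS"])
    # Assumption: MONGO_CLUSTER is a host name such as "cluster0.example.mongodb.net".
    mongo_uri = f"mongodb+srv://{user}:{password}@{env['MONGO_CLUSTER']}/"
    return {"openai_api_key": env["OPENAI_API_KEY"], "mongo_uri": mongo_uri}
```

Failing fast with a list of missing names makes a misconfigured Space or container easier to diagnose than a connection error at the first query.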
|
| 49 |
+
|
| 50 |
+
### Dependencies
|
| 51 |
+
No local preparation is needed if accessing via the Hugging Face Web Interface. For local development, Python 3.9+ is required.
|
| 52 |
+
|
| 53 |
+
## 4. How to run
|
| 54 |
+
|
| 55 |
+
### Method A: Online (Recommended for Grading)
|
| 56 |
+
Simply click the **"App"** tab at the top of this Hugging Face Space or visit:
|
| 57 |
+
[https://huggingface.co/spaces/jscmp4/810proj](https://huggingface.co/spaces/jscmp4/810proj)
|
| 58 |
+
|
| 59 |
+
The application is pre-deployed and running 24/7.
|
| 60 |
+
|
| 61 |
+
### Method B: Local Execution (Docker)
|
| 62 |
+
1. **Clone the repository:**
|
| 63 |
+
```bash
|
| 64 |
+
git clone https://huggingface.co/spaces/jscmp4/810proj
|
| 65 |
+
cd 810proj
|
| 66 |
+
```
|
| 67 |
+
2. **Build the Docker image:**
|
| 68 |
+
```bash
|
| 69 |
+
docker build -t journal-auditor .
|
| 70 |
+
```
|
| 71 |
+
3. **Run the container** (injecting your API key and database credentials):
|
| 72 |
+
```bash
|
| 73 |
+
docker run -p 7860:7860 -e OPENAI_API_KEY="sk-..." -e MONGO_USER="..." -e MONGO_PASS="..." -e MONGO_CLUSTER="..." journal-auditor
|
| 74 |
+
```
|
| 75 |
+
4. **Access:** Open `http://localhost:7860` in your browser.
|
| 76 |
+
|
| 77 |
+
---
|
| 78 |
+
*Project submitted for CS810.*
|