# Abalone RAG Chatbot
This project implements a Retrieval-Augmented Generation (RAG) chatbot about Abalone using LangChain and OpenAI, with a Streamlit frontend. It is designed to be deployed on Hugging Face Spaces.
## Contents

- `app.py` - Streamlit app entrypoint
- `src/ingest.py` - Ingest files from `data/` into a persisted Chroma vectorstore
- `src/vectorstore.py` - Helpers to build/load the Chroma vectorstore and return a retriever
- `src/qa_chain.py` - Build the conversational retrieval QA chain
- `data/` - Put Abalone source files here (CSV/MD/TXT/PDF)
- `vectorstore/` - Persisted vectorstore directory (created by ingestion)
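Ingestion typically splits the source files into overlapping text chunks before embedding them into the vectorstore. A minimal sketch of that chunking idea (the `chunk_text` helper and the size/overlap values are illustrative assumptions, not the project's actual code, which would normally use a LangChain text splitter):

```python
def chunk_text(text: str, chunk_size: int = 500, overlap: int = 50) -> list[str]:
    """Split text into fixed-size chunks that overlap, so that sentences
    straddling a chunk boundary still appear intact in at least one chunk."""
    if chunk_size <= overlap:
        raise ValueError("chunk_size must be larger than overlap")
    chunks = []
    start = 0
    while start < len(text):
        chunks.append(text[start:start + chunk_size])
        start += chunk_size - overlap  # advance by the non-overlapping stride
    return chunks

# Demo: a repeated sentence stands in for a real Abalone source file.
sample = "Abalone are marine snails whose shell rings indicate age. " * 20
pieces = chunk_text(sample, chunk_size=200, overlap=40)
print(len(pieces), "chunks")
```

Overlap trades a little index size for retrieval quality: a fact that falls on a chunk boundary is still retrievable from the neighboring chunk.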
## Quickstart (local)

1. Create a virtual environment and install the dependencies:

```bash
python -m venv .venv
source .venv/bin/activate
pip install -r requirements.txt
```
2. Set your OpenAI API key:

```bash
export OPENAI_API_KEY="sk-..."
```
3. Add Abalone files to `data/` (for example `abalone.csv`).

4. Build the vectorstore:

```bash
python -m src.ingest --data-dir ./data --persist-dir ./vectorstore
```
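The ingest command above passes `--data-dir` and `--persist-dir` flags to `src/ingest.py`. A hedged sketch of how such a CLI could parse them with `argparse` (the internals of the actual script are not shown here; the defaults are assumptions):

```python
import argparse

def parse_args(argv=None):
    """Parse the flags accepted by `python -m src.ingest`."""
    parser = argparse.ArgumentParser(
        description="Ingest files into a persisted Chroma vectorstore"
    )
    parser.add_argument("--data-dir", default="./data",
                        help="Directory containing the source files")
    parser.add_argument("--persist-dir", default="./vectorstore",
                        help="Directory where the vectorstore is persisted")
    return parser.parse_args(argv)

# Demo: mirror the Quickstart invocation.
args = parse_args(["--data-dir", "./data", "--persist-dir", "./vectorstore"])
print(args.data_dir, args.persist_dir)
```

Defaulting both flags to the repository-relative paths keeps `python -m src.ingest` runnable with no arguments.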
5. Run the Streamlit app:

```bash
streamlit run app.py
```
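The app needs `OPENAI_API_KEY` in its environment at startup (exported locally, or injected by Spaces secrets when deployed). A minimal fail-fast check along these lines avoids a confusing error deep inside the chain (the `require_api_key` helper is illustrative, not the app's actual code):

```python
import os

def require_api_key(var: str = "OPENAI_API_KEY") -> str:
    """Return the API key from the environment, or fail with a clear message."""
    key = os.environ.get(var, "").strip()
    if not key:
        raise RuntimeError(f"{var} is not set; export it before running the app")
    return key

# Demo only: seed a placeholder so the check passes without a real key.
os.environ.setdefault("OPENAI_API_KEY", "sk-placeholder-for-demo")
print(bool(require_api_key()))
```

Checking once at startup means a misconfigured deployment fails immediately with an actionable message instead of on the first question.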
## Deploying to Hugging Face Spaces

1. Add `OPENAI_API_KEY` in the Spaces secrets (Settings -> Secrets).
2. Push this repository to your HF Space; HF will install `requirements.txt` and run the Streamlit app.
3. On first run, click the "Ingest data" button or allow the app to rebuild the index.
## Security
- Do NOT commit your OpenAI API key. Use HF Spaces Secrets for deployment.
## License
- MIT