Abalone RAG Chatbot

This project implements a Retrieval-Augmented Generation (RAG) chatbot that answers questions about abalone, built with LangChain and OpenAI behind a Streamlit frontend. It is designed for deployment on Hugging Face Spaces.

Contents

  • app.py - Streamlit app entrypoint
  • src/ingest.py - Ingest files from data/ into a persisted Chroma vectorstore
  • src/vectorstore.py - Helpers to build/load the Chroma vectorstore and return a retriever
  • src/qa_chain.py - Build the conversational retrieval QA chain
  • data/ - Put Abalone source files here (CSV/MD/TXT/PDF)
  • vectorstore/ - Persisted vectorstore directory (created by ingestion)
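
To make the data/ convention concrete, here is a minimal sketch of how an ingestion step might collect candidate files before embedding them. The helper name `find_source_files` and the `SUPPORTED_SUFFIXES` set are hypothetical (they are not taken from src/ingest.py); the extension list simply mirrors the CSV/MD/TXT/PDF formats named above.

```python
from pathlib import Path

# Assumption: the supported formats are exactly the ones the README lists
# for data/ (CSV/MD/TXT/PDF). The real ingest script may differ.
SUPPORTED_SUFFIXES = {".csv", ".md", ".txt", ".pdf"}


def find_source_files(data_dir):
    """Return supported files under data_dir, sorted for a stable order.

    Hypothetical helper: returns an empty list when data_dir is missing.
    """
    root = Path(data_dir)
    if not root.is_dir():
        return []
    return sorted(
        path
        for path in root.rglob("*")
        # Compare suffixes case-insensitively so "notes.MD" is accepted too.
        if path.is_file() and path.suffix.lower() in SUPPORTED_SUFFIXES
    )
```

Calling `find_source_files("./data")` would return `Path` objects for files such as abalone.csv and skip anything with an unlisted extension.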

Quickstart (local)

  1. Create a venv and install dependencies:
     python -m venv .venv
     source .venv/bin/activate
     pip install -r requirements.txt
  2. Set your OpenAI API key:
     export OPENAI_API_KEY="sk-..."
  3. Add Abalone files into data/ (for example abalone.csv).
  4. Build the vectorstore:
     python -m src.ingest --data-dir ./data --persist-dir ./vectorstore
  5. Run the Streamlit app:
     streamlit run app.py
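
The ingestion command above takes two flags, `--data-dir` and `--persist-dir`. As an illustration only, a command-line interface with that shape could be declared with the standard library's `argparse` roughly as follows. The function names and the default values are assumptions, not a copy of src/ingest.py.

```python
import argparse


def build_parser():
    """Build a parser for the two flags shown in the Quickstart."""
    parser = argparse.ArgumentParser(
        prog="python -m src.ingest",
        description="Ingest files into a persisted vectorstore (sketch).",
    )
    # Assumed defaults: they match the paths used in the Quickstart command.
    parser.add_argument(
        "--data-dir",
        default="./data",
        help="Directory containing the source files to ingest.",
    )
    parser.add_argument(
        "--persist-dir",
        default="./vectorstore",
        help="Directory where the vectorstore is persisted.",
    )
    return parser


def parse_args(argv=None):
    """Parse argv (or sys.argv[1:] when argv is None)."""
    return build_parser().parse_args(argv)
```

`argparse` converts the dashes in flag names to underscores, so the parsed values are available as `args.data_dir` and `args.persist_dir`.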

Deploying to Hugging Face Spaces

  • Add OPENAI_API_KEY in the Spaces secrets (Settings -> Secrets).
  • Push this repository to your HF Space. Spaces installs the packages listed in requirements.txt and launches the Streamlit app.
  • On first run, click the "Ingest data" button or allow the app to rebuild the index.
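
A first-run rebuild implies some check for whether an index already exists. The sketch below shows one simple way to make that decision, assuming that a missing or empty persist directory means "no index yet". The helper `index_needs_build` is hypothetical, and the app's actual check may inspect the Chroma store more carefully.

```python
from pathlib import Path


def index_needs_build(persist_dir):
    """Return True when persist_dir is missing or contains no entries.

    Assumption: any content in the directory counts as an existing index.
    """
    root = Path(persist_dir)
    if not root.is_dir():
        return True
    # next() with a default avoids listing the whole directory.
    return next(root.iterdir(), None) is None
```

An app could call `index_needs_build("./vectorstore")` at startup and trigger ingestion only when it returns `True`.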

Security

  • Do NOT commit your OpenAI API key. Use HF Spaces Secrets for deployment.
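
One way to keep the key out of the repository is to read it from the environment at startup and fail with a clear message when it is absent. This works both locally (after `export OPENAI_API_KEY=...`) and on Spaces, where secrets are exposed to the app as environment variables. The helper names below are hypothetical and the sketch is not taken from app.py.

```python
import os


def get_openai_api_key(env=None):
    """Return the API key from the environment, or raise if it is unset.

    env defaults to os.environ; passing a dict makes the function testable.
    """
    env = os.environ if env is None else env
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it locally or add it as a "
            "secret in your Space settings."
        )
    return key


def mask_key(key, visible=4):
    """Hide all but the last `visible` characters, for safe logging."""
    if len(key) <= visible:
        return "*" * len(key)
    return "*" * (len(key) - visible) + key[-visible:]
```

Logging `mask_key(key)` instead of the key itself keeps the full secret out of console output and Space logs.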

License

  • MIT