.git
.github
venv
__pycache__
*.pyc
.DS_Store
uploads
scripts
README.md
LICENSE
data/training_data.json
data/sample_documents