# Deployment Instructions

## Deploying to Hugging Face Spaces

### Prerequisites

- A Hugging Face account (free)
- Git installed locally

### Steps

1. **Create a new Space on Hugging Face:**
   - Go to https://huggingface.co/spaces
   - Click "Create new Space"
   - Choose a name (e.g., "ai-text-assistant")
   - Select "Gradio" as the SDK
   - Choose visibility (Public or Private)
   - Click "Create Space"

2. **Clone your Space repository:**

   ```bash
   git clone https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME
   cd YOUR_SPACE_NAME
   ```

3. **Copy the application files:**

   Copy these files from this project into your Space repository:
   - `app.py`
   - `requirements.txt`
   - `README.md`
   - `.gitignore` (optional)

4. **Commit and push:**

   ```bash
   git add .
   git commit -m "Initial commit: AI Text Assistant"
   git push
   ```

5. **Wait for deployment:**
   - Hugging Face Spaces automatically detects the push
   - The build process installs dependencies and starts the app
   - The first deployment may take 5-10 minutes
   - You can watch the build logs in the Space's "Logs" tab

6. **Access your app:**
   - Once deployed, your app will be available at:
     `https://huggingface.co/spaces/YOUR_USERNAME/YOUR_SPACE_NAME`

### Local Testing

To test locally before deploying:

```bash
# Install dependencies
pip install -r requirements.txt

# Run the app
python app.py
```

The app will be available at `http://127.0.0.1:7860`.

### Configuration Options

#### Hardware

For better performance, you can upgrade your Space's hardware:

- Go to Space Settings → Hardware
- Options include CPU (free), GPU T4 (small fee), GPU A10G, etc.
- The app works on CPU but will be faster with a GPU

#### Environment Variables

You can set these in Space Settings → Variables:

- `TRANSFORMERS_CACHE`: Custom cache directory for models
- `HF_HOME`: Hugging Face home directory

### Troubleshooting

**Build fails with memory errors:**

- The models are relatively small, but if you still hit memory limits:
  - Upgrade to a larger hardware tier
  - Or consider using the Hugging Face Inference API instead

**App starts slowly:**

- The first run downloads models (~1 GB for Qwen, ~1.6 GB for BART)
- Subsequent runs use the cached models
- Model loading takes 30-60 seconds on CPU

**Token alternatives not showing:**

- Make sure you hover over the generated words
- The tooltip appears on hover after a slight delay
- Try a different browser if the issue persists

### Performance Notes

- **First load:** Slow due to model downloads
- **Model loading:** 30-60 seconds on CPU, 5-10 seconds on GPU
- **Generation speed:**
  - Qwen (0.5B): ~10-20 tokens/sec on CPU, ~100+ tokens/sec on GPU
  - BART-large: ~5-10 tokens/sec on CPU, ~50+ tokens/sec on GPU

### Support

For issues or questions:

- Check the Hugging Face Spaces documentation: https://huggingface.co/docs/hub/spaces
- Open an issue on the repository
- Contact: Your email/contact info
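### Appendix: Space Metadata

Spaces reads its deployment settings (SDK, entry point, visibility details) from YAML front matter at the top of the `README.md` copied in step 3. A minimal sketch is below; the field values are assumptions for this project, and `sdk_version` should match the Gradio version pinned in `requirements.txt`:

```yaml
---
title: AI Text Assistant
emoji: 📝
colorFrom: blue
colorTo: indigo
sdk: gradio
sdk_version: 4.44.0   # assumption: pin to the Gradio version in requirements.txt
app_file: app.py
pinned: false
---
```

If this block is missing or malformed, the Space may fail to detect the SDK and the build will not start.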
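The cache-related environment variables described above can be sanity-checked locally. The sketch below mirrors the precedence the Hugging Face libraries typically apply (`HF_HOME` overriding the default `~/.cache/huggingface`); the exact resolution rules live in `huggingface_hub`, so treat this as an illustration rather than the library's implementation:

```python
import os

def resolve_hf_cache() -> str:
    """Sketch of where downloaded models land on disk.

    Assumption: hub downloads go under "<HF_HOME>/hub" when HF_HOME is set,
    otherwise under "~/.cache/huggingface/hub".
    """
    hf_home = os.environ.get("HF_HOME")
    if hf_home:
        return os.path.join(hf_home, "hub")
    return os.path.join(os.path.expanduser("~"), ".cache", "huggingface", "hub")

print(resolve_hf_cache())
```

Running this inside the Space (e.g. from the Logs tab or a debug print in `app.py`) shows whether your `HF_HOME` setting is actually being picked up before the models download.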