PyPDF2
gradio
PyMuPDF
python-docx
scikit-learn
sentence-transformers
matplotlib
textstat
pdfplumber
openpyxl
pdf2image
pytesseract