pdfplumber
sentence_transformers
cnocr==2.2.3.1
langchain
unstructured==0.7.0
pinecone-client
openai
tiktoken
tabulate