  • Migrated the Docker image to the official llama.cpp CUDA image.
  • Rewrote app.py in an object-oriented style and redesigned the method signatures.
  • Added llama-quantize options: --token-embedding-type, --leave-output-tensor, --output-tensor-type.
  • Added customizable output options: repository name and file name.
  • Added support for uploading different quants to the same repository.
  • Updated the imatrix training file to calibration_data_v5_rc.txt.
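
For reviewers, here is a minimal sketch of how the new llama-quantize options could be assembled into a command line. It is not the code from this PR: the function name `build_quantize_command` and its parameters are hypothetical, and the sketch only assumes that llama-quantize takes its options before the positional input file, output file, and quantization type.

```python
from typing import List, Optional


def build_quantize_command(
    input_path: str,
    output_path: str,
    quant_type: str,
    imatrix_path: Optional[str] = None,
    leave_output_tensor: bool = False,
    output_tensor_type: Optional[str] = None,
    token_embedding_type: Optional[str] = None,
) -> List[str]:
    """Build an argument list for llama-quantize (hypothetical helper)."""
    cmd = ["llama-quantize"]

    # Options go first; each one is added only when the caller asks for it.
    if imatrix_path:
        cmd += ["--imatrix", imatrix_path]
    if leave_output_tensor:
        cmd.append("--leave-output-tensor")
    if output_tensor_type:
        cmd += ["--output-tensor-type", output_tensor_type]
    if token_embedding_type:
        cmd += ["--token-embedding-type", token_embedding_type]

    # Positional arguments: input model, output model, quantization type.
    cmd += [input_path, output_path, quant_type]
    return cmd
```

The returned list can be passed to `subprocess.run(cmd, check=True)` without `shell=True`, which avoids quoting problems when user-supplied repository or file names end up in the paths.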
olegshulyakov changed pull request status to open
Cannot merge
This branch has merge conflicts in the following files:
  • app.py