A quantization setup used for GLM-4.5-Air:
Deploy with TensorRT-LLM
Chat template
Files info
Base model