quantize.py --model /storage/yiliu7/deepseek-ai/DeepSeek-V2-Lite-Chat/ -t nvfp4 --use_autoround_format --output_dir ./qmodels

The table reporter provides a simple tabular representation of memory allocations in the target when the memory usage was at its peak.
You can find more information in the documentation.