memray flamegraph report (memory leaks)
Python Allocator: pymalloc

Report generated using --leaks with an arena allocator

This memory leaks report was generated with the pymalloc allocator active but without tracking of Python object deallocations. The results will be misleading: pymalloc retains memory in its pools even after the objects that requested it are deallocated, and Memray cannot distinguish memory set aside for reuse from leaked memory. For a more useful memory leaks report, pass the --trace-python-allocators flag when profiling your application. See the Memray documentation for more information.
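
As a minimal sketch of the re-run suggested by this warning, the snippet below assembles the two command lines: one that profiles the same script with `--trace-python-allocators`, and one that renders a leaks flame graph from the capture. The helper `build_memray_commands` and the capture file name `memray-capture.bin` are hypothetical names chosen for illustration. The script arguments are copied from the "Command line" entry of this report.

```python
import shlex


def build_memray_commands(script_args, capture_file="memray-capture.bin"):
    """Return (run_cmd, report_cmd) argv lists for a leak-profiling session.

    `capture_file` is a hypothetical output name, not taken from the report.
    """
    # Profile the script while also tracking pymalloc allocations/deallocations.
    run_cmd = [
        "memray", "run",
        "--trace-python-allocators",
        "-o", capture_file,
        *script_args,
    ]
    # Render a flame graph restricted to memory that was never freed.
    report_cmd = ["memray", "flamegraph", "--leaks", capture_file]
    return run_cmd, report_cmd


# Arguments copied from the "Command line" entry of this report.
script_args = shlex.split(
    "quantize.py --model /storage/yiliu7/deepseek-ai/DeepSeek-V2-Lite-Chat/ "
    "-t nvfp4 --use_autoround_format --output_dir ./qmodels"
)
run_cmd, report_cmd = build_memray_commands(script_args)
print(shlex.join(run_cmd))
print(shlex.join(report_cmd))
```

The printed lines can be pasted into a shell, or the lists can be passed to `subprocess.run`. Tracking the Python allocators adds overhead, so the profiled run will likely be slower than the one recorded here.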

Memray run stats
Command line: quantize.py --model /storage/yiliu7/deepseek-ai/DeepSeek-V2-Lite-Chat/ -t nvfp4 --use_autoround_format --output_dir ./qmodels
Start time: 2025-12-10 05:59:20.435000+00:00
End time: 2025-12-10 05:59:51.749000+00:00
Duration: 0:00:31.314000
Total number of allocations: 40271815
Total number of frames seen: 0
Peak memory usage: 16.8 GB
Python allocator: pymalloc
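
The figures above can be cross-checked with a few lines of arithmetic. The sketch below recomputes the duration from the start and end timestamps and derives an average allocation rate. The rate is a derived figure for orientation only, not something Memray reports, and it says nothing about how allocations were distributed over the run.

```python
from datetime import datetime

# Values copied from the run stats above.
start = datetime.fromisoformat("2025-12-10 05:59:20.435000+00:00")
end = datetime.fromisoformat("2025-12-10 05:59:51.749000+00:00")
total_allocations = 40_271_815

# Elapsed wall-clock time between the two timestamps, in seconds.
duration_s = (end - start).total_seconds()

# Average number of allocations per second over the whole run.
allocs_per_second = total_allocations / duration_s

print(f"duration: {duration_s:.3f} s")
print(f"average allocation rate: {allocs_per_second:,.0f} allocations/s")
```

The recomputed duration matches the reported 0:00:31.314000, and the average works out to roughly 1.3 million allocations per second.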

How to interpret flamegraph reports

The flame graph displays stack frames at allocation, for memory that was leaked during the tracking period (i.e. allocated and not deallocated).

Note that the Python allocator doesn't necessarily release memory to the system when Python objects are deallocated, so memory that has already been freed by your program can still appear as a "leak". To exclude these cases, run your application with the `PYTHONMALLOC=malloc` environment variable set.
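
One way to set that variable for a single profiling run, rather than for the whole shell session, is to pass a modified environment to the child process. The helper name `env_with_system_malloc` is hypothetical. `PYTHONMALLOC` itself is a standard CPython environment variable, and the value `malloc` makes the interpreter use the C library allocator for all Python memory domains.

```python
import os


def env_with_system_malloc(base_env=None):
    """Return a copy of the environment with pymalloc disabled.

    With PYTHONMALLOC=malloc, CPython allocates through the C library's
    malloc, so freed objects are not held back in pymalloc pools.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["PYTHONMALLOC"] = "malloc"
    return env


# Small demonstration with an explicit base environment.
env = env_with_system_malloc({"PATH": "/usr/bin"})
print(env["PYTHONMALLOC"])
```

The returned dictionary can be passed as `env=` to `subprocess.run` when launching the profiled command. The caller's own environment is left unchanged because the function works on a copy.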

The vertical ordering of the stack frames corresponds to the order of function calls, from parent to children. The horizontal ordering does not represent the passage of time in the application: child frames are simply laid out in arbitrary order.

On the flame graph, each bar represents a stack frame and shows the code that triggered the memory allocation. Hovering over a frame also shows the total memory allocated by that frame and its children, and the number of allocations made.

The Show/Hide Irrelevant Frames button reveals or hides frames whose allocations come from code that may not be relevant to the application. These include frames in the CPython eval loop as well as frames introduced by Memray during the analysis.

You can find more information in the documentation.

Resident set size over time