Timemory: Modular Performance Analysis for HPC
Performance Analysis and Optimization
Programming Models & Languages
Time: Wednesday, June 24th, 9:30am - 10:00am
Location: Analog 1, 2
Description: HPC has undergone a significant transition toward heterogeneous architectures.
This transition has introduced several issues in code migration to support
multiple frameworks for targeting the various architectures. In order to cope with these
challenges, projects such as Kokkos and LLVM create abstractions which map
the front-end API to the backend that supports the targeted architecture.
This paper presents a similar framework for performance measurement and analysis.
Several performance measurement and analysis tools in existence
provide their capabilities through various methods but the common theme
For this reason, valuable analysis methods such as the roofline model are
commonly required to be generated manually. The timemory framework eliminates
all restrictions on user-level extensions and provides a straightforward
and intuitive method for handling multiple components concurrently.
Timemory components are developed in C++, but the library includes multi-language
support for C, Fortran, and Python codes.
Numerous components are provided by the library itself -- including, but not
limited to, timers, memory usage, FLOP and instruction roofline models,
hardware counters, and external instrumentation marker forwarding -- but there
is no overhead associated with their exclusion from a tool-set specification.
Additionally, analysis of the intrinsic overhead demonstrates superior performance
in comparison with popular tools.
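
The idea of handling several components concurrently, with no cost for components left out of a tool-set specification, can be illustrated with a variadic template. The sketch below is an assumption-laden illustration in standard C++17, not timemory's actual API: the names `bundle`, `wall_timer`, `cpu_timer`, `call_counter`, and `profile_loop` are hypothetical and chosen only for this example.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <cstdio>
#include <ctime>
#include <tuple>

// Hypothetical component: wall-clock timer (not a timemory class).
struct wall_timer
{
    using steady = std::chrono::steady_clock;
    steady::time_point begin{};
    double             elapsed = 0.0;

    void start() { begin = steady::now(); }
    void stop()
    {
        elapsed += std::chrono::duration<double>(steady::now() - begin).count();
    }
    void report(const char* label) const
    {
        std::printf("%s wall: %.6f s\n", label, elapsed);
    }
};

// Hypothetical component: process CPU time via std::clock().
struct cpu_timer
{
    std::clock_t begin   = 0;
    double       elapsed = 0.0;

    void start() { begin = std::clock(); }
    void stop()
    {
        elapsed += static_cast<double>(std::clock() - begin) / CLOCKS_PER_SEC;
    }
    void report(const char* label) const
    {
        std::printf("%s cpu: %.6f s\n", label, elapsed);
    }
};

// Hypothetical component: counts how many times a region was entered.
struct call_counter
{
    long count = 0;

    void start() { ++count; }
    void stop() {}
    void report(const char* label) const
    {
        std::printf("%s calls: %ld\n", label, count);
    }
};

// A tool-set specification is the template parameter list. Components that
// are not listed are never instantiated, so they add no storage and no calls.
template <typename... Components>
class bundle
{
public:
    explicit bundle(const char* label) : m_label(label) {}

    // Fold expressions forward each call to every listed component in order.
    void start() { std::apply([](auto&... c) { (c.start(), ...); }, m_data); }
    void stop() { std::apply([](auto&... c) { (c.stop(), ...); }, m_data); }
    void report() const
    {
        std::apply([this](const auto&... c) { (c.report(m_label), ...); }, m_data);
    }

    // Access one component by type (each type may appear only once).
    template <typename T>
    const T& get() const { return std::get<T>(m_data); }

    static constexpr std::size_t size() { return sizeof...(Components); }

private:
    const char*               m_label;
    std::tuple<Components...> m_data{};
};

// Example use: this region measures wall time and call count only;
// cpu_timer is excluded and therefore costs nothing here.
inline long profile_loop(int iterations)
{
    bundle<wall_timer, call_counter> region("loop");
    volatile double sink = 0.0;
    for(int i = 0; i < iterations; ++i)
    {
        region.start();
        sink = sink + i * 0.5;
        region.stop();
    }
    region.report();
    assert(region.get<call_counter>().count == iterations);
    return region.get<call_counter>().count;
}
```

Calling `profile_loop(1000)` from `main` prints one line per listed component. Changing the measurement set means editing only the template arguments, for example `bundle<wall_timer, cpu_timer, call_counter>`; the measured code itself is untouched. Composing at compile time is one plausible way to get zero cost for excluded components, and it is used here purely as an illustration.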