Time Series Mining at Petascale Performance
Event Type
Hans Meuer Award Finalists
Research Paper
Big Data Analytics
Parallel Algorithms
Time
Monday, June 22nd, 1:50pm - 2:30pm
Location
Panorama 1
Description
Abstract. The mining of time series data plays an important role in modern information retrieval and analysis systems. In particular, the identification of similarities within and across time series has garnered significant attention and effort in recent years. For this task, the class of Matrix Profile algorithms is a promising approach. These algorithms create a generic structure, the Matrix Profile, that encodes correlations among records and dimensions, so that post-processing and analysis reduce to examining the resulting Matrix Profile structure. However, creating a Matrix Profile is expensive: evaluating the distance between all pairs of subsequences in a time series requires significant computational power, especially for very long and high-dimensional time series. Existing approaches are limited in their scalability, as they do not target High Performance Computing systems and, for most realistic problems, are suited only to datasets with small dimensionality.
In this paper, we introduce a novel MPI-based approach for the calculation of a Matrix Profile for multi-dimensional time series that pushes these limits. We evaluate the efficiency of our approach using an analytical performance model combined with experimental data. Finally, we demonstrate our solution on a 128-dimensional time series dataset of 1 million records, solving 274 trillion sorts at a sustained performance of 1.3 petaflop/s, and show that our approach can create the matching Matrix Profile on the xxx system.
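To make the cost described in the abstract concrete, the following is a minimal brute-force sketch of a Matrix Profile for a single-dimensional series. It is not the MPI-based method of the paper. The function names `znorm` and `matrix_profile`, the use of z-normalized Euclidean distance, and the exclusion zone of half the window length are all assumptions chosen for illustration.

```python
import math


def znorm(window):
    """Z-normalize a window; a constant window maps to all zeros."""
    n = len(window)
    mean = sum(window) / n
    std = math.sqrt(sum((x - mean) ** 2 for x in window) / n)
    if std == 0.0:
        return [0.0] * n
    return [(x - mean) / std for x in window]


def matrix_profile(series, m):
    """Brute-force Matrix Profile for a 1-D series (illustrative only).

    For every subsequence of length m, record the distance to its nearest
    non-trivial neighbour and that neighbour's start index. The pairwise
    loop costs O(n^2 * m), which is why long series need HPC resources.
    """
    count = len(series) - m + 1
    if count < 2:
        raise ValueError("series must contain at least two subsequences")
    windows = [znorm(series[i:i + m]) for i in range(count)]
    exclusion = max(1, m // 2)  # assumed zone that skips trivial self-matches
    profile = [math.inf] * count
    index = [-1] * count
    for i in range(count):
        for j in range(count):
            if abs(i - j) < exclusion:
                continue
            dist = math.sqrt(
                sum((a - b) ** 2 for a, b in zip(windows[i], windows[j]))
            )
            if dist < profile[i]:
                profile[i] = dist
                index[i] = j
    return profile, index
```

For a perfectly periodic input such as `[0, 1, 2, 1] * 3` with `m = 4`, every subsequence has an exact repeat one period away, so every profile entry is zero. Low profile values mark repeated patterns and high values mark unusual subsequences, which is what makes the structure convenient for later analysis.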
Research Paper Authors
PhD Candidate
Head of Scientific Staff
PhD Candidate