
Easy, Accurate, and Fast Machine Learning on Very Large Time Series Collections: Similarity Search and Subsequence Anomaly Detection


IPGP - Îlot Cuvier


Seismology Seminars

Room 310

Themis Palpanas

LIPADE, Université Paris Cité

Applications in many domains increasingly need techniques that can manage and analyze very large collections of sequences, or data series. Examples include monitoring applications, such as those in power utility companies, where machine learning techniques must be applied for knowledge extraction. These applications often involve hundreds of millions to billions of data series, which are frequently not analyzed in full detail because of their sheer size. However, no existing data management solution offers native support for sequences and the operators needed for complex analytics. In this talk, we describe our efforts in designing techniques for indexing and analyzing truly massive collections of data series, enabling scientists to run complex analytics on their data. These techniques are orders of magnitude faster than the state of the art and are applied to datasets from several disciplines, including seismology. We also present our recent work on (essentially parameter-free) subsequence anomaly detection and explanation, which is both more accurate and faster than competing approaches.
