Detection and classification of volcano-seismic signals of La Soufrière de Guadeloupe volcano by machine learning
Start: 01 October 2018
End: 14 April 2022
Jean-Philippe Metaxian, Eléonore Stutzmann, Jerôme Mars
Seismic activity at La Soufrière volcano of Guadeloupe is composed of various transient signals, which are classified manually by the Observatoire Volcanologique et Sismologique de Guadeloupe (OVSG-IPGP) considering waveforms recorded at several stations. Three main classes readily distinguishable on seismic traces during the daily analytical protocol have been catalogued: Volcano-Tectonic events, Long-Period events and Nested events, each related to a distinct physical process.
Automatic detection and classification of the seismo-volcanic signals of La Soufrière was performed with a supervised learning architecture, available at github.com/malfante/AAA. Seismic waveforms are transformed into a large set of features (34 features for each representation domain) computed from three representation domains of the signal (time, frequency, quefrency). The resulting feature vectors are then used for modeling. We use the Random Forest Classifier algorithm from the scikit-learn library.
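The feature step can be illustrated with a minimal sketch. It is not the AAA code: the four statistics, the function names and the synthetic signal are assumptions chosen for illustration, whereas the architecture described above computes 34 features per domain. The quefrency representation is taken here as the real cepstrum, i.e. the inverse Fourier transform of the log magnitude spectrum.

```python
import numpy as np

def representations(x):
    """Return the signal in the time, frequency and quefrency domains."""
    x = np.asarray(x, dtype=float)
    spectrum = np.abs(np.fft.rfft(x))
    # Real cepstrum: inverse FFT of the log magnitude spectrum.
    cepstrum = np.fft.irfft(np.log(spectrum + 1e-12), n=len(x))
    return {"time": x, "frequency": spectrum, "quefrency": cepstrum}

def basic_stats(v):
    """Four illustrative statistics (an assumption, not the AAA feature set)."""
    mean = v.mean()
    std = v.std()
    centered = v - mean
    skewness = np.mean(centered ** 3) / (std ** 3 + 1e-12)
    kurtosis = np.mean(centered ** 4) / (std ** 4 + 1e-12)
    return [mean, std, skewness, kurtosis]

def feature_vector(x):
    """Concatenate the statistics of the three representations."""
    reps = representations(x)
    values = []
    for domain in ("time", "frequency", "quefrency"):
        values.extend(basic_stats(reps[domain]))
    return np.array(values)

# Synthetic transient (hypothetical data): a decaying 5 Hz oscillation plus noise.
rng = np.random.default_rng(0)
t = np.arange(1000) / 100.0  # 10 s sampled at 100 Hz (assumed rate)
signal = np.exp(-t) * np.sin(2 * np.pi * 5.0 * t) + 0.05 * rng.standard_normal(t.size)

features = feature_vector(signal)
print(features.shape)
```

Each event then becomes one row of a feature matrix, which is the kind of input a Random Forest classifier is trained on.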
First, we trained the model with the dataset provided by the OVSG, consisting of 845 labeled events (542 VT, 217 Nested and 86 LP) recorded in the period 2013-2018. We obtained an average classification rate of 72%. We determined that the VT class includes a variety of signals covering the LP, Nested and VT classes. After a detailed review of the waveforms and spectral characteristics of the signals belonging to the 3 classes, we decided to introduce 2 new classes, Hybrid events and a monochromatic class of LP signals (so-called Tornillo), thus matching the full description of signals provided in Moretti et al. (2020).
Then, using this new information, a new model was trained with 5 classes (the 3 original classes and the 2 new ones). We obtained a much better average classification rate of 84%. The classification is excellent for Nested events (93% recall and precision) and Tornillo events (93% recall and precision). The classification of VT events (90% recall, 89% precision) and LP events (86% recall, 82% precision) was also very good. The most difficult class to recognize is the Hybrid class (64% recall, 69% precision). Hybrid events are often confused with VT and LP events. This may be explained by the nature of this class, whose physical process includes both a fracturing and a resonating component with different modal frequencies.
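For reference, per-class precision and recall such as those quoted above are derived from a confusion matrix: recall is the fraction of events of a class that are labeled correctly, and precision is the fraction of predictions of a class that are correct. The sketch below shows the computation on a tiny set of invented labels. The labels, the three-class subset and the function name are assumptions for illustration and do not reproduce the study's results.

```python
import numpy as np

def per_class_precision_recall(y_true, y_pred, classes):
    """Build a confusion matrix (rows: true class, columns: predicted class)
    and derive precision and recall for each class."""
    index = {c: i for i, c in enumerate(classes)}
    cm = np.zeros((len(classes), len(classes)), dtype=int)
    for t, p in zip(y_true, y_pred):
        cm[index[t], index[p]] += 1
    tp = np.diag(cm).astype(float)
    predicted = cm.sum(axis=0)  # how often each class was predicted
    actual = cm.sum(axis=1)     # how often each class truly occurs
    precision = np.divide(tp, predicted, out=np.zeros_like(tp), where=predicted > 0)
    recall = np.divide(tp, actual, out=np.zeros_like(tp), where=actual > 0)
    return cm, dict(zip(classes, precision)), dict(zip(classes, recall))

# Invented toy labels (not data from the study).
classes = ["VT", "LP", "Hybrid"]
y_true = ["VT", "VT", "VT", "LP", "LP", "Hybrid", "Hybrid", "Hybrid"]
y_pred = ["VT", "VT", "Hybrid", "LP", "Hybrid", "Hybrid", "VT", "LP"]

cm, precision, recall = per_class_precision_recall(y_true, y_pred, classes)
print(cm)
for c in classes:
    print(c, round(precision[c], 2), round(recall[c], 2))
```

In this toy case the off-diagonal entries of the Hybrid row and column show the same kind of confusion with VT and LP that the text describes.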
Moreover, by using a supervised machine learning model to distinguish events from the background noise, we were able to detect three times as many events as the observatory detects with an STA/LTA method.
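As background, an STA/LTA detector compares the short-term average (STA) of the signal energy with its long-term average (LTA) and declares an event when the ratio exceeds a threshold. The following is a minimal sketch of the classic trailing-window form. The window lengths, the threshold and the synthetic trace are assumptions, since the observatory's actual settings are not given here.

```python
import numpy as np

def sta_lta(x, n_sta, n_lta):
    """Ratio of short-term to long-term average energy.
    Both windows are trailing and end at the same sample;
    samples before the first full LTA window are left at 0."""
    energy = np.asarray(x, dtype=float) ** 2
    csum = np.concatenate(([0.0], np.cumsum(energy)))
    ratio = np.zeros(len(energy))
    end = np.arange(n_lta, len(energy) + 1)  # exclusive end index of each window
    sta = (csum[end] - csum[end - n_sta]) / n_sta
    lta = (csum[end] - csum[end - n_lta]) / n_lta
    ratio[end - 1] = sta / np.maximum(lta, 1e-12)
    return ratio

# Synthetic trace (hypothetical data): unit-variance noise with a burst
# of larger amplitude between samples 1200 and 1300.
rng = np.random.default_rng(0)
trace = rng.standard_normal(2000)
k = np.arange(100)
trace[1200:1300] += 8.0 * np.sin(2 * np.pi * 0.05 * k)

ratio = sta_lta(trace, n_sta=20, n_lta=400)  # assumed window lengths (samples)
threshold = 4.0                              # assumed trigger threshold
triggers = np.flatnonzero(ratio > threshold)
print(triggers.min(), triggers.max())
```

A detector of this kind responds only to energy contrasts, so weak or closely spaced events can fall below the threshold, which is one reason a learned event/noise classifier may find more events.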
Machine learning is a powerful tool for handling large datasets. With it, we were able to improve the classification, correct some misclassifications and detect more events.