Julian Rakuschek - 2024-06-03 - TU Graz

Visual Analytics

Time Series

Anomaly Detection

Web Development

AnoScout is the result of my Master's thesis: a system for exploring anomalies in a time series dataset. The following tour walks through its features, each supported by a video of the component.

Upload and Compute

The first step is to upload the data through a dropzone. Next, the user inspects the time series to find one without anomalies and marks it as the reference for the models. A time series with anomalies is marked as the baseline for normalization purposes. Afterwards, the models are trained and the anomaly scores are computed for each time series in parallel.
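The "train on a reference, then score every series in parallel" step can be sketched as follows. This is only an illustration under stated assumptions: the z-score "model" is a stand-in chosen for brevity, not one of AnoScout's algorithms, the function names are hypothetical, and the baseline normalization is not modeled here.

```python
from concurrent.futures import ThreadPoolExecutor
from statistics import mean, pstdev
from typing import Dict, List


def fit_reference(reference: List[float]) -> Dict[str, float]:
    """'Train' a stand-in model: mean and spread of the anomaly-free series."""
    # Fall back to 1.0 so a constant reference does not cause a division by zero.
    return {"mean": mean(reference), "std": pstdev(reference) or 1.0}


def score_series(model: Dict[str, float], series: List[float]) -> List[float]:
    """Anomaly score per point: distance from the reference mean in std units."""
    return [abs(x - model["mean"]) / model["std"] for x in series]


def score_all(model: Dict[str, float],
              dataset: Dict[str, List[float]]) -> Dict[str, List[float]]:
    """Score every series concurrently; results are keyed by series name."""
    with ThreadPoolExecutor() as pool:
        futures = {name: pool.submit(score_series, model, series)
                   for name, series in dataset.items()}
        return {name: future.result() for name, future in futures.items()}


# Hypothetical toy data: one reference series and two series to score.
model = fit_reference([1.0, 3.0, 1.0, 3.0])
results = score_all(model, {"s1": [2.0, 2.0, 7.0], "s2": [1.0, 3.0]})
```

Threads keep the sketch short; for CPU-heavy models, `ProcessPoolExecutor` from the same module would be the more likely choice.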

Extracting Anomalies

Once the scores are available, the user may compare each algorithm's output and assign each algorithm a weight, that is, its contribution to the weighted average. Each interval in which the weighted average exceeds a chosen threshold is considered an anomaly.
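A minimal sketch of this extraction step, assuming the per-algorithm scores are aligned point by point and share a common scale. The algorithm names, weights, and threshold are made up for illustration, and the function names are hypothetical rather than taken from AnoScout.

```python
from typing import Dict, List, Tuple


def ensemble_score(scores: Dict[str, List[float]],
                   weights: Dict[str, float]) -> List[float]:
    """Weighted average of the per-algorithm anomaly scores, point by point."""
    total = sum(weights[name] for name in scores)
    if total <= 0:
        raise ValueError("at least one algorithm needs a positive weight")
    length = len(next(iter(scores.values())))
    return [
        sum(weights[name] * series[i] for name, series in scores.items()) / total
        for i in range(length)
    ]


def extract_anomalies(score: List[float],
                      threshold: float) -> List[Tuple[int, int]]:
    """Return (start, end) index pairs, end exclusive, where score > threshold."""
    intervals = []
    start = None
    for i, value in enumerate(score):
        if value > threshold and start is None:
            start = i                      # an interval opens here
        elif value <= threshold and start is not None:
            intervals.append((start, i))   # the interval closed before index i
            start = None
    if start is not None:                  # interval runs to the end of the series
        intervals.append((start, len(score)))
    return intervals


# Hypothetical scores of two algorithms for one time series.
scores = {
    "algo_a": [0.1, 0.9, 0.8, 0.1, 0.2, 0.7],
    "algo_b": [0.1, 0.5, 0.6, 0.1, 0.0, 0.9],
}
weights = {"algo_a": 1.0, "algo_b": 3.0}
combined = ensemble_score(scores, weights)
anomalies = extract_anomalies(combined, threshold=0.5)
```

With these toy values, the intervals `(1, 3)` and `(5, 6)` are extracted. Raising the weight of one algorithm pulls the combined score toward that algorithm's output, which is the effect the weight controls are meant to expose.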

Manual Anomaly Inspection

As soon as the user has extracted anomalies, they can be viewed for each time series. The expert may again decompose the ensemble score to assess each algorithm's performance.

The Anomaly Recommender

To explore all anomalies in the dataset at once, the user may view them in a list sorted by severity. A scatter plot additionally depicts the anomalies arranged by score and length. If the user wants to focus on a specific pattern, ratings can be assigned to anomalies and the sorting can be switched to an item-based collaborative filtering approach. That way, anomalies similar to the highly rated ones move up in the list. The scatter plot coloring can also be switched to highlight the recommender ranking, and the anomalies can alternatively be arranged by dimensionality reduction (MDS, ISOMAP, t-SNE, etc.).
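The rating-based re-ranking can be sketched in the spirit of item-based filtering: the predicted rating of an unrated anomaly is the similarity-weighted average of the ratings given to other anomalies. The feature vectors, the distance-based similarity, and all names below are assumptions made for illustration; how AnoScout actually measures similarity between anomalies is not described here.

```python
import math
from typing import Dict, List, Sequence


def similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Similarity in (0, 1] derived from Euclidean distance (an assumed choice)."""
    return 1.0 / (1.0 + math.dist(a, b))


def rank_by_ratings(features: Dict[str, Sequence[float]],
                    ratings: Dict[str, float]) -> List[str]:
    """Order anomaly ids by predicted rating, highest first."""
    predicted = {}
    for item, vector in features.items():
        if item in ratings:
            predicted[item] = ratings[item]   # keep the user's own rating
            continue
        sims = {rated: similarity(vector, features[rated]) for rated in ratings}
        norm = sum(sims.values())
        # Similarity-weighted mean of the existing ratings; 0.0 if nothing is rated.
        predicted[item] = (
            sum(sims[r] * ratings[r] for r in ratings) / norm if norm else 0.0
        )
    return sorted(features, key=lambda item: predicted[item], reverse=True)


# Hypothetical anomalies described by (score, normalized length).
features = {
    "a1": (0.90, 0.80),
    "a2": (0.85, 0.75),
    "a3": (0.10, 0.20),
    "a4": (0.15, 0.10),
}
ratings = {"a1": 5.0, "a3": 1.0}   # the user likes a1 and dislikes a3
ranking = rank_by_ratings(features, ratings)
```

Here `a2` lies close to the highly rated `a1` and therefore ranks directly behind it, while `a4`, which resembles the poorly rated `a3`, ends up lower in the list.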

Exploring Patterns through Clustering

The recommender serves the user well if only one specific pattern should be highlighted, but to gain an overview of all patterns, clustering is more meaningful as shown below: