Feat/show_anomalies for multivariate #2544

Open
wants to merge 1 commit into base: master

@cnhwl (Contributor) commented Sep 27, 2024

Checklist before merging this PR:

  • Mentioned all issues that this PR fixes or addresses.
  • Summarized the updates of this PR under Summary.
  • Added an entry under Unreleased in the Changelog.

Fixes #2114.

Summary

This PR adds a parameter, multivariate_plot: bool = False, to show_anomalies() to enable or disable the feature. The plotting itself is implemented in the show_anomalies_from_scores function.

My general idea is to iterate over the components of the series and plot each component separately, together with its series, pred_series, pred_scores and anomalies. The following is a simple example, with the output shown below:

import numpy as np
import matplotlib.pyplot as plt

from darts import TimeSeries
from darts.ad import (
    ForecastingAnomalyModel,
    NormScorer,
    WassersteinScorer,
)
from darts.models import RegressionModel

def generate_data_ex1(random_state: int):
    """Return (values, anomalies), each an array of shape (200, 2)."""
    np.random.seed(random_state)

    # draw two independent components from a standard normal distribution
    comp1 = np.random.normal(loc=0, scale=1, size=(200, 1))
    comp2 = np.random.normal(loc=0, scale=1, size=(200, 1))

    # per-component mean and standard deviation
    mean1, std1 = np.mean(comp1), np.std(comp1)
    mean2, std2 = np.mean(comp2), np.std(comp2)

    # flag points more than two standard deviations above the component mean
    anomalies1 = (comp1 > mean1 + 2 * std1).astype(int)
    anomalies2 = (comp2 > mean2 + 2 * std2).astype(int)

    # stack the binary anomaly flags into one column per component
    anomalies = np.concatenate([anomalies1, anomalies2], axis=1)

    # stack the original values the same way
    vals = np.concatenate([comp1, comp2], axis=1)

    return vals, anomalies

# Example usage
data, anomalies = generate_data_ex1(random_state=42)

series = TimeSeries.from_values(data, columns=["comp1", "comp2"])
series_train = series[:120]
series_test = series[120:]
anomalies_series = TimeSeries.from_values(anomalies, columns=["comp1_anomalies", "comp2_anomalies"])
anomalies_series_test = anomalies_series[120:]

anomaly_model = ForecastingAnomalyModel(
    model=RegressionModel(lags=10),
    scorer=[
        NormScorer(component_wise=True),
        WassersteinScorer(component_wise=True)
    ],
)

anomaly_model.fit(series_train, allow_model_training=True, verbose=True)

anomaly_model.show_anomalies(
    series=series_test,
    anomalies=anomalies_series_test,
    metric="AUC_ROC",
    multivariate_plot=True,
)
plt.show()  # needed when running as a script rather than in a notebook

[Image: output]
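For reviewers who want to see the per-component layout without installing this branch, the idea in the summary can be sketched with plain NumPy and Matplotlib. This is only an illustration under stated assumptions: plot_components_separately is a hypothetical helper, not the code in this PR, and the absolute value stands in for a real component-wise scorer.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np


def plot_components_separately(values, scores, anomalies, component_names):
    """Draw three stacked panels per component: values, scores, anomalies.

    Hypothetical helper for illustration only. All three arrays are
    assumed to have shape (n_timesteps, n_components).
    """
    n_components = values.shape[1]
    fig, axes = plt.subplots(
        nrows=3 * n_components,
        ncols=1,
        sharex=True,
        figsize=(8, 2 * 3 * n_components),
        squeeze=False,
    )
    axes = axes[:, 0]  # single column, so flatten to a 1-D array of axes
    time = np.arange(values.shape[0])

    for i, name in enumerate(component_names):
        # each component gets its own block of three consecutive panels
        ax_series, ax_score, ax_anom = axes[3 * i : 3 * i + 3]
        ax_series.plot(time, values[:, i])
        ax_series.set_title(f"{name}: series")
        ax_score.plot(time, scores[:, i], color="tab:orange")
        ax_score.set_title(f"{name}: anomaly score")
        ax_anom.step(time, anomalies[:, i], color="tab:red", where="post")
        ax_anom.set_title(f"{name}: ground-truth anomalies")

    fig.tight_layout()
    return fig


rng = np.random.default_rng(42)
values = rng.normal(size=(80, 2))
scores = np.abs(values)  # stand-in for a component-wise scorer (assumption)
anomalies = (values > values.mean(axis=0) + 2 * values.std(axis=0)).astype(int)
fig = plot_components_separately(values, scores, anomalies, ["comp1", "comp2"])
```

Grouping the panels by component keeps each component's series, score and labels next to each other, which is the reading order the summary describes.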

I would appreciate more input on how to further improve this feature. Thank you very much!

codecov bot commented Sep 27, 2024

Codecov Report

Attention: Patch coverage is 39.36170% with 57 lines in your changes missing coverage. Please review.

Project coverage is 93.51%. Comparing base (b41be28) to head (e504930).
Report is 1 commits behind head on master.

Files with missing lines | Patch % | Lines
darts/ad/utils.py | 39.36% | 57 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #2544      +/-   ##
==========================================
- Coverage   93.86%   93.51%   -0.35%     
==========================================
  Files         139      139              
  Lines       14855    14895      +40     
==========================================
- Hits        13943    13929      -14     
- Misses        912      966      +54     
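
The percentages in the report follow directly from the hit and line counts in the diff above. The short check below recomputes them. The 94-line patch size is an assumption: it is not stated in the report, but it is the size implied by 57 missing lines at 39.36170% patch coverage.

```python
def coverage_pct(hits: int, lines: int) -> float:
    """Coverage as a percentage of covered lines."""
    return 100.0 * hits / lines


# project coverage on master and on this PR, from the diff table
base = coverage_pct(13943, 14855)
head = coverage_pct(13929, 14895)

# patch coverage; 94 changed lines is inferred, not reported (assumption)
patch_lines = 94
patch = coverage_pct(patch_lines - 57, patch_lines)

print(f"base={base:.2f}% head={head:.2f}% delta={head - base:+.2f}% patch={patch:.5f}%")
```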


@cnhwl cnhwl changed the title Add new feature to plot each series's component separately Feat/show_anomalies-for-multivariate Sep 27, 2024
@cnhwl cnhwl changed the title Feat/show_anomalies-for-multivariate Feat/show_anomalies for multivariate Sep 27, 2024
Successfully merging this pull request may close these issues.

show_anomalies for multivariate