
[BUG] Anomaly Detector fails to resume after a no/zero-traffic period #317

Open
gogovan-vincentngai opened this issue Nov 24, 2020 · 3 comments
Labels
bug Something isn't working

Comments


gogovan-vincentngai commented Nov 24, 2020

Describe the bug
I have set up a demo version via docker-compose on my local Mac.
I have also set up nginx with Filebeat to send demo logs to my Elasticsearch.
I set up Postman to keep sending traffic to the demo nginx.

Other plugins installed
N/A

To Reproduce
Steps to reproduce the behavior:

  1. I started the test and injected traffic on 11/23 from 11:00 to 12:00.
  2. I created the Anomaly Detector (with index filebeat*).
  3. At 12:00 on 11/23 I stopped injecting traffic and left the setup running.
  4. Some time later, the Anomaly Detector showed "Data is not being ingested correctly".
  5. At 09:00 on 11/24 I resumed the traffic, but the Anomaly Detector did not resume and kept showing "Data is not being ingested correctly".
  6. At 10:09 on 11/24 I stopped and started the Anomaly Detector again; it kept showing "Initializing", and I can confirm data had been ingested for over 30 minutes.

Expected behavior

  • The Anomaly Detector resumes once traffic is sent again.

Screenshots
Traffic stopped at 12:00 on 11/23 and resumed at 09:00 on 11/24:
[Screenshot 2020-11-24 10:26:59 AM]
[Screenshot 2020-11-24 9:32:29 AM]

Even 30 minutes after stopping and restarting the detector:
[Screenshot 2020-11-24 10:14:27 AM]

Desktop (please complete the following information):
N/A

Additional context
https://discuss.opendistrocommunity.dev/t/questions-about-ml/4175/5

@gogovan-vincentngai gogovan-vincentngai added the bug Something isn't working label Nov 24, 2020
kaituo (Member) commented Dec 7, 2020

Sorry for the late reply. A few questions:

First, for step 6, what's your detection interval? When a detector is stopped, its previous models are erased and new training starts; that's why you see the detector go back to "Initializing". For training, we look back over the last 24 hours of data or the last 512 historical samples, depending on which gives more data points. If you don't have enough data points (we need at least 128 shingles), training uses live data and has to wait for the live data to come. A shingle is a consecutive sequence of the most recent records: for example, a shingle of 8 records associated with an entry corresponds to a vector of the last 8 consecutive records received up to and including that entry. Do you have enough shingles in your history going back from 11/24 10:09?
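One way to sanity-check this is to count how many detector intervals in the look-back window actually contain data. A sketch (it assumes the timestamp field is @timestamp and a 10-minute detection interval; substitute your own field and interval):

curl -X GET "localhost:9200/filebeat*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "size": 0,
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-24h"
      }
    }
  },
  "aggs": {
    "per_interval": {
      "date_histogram": {
        "field": "@timestamp",
        "fixed_interval": "10m"
      }
    }
  }
}
'

Buckets with a nonzero doc_count approximate the data points available for training; you'd want roughly 128 plus the shingle size of them.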

Second, for step 5, we show that error if the latest 3 shingles are missing. Could you check whether the anomaly result index has feature data for the last 3 intervals? The query looks like this:

curl -X GET "localhost:9200/.opendistro-anomaly-results*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": {
        "term": {
          "detector_id": "your-detector-id"
        }
      }
    }
  },
  "size": 1,
  "sort": [
    {
      "data_end_time": {
        "order": "desc"
      }
    }
  ]
}
'
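A variant that returns the last 3 results, so you can check the feature data for each recent interval (a sketch; "now-30m" assumes a 10-minute detection interval, so use 3x your own interval):

curl -X GET "localhost:9200/.opendistro-anomaly-results*/_search?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "filter": [
        { "term": { "detector_id": "your-detector-id" } },
        { "range": { "data_end_time": { "gte": "now-30m" } } }
      ]
    }
  },
  "size": 3,
  "sort": [
    { "data_end_time": { "order": "desc" } }
  ]
}
'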


mustardlove commented Feb 2, 2021

I'm experiencing the same problem: "Data is not being ingested correctly for feature: returnCode".

Does this message have something to do with the feature configuration?
In my case, I was hoping to detect anomalies based on returnCode (there are over 100 distinct values), and this is my configuration:

  • returnCode is of type 'text'
  • Field: returnCode.keyword
  • Aggregation method: value_count (I selected 'count')
  • Category field: none

When I set the category field to returnCode.keyword, the message above didn't show up.
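For reference, in the create-detector API this configuration should roughly correspond to a feature definition like the following (a sketch; the feature name is made up):

"feature_attributes": [
  {
    "feature_name": "returnCode_count",
    "feature_enabled": true,
    "aggregation_query": {
      "returnCode_count": {
        "value_count": { "field": "returnCode.keyword" }
      }
    }
  }
]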

kaituo (Member) commented Feb 3, 2021

Do you have live data? You can verify it by issuing a query against the source index within [now-your detector interval, now].
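For example, something like this (a sketch; replace your-source-index, "now-10m" assumes a 10-minute detector interval, and the timestamp field is assumed to be @timestamp):

curl -X GET "localhost:9200/your-source-index/_count?pretty" -H 'Content-Type: application/json' -d'
{
  "query": {
    "range": {
      "@timestamp": {
        "gte": "now-10m",
        "lte": "now"
      }
    }
  }
}
'

A nonzero count means live data is arriving within the current detector interval.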
