[8.15] [Detection Engine] Fix flake in ML Rule Cypress tests (#188164) (#188483)

# Backport

This will backport the following commits from `main` to `8.15`:
- [[Detection Engine] Fix flake in ML Rule Cypress tests
(#188164)](#188164)

<!--- Backport version: 9.4.3 -->

### Questions ?
Please refer to the [Backport tool
documentation](https://github.com/sqren/backport)

<!--BACKPORT [{"author":{"name":"Ryland
Herrick","email":"[email protected]"},"sourceCommit":{"committedDate":"2024-07-16T19:21:13Z","message":"[Detection
Engine] Fix flake in ML Rule Cypress tests (#188164)\n\nThis API call
was found to be sporadically failing in #182183. This\r\napplies the
same changes made in #188155, but for Cypress tests instead\r\nof
FTR.\r\n\r\nSince none of the cypress tests are currently skipped, this
PR just\r\nserves to add robustness to the suite, which performs nearly
identical\r\nsetup to that of the FTR tests. I think the biggest
difference is how\r\noften these tests are run vs FTRs. Combined with
the low failure rate\r\nfor the underlying issue, cypress's
auto-retrying may smooth over many\r\nof these failures when they
occur.\r\n\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [ ] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n- [ ] [Detection Engine Cypress -
ESS
x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6530)\r\n-
[ ] [Detection Engine Cypress - Serverless
x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6531)","sha":"ed934e3253b47a6902904633530ec181037d4946","branchLabelMapping":{"^v8.16.0$":"main","^v(\\d+).(\\d+).\\d+$":"$1.$2"}},"sourcePullRequest":{"labels":["release_note:skip","Feature:Detection
Rules","Feature:ML Rule","Feature:Security ML Jobs","Feature:Rule
Creation","backport:prev-minor","Team:Detection Engine","Feature:Rule
Edit","v8.16.0"],"title":"[Detection Engine] Fix flake in ML Rule
Cypress
tests","number":188164,"url":"https://github.com/elastic/kibana/pull/188164","mergeCommit":{"message":"[Detection
Engine] Fix flake in ML Rule Cypress tests (#188164)\n\nThis API call
was found to be sporadically failing in #182183. This\r\napplies the
same changes made in #188155, but for Cypress tests instead\r\nof
FTR.\r\n\r\nSince none of the cypress tests are currently skipped, this
PR just\r\nserves to add robustness to the suite, which performs nearly
identical\r\nsetup to that of the FTR tests. I think the biggest
difference is how\r\noften these tests are run vs FTRs. Combined with
the low failure rate\r\nfor the underlying issue, cypress's
auto-retrying may smooth over many\r\nof these failures when they
occur.\r\n\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [ ] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n- [ ] [Detection Engine Cypress -
ESS
x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6530)\r\n-
[ ] [Detection Engine Cypress - Serverless
x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6531)","sha":"ed934e3253b47a6902904633530ec181037d4946"}},"sourceBranch":"main","suggestedTargetBranches":[],"targetPullRequestStates":[{"branch":"main","label":"v8.16.0","branchLabelMappingKey":"^v8.16.0$","isSourceBranch":true,"state":"MERGED","url":"https://github.com/elastic/kibana/pull/188164","number":188164,"mergeCommit":{"message":"[Detection
Engine] Fix flake in ML Rule Cypress tests (#188164)\n\nThis API call
was found to be sporadically failing in #182183. This\r\napplies the
same changes made in #188155, but for Cypress tests instead\r\nof
FTR.\r\n\r\nSince none of the cypress tests are currently skipped, this
PR just\r\nserves to add robustness to the suite, which performs nearly
identical\r\nsetup to that of the FTR tests. I think the biggest
difference is how\r\noften these tests are run vs FTRs. Combined with
the low failure rate\r\nfor the underlying issue, cypress's
auto-retrying may smooth over many\r\nof these failures when they
occur.\r\n\r\n\r\n### Checklist\r\n\r\n- [x] [Unit or
functional\r\ntests](https://www.elastic.co/guide/en/kibana/master/development-tests.html)\r\nwere
updated or added to match the most common scenarios\r\n- [ ] [Flaky
Test\r\nRunner](https://ci-stats.kibana.dev/trigger_flaky_test_runner/1)
was\r\nused on any tests changed\r\n- [ ] [Detection Engine Cypress -
ESS
x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6530)\r\n-
[ ] [Detection Engine Cypress - Serverless
x\r\n200](https://buildkite.com/elastic/kibana-flaky-test-suite-runner/builds/6531)","sha":"ed934e3253b47a6902904633530ec181037d4946"}}]}]
BACKPORT-->

Co-authored-by: Ryland Herrick <[email protected]>
kibanamachine and rylnd authored Jul 16, 2024
1 parent 93a35a4 commit 7670d26
Showing 3 changed files with 23 additions and 5 deletions.
@@ -19,9 +19,9 @@ import {
   SUPPRESS_MISSING_FIELD,
 } from '../../../../screens/rule_details';
 import {
-  executeSetupModuleRequest,
   forceStartDatafeeds,
   forceStopAndCloseJob,
+  setupMlModulesWithRetry,
 } from '../../../../support/machine_learning';
 import {
   continueFromDefineStep,
@@ -94,7 +94,7 @@ describe(
 describe('when ML jobs have run', () => {
   before(() => {
     cy.task('esArchiverLoad', { archiveName: '../auditbeat/hosts', type: 'ftr' });
-    executeSetupModuleRequest({ moduleName: 'security_linux_v3' });
+    setupMlModulesWithRetry({ moduleName: 'security_linux_v3' });
     forceStartDatafeeds({ jobIds: [jobId] });
     cy.task('esArchiverLoad', { archiveName: 'anomalies', type: 'ftr' });
   });

@@ -19,9 +19,9 @@ import {
   SUPPRESS_MISSING_FIELD,
 } from '../../../../screens/rule_details';
 import {
-  executeSetupModuleRequest,
   forceStartDatafeeds,
   forceStopAndCloseJob,
+  setupMlModulesWithRetry,
 } from '../../../../support/machine_learning';
 import { editFirstRule } from '../../../../tasks/alerts_detection_rules';
 import { deleteAlertsAndRules } from '../../../../tasks/api_calls/common';
@@ -71,7 +71,7 @@ describe(
   login();
   deleteAlertsAndRules();
   cy.task('esArchiverLoad', { archiveName: '../auditbeat/hosts', type: 'ftr' });
-  executeSetupModuleRequest({ moduleName: 'security_linux_v3' });
+  setupMlModulesWithRetry({ moduleName: 'security_linux_v3' });
   forceStartDatafeeds({ jobIds: [jobId] });
   cy.task('esArchiverLoad', { archiveName: 'anomalies', type: 'ftr' });
 });

@@ -5,6 +5,7 @@
  * 2.0.
  */

+import { recurse } from 'cypress-recurse';
 import { ML_GROUP_ID } from '@kbn/security-solution-plugin/common/constants';
 import { rootRequest } from '../tasks/api_calls/common';

@@ -16,7 +17,7 @@ import { rootRequest } from '../tasks/api_calls/common';
  * @returns the response from the setup module request
  */
 export const executeSetupModuleRequest = ({ moduleName }: { moduleName: string }) =>
-  rootRequest({
+  rootRequest<{ jobs: Array<{ success: boolean; error?: { status: number } }> }>({
     headers: {
       'elastic-api-version': 1,
     },
@@ -33,6 +34,23 @@ export const executeSetupModuleRequest = ({ moduleName }: { moduleName: string }
     },
   });

+/**
+ *
+ * Calls {@link executeSetupModuleRequest} until all jobs in the module are
+ * successfully set up.
+ * @param moduleName the name of the ML module to set up
+ * @returns the response from the setup module request
+ */
+export const setupMlModulesWithRetry = ({ moduleName }: { moduleName: string }) =>
+  recurse(
+    () => executeSetupModuleRequest({ moduleName }),
+    (response) =>
+      response.body.jobs.every(
+        (job) => job.success || (job.error?.status && job.error.status < 500)
+      ),
+    { delay: 1000 }
+  );
+
 /**
  *
  * Calls the internal ML Jobs API to force start the datafeeds for the given job IDs. Necessary to get them in the "started" state for the purposes of the detection engine
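
For readers unfamiliar with the pattern, the retry added in this commit can be sketched without Cypress. The commit itself uses `recurse` from `cypress-recurse` with `{ delay: 1000 }`; the block below is only a framework-free approximation using plain promises. The names `retryUntil`, `allJobsSettled`, `sleep`, `delayMs`, and `limit` are hypothetical and do not appear in Kibana or `cypress-recurse`. The predicate mirrors the one in the diff: a job counts as settled when it succeeded or failed with a status below 500, presumably so that only server-side errors trigger another attempt.

```typescript
// Hypothetical sketch of the retry-until-predicate pattern; not Kibana code.

interface JobResult {
  success: boolean;
  error?: { status: number };
}

interface SetupResponse {
  jobs: JobResult[];
}

// A job is "settled" if it succeeded or failed with a non-5xx status,
// mirroring the predicate passed to `recurse` in the commit.
const allJobsSettled = (response: SetupResponse): boolean =>
  response.jobs.every(
    (job) => job.success || (job.error !== undefined && job.error.status < 500)
  );

const sleep = (ms: number): Promise<void> =>
  new Promise((resolve) => setTimeout(resolve, ms));

// Calls `request` until `isDone` accepts the result, waiting `delayMs`
// between attempts, and gives up after `limit` attempts (assumed defaults).
async function retryUntil<T>(
  request: () => Promise<T>,
  isDone: (result: T) => boolean,
  { delayMs = 1000, limit = 5 }: { delayMs?: number; limit?: number } = {}
): Promise<T> {
  for (let attempt = 1; attempt <= limit; attempt++) {
    const result = await request();
    if (isDone(result)) {
      return result;
    }
    if (attempt < limit) {
      await sleep(delayMs);
    }
  }
  throw new Error(`Condition not met after ${limit} attempts`);
}
```

Under these assumptions, a caller would write something like `retryUntil(() => setupModule('security_linux_v3'), allJobsSettled)`, where `setupModule` stands in for whatever function issues the setup request.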
