Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

detect/multi-tenant: Eliminate crashes processing tenant config #9060

Closed
wants to merge 3 commits into from

Conversation

jlucovsky
Copy link
Contributor

Continuation of #9056

This PR eliminates crashes during parallel loading of tenant configurations by serializing processing of the .config files

  • reference.config
  • threshold.config
  • classification.config

Link to redmine ticket: 6047

Describe changes:

  • Formatting fixups
  • Removal of commit that triggered "uninitialized variable" errors.

Provide values to any of the below to override the defaults.

To use a pull request use a branch name like pr/N where N is the pull request number.

SV_REPO=
SV_BRANCH=
SU_REPO=
SU_BRANCH=
LIBHTP_REPO=
LIBHTP_BRANCH=

Issue: 6047

This commit ensures that the tenant id is contained in a unsigned 32 bit
container.
Issue: 6047

This commit serializes processing of individual .config file processing.
This is of little consequence when multi-tenancy is not in use.

When multiple tenants have been configured, serialization will ensure
parallel loading of configuration/rules doesn't concurrently use shared
pcre2 constructs.
This commit changes memory allocations to SCCalloc for clarity.
@codecov
Copy link

codecov bot commented Jun 22, 2023

Codecov Report

Merging #9060 (b8c5728) into master (643e674) will increase coverage by 0.01%.
The diff coverage is 75.86%.

Additional details and impacted files
@@            Coverage Diff             @@
##           master    #9060      +/-   ##
==========================================
+ Coverage   82.32%   82.34%   +0.01%     
==========================================
  Files         969      969              
  Lines      273655   273662       +7     
==========================================
+ Hits       225292   225337      +45     
+ Misses      48363    48325      -38     
Flag Coverage Δ
fuzzcorpus 64.57% <48.27%> (+0.04%) ⬆️
suricata-verify 60.61% <72.41%> (-0.03%) ⬇️
unittests 62.90% <62.06%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

@catenacyber
Copy link
Contributor

S-V : find . -name "*.pcap" -perm +111

@suricata-qa
Copy link

Information: QA ran without warnings.

Pipeline 14764

@victorjulien
Copy link
Member

there are far more call sites for pcre2_substring_get_bynumber, including some at packet time (pcre keyword), so if pcre2_substring_get_bynumber truly isn't thread safe, we need a much bigger patch.

I'm missing a good explanation of the thread safety issue, what is the issue?

Copy link
Member

@victorjulien victorjulien left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment

@jlucovsky
Copy link
Contributor Author

The static regex structures within the reference, threshold, and classification file processing (static pcred2_code *regex and static pcred2_match_data for example) were being used concurrently.

This PR prevents concurrent access -- access was most often through pcre2_substring_copy_bynumber -- by the multiple loader threads used in MT.

@victorjulien
Copy link
Member

The static regex structures within the reference, threshold, and classification file processing (static pcred2_code *regex and static pcred2_match_data for example) were being used concurrently.

This PR prevents concurrent access -- access was most often through pcre2_substring_copy_bynumber -- by the multiple loader threads used in MT.

This type of explanation belongs in the commit.

@victorjulien
Copy link
Member

The static regex structures within the reference, threshold, and classification file processing (static pcred2_code *regex and static pcred2_match_data for example) were being used concurrently.

This PR prevents concurrent access -- access was most often through pcre2_substring_copy_bynumber -- by the multiple loader threads used in MT.

If this is the case I would suggest a different solution: move these pcre2 data structures into DetectEngineCtx. Of these we have an instance per tenant.

Copy link
Contributor

@catenacyber catenacyber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As Victor said

If this is the case I would suggest a different solution: move these pcre2 data structures into DetectEngineCtx. Of these we have an instance per tenant.

By the way, I like the commits to use SCCalloc and make tenant_id always a uint32 :-)

@jlucovsky jlucovsky closed this Jul 17, 2023
@catenacyber
Copy link
Contributor

I think this was solved with #9243

@jlucovsky jlucovsky deleted the 6047/2 branch July 18, 2023 12:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants