Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add a plugin for EmailAddress #694

Open
wants to merge 6 commits into
base: master
Choose a base branch
from

Conversation

perryzjc
Copy link
Contributor

@perryzjc perryzjc commented May 16, 2023

  • Please check if the PR fulfills these requirements
  • Tests for the changes have been added
  • Docs have been added / updated
  • All CI checks are green
  • What kind of change does this PR introduce?

This PR introduces a new feature in the form of an Email Address Detector.

  • What is the current behavior?

Currently, the system does not have specific functionality that allows for the detection of email addresses.

  • What is the new behavior (if this is a feature change)?
    With this feature change, the system can now detect email addresses. It uses a regex-based detector to identify valid email addresses and ignores certain whitelist email addresses to reduce false positives. It primarily validates the general format of email addresses but does not adhere strictly to email format standards such as RFC 5322 for the concerns of complexity and efficiency.

  • Does this PR introduce a breaking change?

This PR does not introduce a breaking change. The new detector class EmailAddressDetector is a standalone addition and does not alter or interfere with the existing code base.

  • Other information:
    In addition to the implementation of the Email Address Detector, this PR also includes comprehensive tests to ensure the correct functionality of the detector. It tests the detector against valid and invalid email addresses, and email addresses that are part of a larger string, as well as whitelist and non-whitelist email addresses.

The code documentation has been updated to reflect these changes. The new feature is expected to enhance the system's capabilities in handling email address-related data.

@riverma
Copy link

riverma commented Oct 18, 2023

FYI @KevinHock - thoughts?

@lorenzodb1
Copy link
Contributor

lorenzodb1 commented Oct 18, 2023

Hi @perryzjc, thank you for opening this PR. I noticed our PR checks did not run, so I'll have to figure what happened there first. Rest assured I'll get to this as soon as possible.

@lorenzodb1
Copy link
Contributor

lorenzodb1 commented Nov 16, 2023

@perryzjc hi again 😄 could you please merge master into your branch?

@perryzjc
Copy link
Contributor Author

@perryzjc hi again 😄 could you please merge master into your branch?

@lorenzodb1 hi, I've merged the latest updates from the master branch into my feature branch as requested. Please let me know if there's anything else needed for this pull request. Thanks!

@lorenzodb1
Copy link
Contributor

@perryzjc looks like there's some pre-commit checks failing, please take a look

@perryzjc perryzjc force-pushed the feature-plugin-email_address branch from 51cfcee to 7d4bc67 Compare November 17, 2023 05:33
@perryzjc
Copy link
Contributor Author

@perryzjc looks like there's some pre-commit checks failing, please take a look

@lorenzodb1 Thanks for letting me know. I've resolved the issues and all checks are now passing in the workflow.

Copy link
Contributor

@lorenzodb1 lorenzodb1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think we could add a way for someone to specify a domain so that this plugin flags only emails from that domain?

@perryzjc
Copy link
Contributor Author

Do you think we could add a way for someone to specify a domain so that this plugin flags only emails from that domain?

@lorenzodb1 What if we add an additional argument for the command line to specify the domains and make it the config in the baseline file?

  • Add a --detect-only-domains argument for specifying the domains we want to monitor.
  • Update our .secrets.baseline to include these domain settings.

So, instead of the usual:
$ detect-secrets scan test_data/ --all-files > .secrets.baseline

We'd have something like:
$ detect-secrets scan test_data/ --all-files --detect-only-domains "gmail.com,mail.example.com" > .secrets.baseline

The baseline file would then look like this:

  "plugins_used": [
    ...
    {
      "name": "EmailAddressDetector",
      "detect-only-domains": "gmail.com,mail.example.com"
    },
    ...
  ]

This setup means our pre-commit checks could catch emails based on this configuration, giving us more control.

I’m curious about your take on this. Are there any concerns regarding compatibility or implementation challenges we should be aware of?

@lorenzodb1
Copy link
Contributor

@perryzjc that's exactly what I was thinking. I don't think there would be compatibility issues and I can't think of any implementation challenges you might face. I'll tag @jpdakran as he can maybe come up with something

@riverma
Copy link

riverma commented Mar 1, 2024

Hey @lorenzodb1 @jpdakran - just wanted to poke here as well. Hope @perryzjc contribution can be considered for merge.

@lorenzodb1
Copy link
Contributor

@perryzjc I just got around to review this and I wondering: do you still intend to add the --detect-only-domains flag? I think that'd be a useful one.

@lorenzodb1
Copy link
Contributor

@perryzjc looks like merging #692 created some conflicts in here. I'd ask you to solve these (which will also have the additional benefit of updating the PR checks, meaning it'd run checks for py3.11 too)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants