Feature request: add regex filter for feature files. #473

Open
Bb4fit opened this issue May 19, 2024 · 1 comment

Comments


Bb4fit commented May 19, 2024

The project's output could be improved slightly.
It would be better for the program to save only content that was successfully extracted, rather than writing empty text files when nothing is found. The benefits:

  • Faster, easier analysis of the extracted data
  • Less storage space used

This could be achieved with a simple if statement for each regex.
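
The "simple if statement" idea can be sketched roughly as below. This is an illustrative Python sketch, not bulk_extractor's actual code (which is C++); the function name `write_matches`, the pattern names, and the one-file-per-pattern layout are all assumptions made for the example.

```python
import re
from pathlib import Path


def write_matches(data: str, patterns: dict[str, str], out_dir: Path) -> list[Path]:
    """Write one output file per pattern, but only for patterns that matched.

    Hypothetical helper for illustration; not part of bulk_extractor.
    """
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for name, pattern in patterns.items():
        # Collect the full text of every match for this pattern.
        matches = [m.group(0) for m in re.finditer(pattern, data)]
        if matches:  # the "simple if": no file is created when nothing was found
            path = out_dir / f"{name}.txt"
            path.write_text("\n".join(matches) + "\n", encoding="utf-8")
            written.append(path)
    return written
```

With this approach a pattern that finds nothing leaves no file behind, so the absence of a file no longer distinguishes "ran and found nothing" from "did not run".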
Owner

simsong commented May 19, 2024

Thanks for your comments.

  • Traditionally, we left the 0-length feature files so that users could know that a particular scanner ran and found nothing. There is minimal overhead associated with storing zero-length files.
  • Previously, we also stored data in an SQLite3 database, which dramatically improved performance and reduced overhead. However, nobody used it.

Your suggestion of adding a regex filter on each feature file to further prune the output is a curious one. This program has been in use for 14 years and no one has ever suggested this before. It is straightforward to run grep on a feature file; it is not straightforward to re-run bulk_extractor if there is a typo in the filter.
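
For anyone who prefers a script to grep, filtering a feature file after the fact might look like the sketch below. It only assumes that a feature file is a plain text file with one feature per line; the function name `grep_lines` and the file name in the usage comment are illustrative, not part of bulk_extractor.

```python
import re
from typing import Iterable, Iterator


def grep_lines(lines: Iterable[str], pattern: str) -> Iterator[str]:
    """Yield only the lines that contain a match, like `grep -E pattern`."""
    rx = re.compile(pattern)
    for line in lines:
        if rx.search(line):
            yield line


# Usage (hypothetical file name):
# with open("email.txt", encoding="utf-8", errors="replace") as f:
#     for line in grep_lines(f, r"@example\.com"):
#         print(line, end="")
```

Because the filter runs on existing output, a typo in the pattern costs only a re-run of the script, not a re-run of the whole extraction.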

Do you have an actual use case for which the output size is problematic and a filter is required, or is this a request based on what a hypothetical user would like? If you are indeed in need of this feature, you are welcome to submit it as a pull request. I'm happy to design it with you. Adding more command line switches is problematic at this point, so you might also want to add the ability to have a YAML or JSON configuration file.

If you aren't able to implement this yourself but are willing to pay for this feature to be created, I can hook you up with a consultant.

simsong changed the title from "Output only the found content" to "Feature request: add regex filter for feature files." on May 19, 2024