Add an API endpoint to retrieve structured content preview of an email (subject, from, to, date) #763

Open
Nawhack opened this issue Jan 6, 2025 · 4 comments
Labels: good first issue


Nawhack commented Jan 6, 2025

Hello,

I would like to request a new feature for the Pandora API. As far as I can tell, there is currently no dedicated API endpoint that returns a structured content preview of an email: the subject, sender (from), recipients (to), date, and a preview of the body.

While exploring the available endpoints, I found the following:

  • /worker_status?task_id={task-id}&worker_name=preview&details=1: This endpoint seems related to the preview functionality but does not return detailed information about the email content.
  • /task-action/{task-id}/refresh: This endpoint provides some information about the file (eml) but does not include structured email content.
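For reference, the two endpoints above can be addressed as follows. This is only a sketch that builds the URLs: the base URL `https://pandora.example.org/` and the helper names are hypothetical, and nothing here is claimed about the HTTP method or the response format of either endpoint.

```python
from urllib.parse import quote, urlencode, urljoin

# Hypothetical base URL of a Pandora instance; replace with your own.
BASE_URL = "https://pandora.example.org/"


def worker_status_url(task_id: str, worker_name: str = "preview", details: bool = True) -> str:
    """Build the /worker_status URL with the query parameters listed above."""
    query = urlencode({
        "task_id": task_id,
        "worker_name": worker_name,
        "details": int(details),  # 1 for yes, 0 for no
    })
    return urljoin(BASE_URL, "worker_status") + "?" + query


def refresh_url(task_id: str) -> str:
    """Build the /task-action/{task-id}/refresh URL."""
    return urljoin(BASE_URL, f"task-action/{quote(task_id, safe='')}/refresh")
```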

It would be very useful to have an API endpoint that lets external APIs retrieve structured email data directly. The use case is a possible integration with phishing analysis workflows that require structured email information.
This would make the framework considerably more usable for tasks that involve email analysis.

Regards

Contributor

Rafiot commented Jan 6, 2025

Hello,

I'm not sure I understand what you mean there. Do you want to be able to get an EmailMessage-formatted text response from a remote Pandora instance?

Basically what is done there: https://github.com/pandora-analysis/pandora/blob/main/website/web/generic_api.py#L305

But instead of sending the mail to a mailbox directly, you just get the headers + content, in the response to a GET request?

And if this issue is AI generated, please tell me because it makes very little sense as-is.

Author

Nawhack commented Jan 6, 2025

Hello Rafiot,

> I'm not sure I understand what you mean there. Do you want to be able to get an EmailMessage-formatted text response from a remote Pandora instance?

Yes. When an email (eml or msg) is analyzed by Pandora, the need is for the result API endpoint to include the same level of information as the "Content Preview" in the Pandora GUI.

[image: screenshot of the "Content Preview" in the Pandora GUI]

> Basically what is done there: https://github.com/pandora-analysis/pandora/blob/main/website/web/generic_api.py#L305
>
> But instead of sending the mail to a mailbox directly, you just get the headers + content, in the response to a GET request?
>
> And if this issue is AI generated, please tell me because it makes very little sense as-is.

XD I think it makes sense for the API to offer the same level of detail as the GUI.

Regards

Contributor

Rafiot commented Jan 6, 2025

Ok, I didn't understand at all what you meant, and my answer is completely unrelated.

Your starting point is a mail (eml or msg) and you want a few of the fields from the headers. Using Pandora for that is overkill, and I'd suggest using https://github.com/TeamMsgExtractor/msg-extractor and https://github.com/GOVCERT-LU/eml_parser instead, which will be much more efficient.

I can add an endpoint for that (what threw me off is the endpoints you mentioned, as they have nothing to do with what you want to accomplish), but the response will be a JSON dump, not an email as a string: you start from an email that is already in a format you can parse with the libraries above, so returning the same thing doesn't make much sense.
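To illustrate the kind of client-side parsing suggested above, here is a minimal sketch that pulls the header fields and a body preview out of an eml file and dumps them as JSON. It uses Python's standard-library `email` module rather than the two libraries linked above, it does not handle msg files, and the function name and the keys of the returned dict are assumptions for illustration only, not the format a Pandora endpoint would return.

```python
import json
from email import message_from_bytes, policy


def email_preview(raw: bytes, body_chars: int = 200) -> dict:
    """Return a small dict with the main headers and a body preview of an eml."""
    msg = message_from_bytes(raw, policy=policy.default)

    def header(name: str):
        # Missing headers come back as None; present ones are converted to plain str.
        value = msg[name]
        return str(value) if value is not None else None

    # Prefer the plain-text body, fall back to HTML if there is no plain part.
    body_part = msg.get_body(preferencelist=("plain", "html"))
    body = body_part.get_content() if body_part is not None else ""

    return {
        "subject": header("Subject"),
        "from": header("From"),
        "to": header("To"),
        "date": header("Date"),
        "body_preview": body[:body_chars].strip(),
    }


if __name__ == "__main__":
    sample = (
        b"Subject: Invoice overdue\n"
        b"From: alice@example.com\n"
        b"To: bob@example.com\n"
        b"Date: Mon, 06 Jan 2025 10:00:00 +0000\n"
        b"\n"
        b"Please pay now.\n"
    )
    print(json.dumps(email_preview(sample), indent=2))
```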

Rafiot added the "good first issue" label Jan 6, 2025
Rafiot self-assigned this Jan 6, 2025

Author

Nawhack commented Jan 6, 2025

Thanks for your feedback.

Yes, I'm aware of the existing Python libraries for email analysis. I experimented with several Python scripts before realizing that Pandora already embeds these features, and more. Pandora has the advantage of already offering deep analysis and providing an overall result.

First I tested the "task-status" API endpoint, but it returned no further detail about the analyzed eml file. Then I tested the "worker-status" API endpoint with the "all workers" and "details" keys set to 1, but that also returned no further detail (the "details" key is the one described in generic_api.py as "Do you want details about the report status of every worker? 1 for yes, 0 for no").

It would be very interesting to have a "get-report" or "task-report" API endpoint that includes details of the analyzed file and of the workers.
