Better Exception handling #25

Open
RichJackson opened this issue Jun 10, 2024 · 0 comments

Document.metadata[PROCESSING_EXCEPTION] gets overwritten if a document fails in more than one step, so only the most recent failure is kept.

We probably want a dictionary mapping each step's namespace to its exception instead. Note that we need to tell BIKG that this will change in a future release.
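
A minimal sketch of the per-step dictionary idea, for discussion only. The Document class, run_step helper, step names and the value of PROCESSING_EXCEPTION below are hypothetical stand-ins, not the project's real classes or constants:

```python
import traceback
from dataclasses import dataclass, field
from typing import Any, Callable, Dict

# Assumed value; the real constant in the codebase may differ.
PROCESSING_EXCEPTION = "PROCESSING_EXCEPTION"


@dataclass
class Document:
    """Hypothetical stand-in for the real Document class."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)


def run_step(namespace: str, func: Callable[[Document], None], doc: Document) -> None:
    """Run one step; on failure, store the traceback under the step's namespace."""
    try:
        func(doc)
    except Exception:
        # setdefault keeps failures from earlier steps instead of overwriting them.
        failures = doc.metadata.setdefault(PROCESSING_EXCEPTION, {})
        failures[namespace] = traceback.format_exc()


def bad_tokenizer(doc: Document) -> None:
    raise ValueError("tokenizer failed")


def bad_linker(doc: Document) -> None:
    raise KeyError("missing ontology")


doc = Document("some text")
run_step("TokenizerStep", bad_tokenizer, doc)
run_step("LinkerStep", bad_linker, doc)
failed_steps = sorted(doc.metadata[PROCESSING_EXCEPTION])
```

With this shape, both failures survive on the same document, keyed by the step that produced them. Anything downstream that currently reads the metadata value as a single string would need to change, which is the part to communicate ahead of a release.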

It would also be nice if we could choose at the pipeline level whether exceptions actually get (re-)raised. It seems we could do this in one of the following ways:

  1. Always (re-)raise in the step, and add an 'except' clause at the pipeline level that either logs or (re-)raises the exception, depending on how the Pipeline object is configured.
  2. Put the 'actual exception' in Document.metadata[PROCESSING_EXCEPTION] rather than the result of traceback.format_exc(). Then:
     • In the Pipeline, iterate over the documents and raise any exceptions.
     • Make the Document serialization format the exception into a string.
  3. Maybe a combination of 1 and 2 above? I think raising in the Step might be better: if there's a problem that affects all documents, we'll get an exception right away rather than when the whole run has finished. At the same time, I think actually having the 'real exceptions' would be useful.
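
A rough sketch of the combined approach, to make the trade-off concrete. Every name here (Document, Pipeline, run_step, raise_on_failure, to_dict, the value of PROCESSING_EXCEPTION) is a hypothetical stand-in rather than the project's actual API. The step stores the real exception object and always re-raises, the pipeline decides whether to log or propagate, and serialization turns the stored exceptions into strings:

```python
import logging
import traceback
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List, Tuple

logger = logging.getLogger(__name__)
PROCESSING_EXCEPTION = "PROCESSING_EXCEPTION"  # assumed value


@dataclass
class Document:
    """Hypothetical stand-in for the real Document class."""
    text: str
    metadata: Dict[str, Any] = field(default_factory=dict)

    def to_dict(self) -> Dict[str, Any]:
        """Serialize, formatting stored exception objects as traceback strings."""
        meta = dict(self.metadata)
        if PROCESSING_EXCEPTION in meta:
            meta[PROCESSING_EXCEPTION] = {
                ns: "".join(traceback.format_exception(type(e), e, e.__traceback__))
                for ns, e in meta[PROCESSING_EXCEPTION].items()
            }
        return {"text": self.text, "metadata": meta}


def run_step(namespace: str, func: Callable[[Document], None], doc: Document) -> None:
    """Option 2 plus option 1: keep the real exception, then always re-raise."""
    try:
        func(doc)
    except Exception as exc:
        doc.metadata.setdefault(PROCESSING_EXCEPTION, {})[namespace] = exc
        raise


@dataclass
class Pipeline:
    """Hypothetical pipeline; raise_on_failure is an assumed config option."""
    steps: List[Tuple[str, Callable[[Document], None]]]
    raise_on_failure: bool = True

    def __call__(self, docs: List[Document]) -> List[Document]:
        for namespace, func in self.steps:
            for doc in docs:
                try:
                    run_step(namespace, func, doc)
                except Exception:
                    if self.raise_on_failure:
                        raise  # fail fast, e.g. when a problem affects every document
                    logger.exception("Step %s failed; continuing", namespace)
        return docs


def fails(doc: Document) -> None:
    raise ValueError("bad input")


def passes(doc: Document) -> None:
    doc.metadata["ok"] = True


lenient = Pipeline([("FailingStep", fails), ("OkStep", passes)], raise_on_failure=False)
docs = lenient([Document("a")])
serialized = docs[0].to_dict()
```

With raise_on_failure=True the first failing step stops the run immediately; with False the failure is logged, the remaining steps still run, and the real exception stays available on the document until it is serialized.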