This repository has been archived by the owner on Feb 7, 2024. It is now read-only.

Improve regexes for parsing problem data #409

Merged 1 commit on Dec 16, 2020

mgrabovsky
Contributor

  • Replace ranges with PCRE escapes, e.g. a-zA-Z → \w.
  • Fix package name (NEV) parser to correctly handle epoch numbers. The epoch was previously assumed to be a prefix of the whole name (in ENV form), e.g. 1:findutils-4.7.0-4.fc33. In order to be consistent with abrt, DNF and RPM, we switch to NEV form, that is findutils-1:4.7.0-4.fc33.
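
For illustration, a minimal sketch of parsing the new form with an optional epoch. The function `parse_nev`, the `NEV` tuple and the pattern are hypothetical and are not taken from the retrace-server code; the sketch also assumes a trailing release component, as in the example string above.

```python
import re
from typing import NamedTuple, Optional


class NEV(NamedTuple):
    """Hypothetical container for the parsed package identifier."""
    name: str
    epoch: Optional[int]
    version: str
    release: str


# Assumed layout: name-[epoch:]version-release.
# The name may contain hyphens but no colon, so the old
# epoch-first form ("1:findutils-...") is rejected.
_NEV_RE = re.compile(
    r"(?P<name>[^:\s]+)-"
    r"(?:(?P<epoch>\d+):)?"
    r"(?P<version>[^-:\s]+)-"
    r"(?P<release>[^-:\s]+)",
    re.ASCII,
)


def parse_nev(text: str) -> Optional[NEV]:
    """Return the parsed components, or None if the text does not match."""
    match = _NEV_RE.fullmatch(text)
    if match is None:
        return None
    epoch = match.group("epoch")
    return NEV(
        name=match.group("name"),
        epoch=int(epoch) if epoch is not None else None,
        version=match.group("version"),
        release=match.group("release"),
    )


if __name__ == "__main__":
    print(parse_nev("findutils-1:4.7.0-4.fc33"))
    print(parse_nev("findutils-4.7.0-4.fc33"))
    print(parse_nev("1:findutils-4.7.0-4.fc33"))
```

The epoch group is optional, so packages without an epoch parse with `epoch=None`, while the old epoch-first string yields `None` as a whole.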

Potentially related: #400

@packit-as-a-service

Congratulations! One of the builds has completed. 🍾

⚠️ Please note that our current plans include removal of these comments in the near future (at least 2 weeks after including this disclaimer), if you have serious concerns regarding their removal or would like to continue receiving them please reach out to us. ⚠️

You can install the built RPMs by following these steps:

  • sudo yum install -y dnf-plugins-core on RHEL 8
  • sudo dnf install -y dnf-plugins-core on Fedora
  • dnf copr enable packit/abrt-retrace-server-409
  • And now you can install the packages.

Please note that the RPMs should be used only in a testing environment.

@xsuchy
Member

xsuchy commented Dec 16, 2020

I think we can do that as we run under en locale. But beware that e.g. in Swedish locale the "w" is not part of \w

@xsuchy xsuchy merged commit 84d67af into master Dec 16, 2020
@mgrabovsky
Contributor Author

> I think we can do that as we run under en locale. But beware that e.g. in Swedish locale the "w" is not part of \w

Ah, I didn't think about that. But looking at the documentation now, \w includes “most characters that can be part of a word in any language, as well as numbers and the underscore”. So we should be fine regarding locales (although using the ASCII flag might be safer), but we're now accepting more than the original parser did.

The change might have been too hasty. I will go and fix that.
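
To make the difference concrete, here is a small sketch using Python's `re` module, assuming that is the regex engine in question. The helper `accepts` and the sample strings are made up for illustration.

```python
import re

# Three candidate patterns for an identifier-like token.
letters_only = re.compile(r"[a-zA-Z]+")    # explicit ASCII letter range
unicode_word = re.compile(r"\w+")          # default for str patterns: Unicode-aware
ascii_word = re.compile(r"\w+", re.ASCII)  # \w restricted to [a-zA-Z0-9_]


def accepts(pattern: "re.Pattern[str]", text: str) -> bool:
    """Hypothetical helper: True if the whole text matches the pattern."""
    return pattern.fullmatch(text) is not None


if __name__ == "__main__":
    for sample in ("findutils", "abc_42", "größe"):
        print(
            sample,
            accepts(letters_only, sample),
            accepts(unicode_word, sample),
            accepts(ascii_word, sample),
        )
```

Even with the ASCII flag, `\w` still admits digits and the underscore, so it is wider than `a-zA-Z`; without the flag it additionally admits non-ASCII word characters.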

@mgrabovsky mgrabovsky deleted the parser-regexes branch December 17, 2020 08:33