Implements #304: Adds deprecated-bugs.csv and refactors commit-db into active-bugs.csv #312
@Greg4cr, thanks for implementing this. Can you please define the file names in Constants.pm as well and make sure that every script uses the constants? Linking to these constants in the POD (as opposed to using the file name) would also be good to avoid inconsistencies in case these file names change.
@rjust - I have pushed commits that:
I looked at the use of system calls to tail in the bug mining framework. There is only one use of tail in the bug mining framework, in |
Thanks @Greg4cr for implementing this.
How about we keep |
@jose Haha, oops! I should proofread the comments more thoroughly. This change came out of discussions @rjust and I had earlier this week. We discussed the concerns you mentioned. At the time, this is what we decided on:
Something that Rene and I discussed, and that I certainly agree with, is that internal changes to Defects4J can and should go forward even if they would impact external scripts. Even if we didn't make this particular change, other changes to Defects4J will break someone's scripts. We shouldn't let that prevent changes to the framework. I'm certainly willing to discuss alternative implementations, though. @rjust - thoughts?
I have strong opinions about improving the internals of the framework such as separating active and deprecated bugs, consistent use of headers, consistent naming, removing code clones, etc., but I do agree that backward compatibility is a concern to keep in mind. IMO, the problem with commit-db (and why this file is used in so many wrapper scripts) is the lack of a simple, public API to obtain the list of valid project ids and bug ids. We already floated some ideas for addressing this issue. I am OK with leaving the commit-db file in the repo for now and clearly documenting that (1) it is deprecated and (2) it will be removed in the future. At the same time, we should offer a clear alternative, and that alternative should not be "use the new file name instead and skip the header". If we want others to migrate away from using commit-db, which is not part of the public API, we should make sure that there is an alternative public API. Iterating over all bug ids should be as simple as |
I agree. We can include the commit-db as well, with a disclaimer that it will be deprecated and removed in the future.
I also agree with this. I was initially thinking about this as an extension of defects4j info, but it might be clumsy to build it on top of that existing command. Perhaps it would be better to have a new command built around exporting info from active/deprecated bugs files:
I added another commit making minor tweaks (see @rjust's questions above) and restoring the commit-db files (with the intention that they are deprecated and will later be phased out, and that the framework makes use instead of |
This makes sense.
Some updates:
…J that can be used throughout the framework
This PR fixes #304.
- commit-db has been refactored into active-bugs.csv.
- A header has been added to active-bugs.csv, noting what each column represents.
- Code that reads active-bugs.csv has been adjusted to handle the header.
- deprecated-bugs.csv has been added, containing the entries for all deprecated bugs.
- deprecated-bugs.csv uses the same header as active-bugs.csv, with the addition of the D4J version in which the bug was deprecated and the reason for deprecation.
- Constants.pm has been updated with constants representing the column names.
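
For external wrapper scripts, the practical effect of the header is that columns can be looked up by name rather than by position, so no "skip the first line" special case is needed. The following is a minimal sketch of that idea in Python, not the framework's own (Perl) code. The column names, the sample rows, and the helper `read_bug_ids` are hypothetical placeholders and may not match the actual files.

```python
import csv
import io

# Hypothetical file contents: the column names and revision ids below are
# placeholders for illustration, not the actual Defects4J header or data.
ACTIVE_BUGS_CSV = """\
bug.id,revision.id.buggy,revision.id.fixed
1,aaa111,bbb222
2,ccc333,ddd444
"""


def read_bug_ids(csv_text, id_column="bug.id"):
    """Return the bug ids from CSV text whose first row is a header.

    csv.DictReader consumes the header row itself and maps every later
    row to {column name: value}, so callers never index by position.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [int(row[id_column]) for row in reader]


bug_ids = read_bug_ids(ACTIVE_BUGS_CSV)
print(bug_ids)
```

Because rows are addressed by column name, the same helper would keep working on a file that appends extra columns, as deprecated-bugs.csv does for the deprecation version and reason.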