Implements #304: Adds deprecated-bugs.csv and refactors commit-db into active-bugs.csv #312

Merged on Jul 3, 2020

44 commits:
a4242ce
Creating initial deprecated-bugs.csv files and adding constants for c…
Greg4cr Mar 11, 2020
644e41f
Adding deprecated entries for Cli and Collections, updating README
Greg4cr Mar 11, 2020
06e2046
Bug mining creates empty deprecated-bugs.csv file
Greg4cr Mar 11, 2020
ee8c71b
Adds a header to the commit-db, and modifies framework to handle the …
Greg4cr Mar 11, 2020
1c440ce
Refactors commit-db into active-bugs.csv
Greg4cr Mar 11, 2020
c032413
Fixed a typo and improved README.
rjust Mar 11, 2020
fcd9c99
Adding constant for filenames, removing direct file references
Greg4cr Mar 12, 2020
77b201a
Renaming column names to match stlye and usage in other parts of D4J
Greg4cr Mar 12, 2020
1125cd5
Merge branch 'master' into bugs-csv
Greg4cr Mar 12, 2020
2b501c3
Refactoring test_export_command to use the get_bug_ids method
Greg4cr Mar 12, 2020
bad5058
Merge branch 'master' into bugs-csv
Greg4cr Mar 13, 2020
806f152
Minor tweaks and restoring commit-db files
Greg4cr Mar 13, 2020
c268f32
Minor tweaks and restoring commit-db files
Greg4cr Mar 13, 2020
215be62
Missing values in Chart CSV filled in
Greg4cr Mar 13, 2020
bf421ac
Merging in changes from master
Greg4cr Mar 16, 2020
9b26f7e
Fixing commit-db reference created in merge
Greg4cr Mar 16, 2020
dbb98c0
Cleaning documentation on Vcs.pm
Greg4cr Mar 16, 2020
f246d2b
Adding constant for dir-layout.csv
Greg4cr Mar 16, 2020
17d9623
Merge branch 'master' into bugs-csv
Greg4cr Mar 17, 2020
8ff5018
Adding WIP for d4j-query
Greg4cr Mar 17, 2020
899eca8
Adds d4j-query and test cases for it
Greg4cr Mar 18, 2020
afc6b73
Fixing a typo in d4j-query documentation
Greg4cr Mar 18, 2020
e35fdce
Fixing a typo in d4j-query documentation
Greg4cr Mar 18, 2020
e64f4b9
One more typo in d4j-query documentation
Greg4cr Mar 18, 2020
889bd08
Merge branch 'master' into bugs-csv
Greg4cr Apr 21, 2020
8b62fd8
Merge branch 'master' into bugs-csv
Greg4cr Apr 28, 2020
49973e0
Merge branch 'master' into bugs-csv
Greg4cr Apr 29, 2020
6554bcb
Adding README for query and updating field names to better match d4j-…
Greg4cr Apr 29, 2020
eec8eda
Adds d4j-bugs shortcut command
Greg4cr Apr 29, 2020
4a7e3cc
Utils correctly handles header in active-bugs.csv now
Greg4cr Apr 29, 2020
847f719
d4j-query now offers all information surfaced in d4j-info
Greg4cr Apr 30, 2020
7077628
Clarifying README
Greg4cr Apr 30, 2020
bf7e8f3
Clarifying README further
Greg4cr Apr 30, 2020
1f1fac8
Refactoring variable names to be consistent
Greg4cr Jun 29, 2020
3b261ee
Capitalizing flags without arguments, clarifying documentation, renam…
Greg4cr Jun 29, 2020
6a377f4
Renames bugs to print-bugs
Greg4cr Jun 29, 2020
ed041f5
Refactors d4j-query functionality into a separate API within Defects4…
Greg4cr Jun 29, 2020
f48f2e9
Merging master into bugs-csv
Greg4cr Jul 1, 2020
a626c14
Renaming print-bugs to bids following conversation with @rjust
Greg4cr Jul 1, 2020
ef3c486
Consistent terminology in README
Greg4cr Jul 1, 2020
e7e8b71
Fixing typo in README
Greg4cr Jul 1, 2020
0793afb
Minor documentation and field name adjustments
Greg4cr Jul 2, 2020
5e5b9a9
Adjustment to d4j-query test output
Greg4cr Jul 2, 2020
ac0fb25
Merging in master
Greg4cr Jul 2, 2020
62 changes: 57 additions & 5 deletions README.md
@@ -31,10 +31,18 @@ Defects4J contains 835 bugs from the following open-source projects:
| Time | joda-time | 26 | 1-20,22-27 | 21 |

\* Due to behavioral changes introduced under Java 8, some bugs are no longer
-reproducible. These bugs have been removed from the commit-db, but their
-metadata is still available in the project directory. As publications using
-Defects4J artifacts refer to bugs by their specific bug id, we do not re-number
-active bug ids of existing bugs.
+reproducible. Hence, Defects4J distinguishes between active and deprecated bugs:
+
+- Active bugs can be accessed through `active-bugs.csv`.
+
+- Deprecated bugs are removed from `active-bugs.csv`, but their metadata is
+retained in the project directory.
+
+- Deprecated bugs can be accessed through `deprecated-bugs.csv`, which also
+details when and why a bug was deprecated.
+
+We do not re-enumerate active bugs because publications using Defects4J artifacts
+usually refer to bugs by their specific bug id.
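
As a rough illustration of how either CSV file might be consumed, the sketch below lists the bug IDs stored in such a file. It assumes that the first line is a header and that the bug ID is the first column; this diff does not show the file contents, so treat the layout (and the helper name `list_bug_ids`) as hypothetical.

```shell
# List the bug IDs stored in a bugs CSV file (e.g. active-bugs.csv).
# Assumption: line 1 is a header and the bug ID is the first column.
list_bug_ids() {
    # NR > 1 skips the header; empty first fields are ignored.
    awk -F, 'NR > 1 && $1 != "" { print $1 }' "$1" | sort -n
}
```

Sorting numerically keeps the output stable even if rows are not stored in ID order.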

The bugs
---------------
@@ -137,8 +145,10 @@ Use [`framework/bin/defects4j`](http://defects4j.org/html_doc/defects4j.html) to
| [mutation](http://defects4j.org/html_doc/d4j/d4j-mutation.html) | Run mutation analysis on a buggy or a fixed project version |
| [coverage](http://defects4j.org/html_doc/d4j/d4j-coverage.html) | Run code coverage analysis on a buggy or a fixed project version |
| [monitor.test](http://defects4j.org/html_doc/d4j/d4j-monitor.test.html) | Monitor the class loader during the execution of a single test or a test suite |
-| [export](http://defects4j.org/html_doc/d4j/d4j-export.html) | Export version-specific properties such as classpaths, directories, or lists of tests |
+| [bids](http://defects4j.org/html_doc/d4j/d4j-bids.html) | Print the list of active or deprecated bug IDs for a specific project |
| [pids](http://defects4j.org/html_doc/d4j/d4j-pids.html) | Print a list of available project IDs |
+| [export](http://defects4j.org/html_doc/d4j/d4j-export.html) | Export version-specific properties such as classpaths, directories, or lists of tests |
+| [query](http://defects4j.org/html_doc/d4j/d4j-query.html) | Query the metadata to generate a CSV file of requested information for a specific project |

Export version-specific properties
----------------------------------
@@ -159,6 +169,48 @@ directory to export a version-specific property:
| tests.relevant | List of relevant test classes (a test class is relevant if, when executed, the JVM loads at least one of the modified classes) |
| tests.trigger | List of test methods that trigger (expose) the bug |

Export project-specific metadata
--------------------------------
Use `defects4j query -p <pid> [-q <field_list>] [-o <output_file>] [-D|-A]`
to generate a CSV file containing a set of requested metadata for each bug
in a specific project.

By default, `defects4j query` returns a list of active bug IDs for a project.
To request specific metadata, pass the `-q` flag a comma-separated list of
properties from the table below. For example,
`defects4j query -p Chart -q "report.id,report.url"` returns a list of
all active bug IDs, along with the bug report ID and bug report URL for each.
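
The output of such a query can be piped into standard text tools. The sketch below is one possible way to turn the rows from the example above into readable lines. It assumes each row has the form `bug.id,report.id,report.url` with no header line and no embedded commas; the helper name `format_report_urls` is hypothetical.

```shell
# Print "bug <id>: <url>" for each row of query output read from stdin.
# Expected row layout (assumption): bug.id,report.id,report.url
format_report_urls() {
    awk -F, 'NF >= 3 { printf "bug %s: %s\n", $1, $3 }'
}

# With Defects4J on the PATH, this could be used as:
#   defects4j query -p Chart -q "report.id,report.url" | format_report_urls
```

Rows with fewer than three fields are skipped rather than printed half-empty.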


| Property | Description |
|-----------------------|-------------------------------------------------------------------------------------|
| bug.id | Assigned bug IDs (included in all results) |
| project.id | Assigned project ID |
| project.name | Original project name |
| project.build.file | Location of the Defects4J build file for the project |
| project.vcs | Version control system used by the project |
| project.repository | Location of the project repository |
| project.bugs.csv | Location of the CSV containing information on that bug |
| revision.id.buggy | Commit hashes for the buggy version of each bug |
| revision.id.fixed | Commit hashes for the fixed version of each bug |
| revision.date.buggy | Date of the buggy commit for each bug |
| revision.date.fixed | Date of the fixed commit for each bug |
| report.id | Bug report ID from the issue tracker for each bug |
| report.url | Bug report URL from the issue tracker for each bug |
| classes.modified | Classes modified by the bug fix |
| classes.relevant.src | Source classes loaded by the JVM when executing all triggering tests |
| classes.relevant.test | Test classes loaded by the JVM when executing all triggering tests |
| tests.relevant | List of relevant test classes (a test class is relevant if, when executed, the JVM loads at least one of the modified classes) |
| tests.trigger | List of test methods that trigger (expose) the bug |
| tests.trigger.cause | List of test methods that trigger (expose) the bug, along with the root cause |
| deprecated.version | (for deprecated bugs only) Version of Defects4J where a bug was deprecated |
| deprecated.reason | (for deprecated bugs only) Reason for deprecation |

By default, `defects4j query` returns information on active bugs only. The `-D`
flag returns information on deprecated bugs only, while the `-A` flag returns
information on both active and deprecated bugs. The two flags are mutually
exclusive.
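
To illustrate how `-A` output might be separated again, the sketch below splits rows of the form `bug.id,deprecated.reason` into active and deprecated bug IDs. It relies on the `d4j-query` documentation's statement that active bugs report "NA" for the deprecation reason; the helper names are hypothetical.

```shell
# Read rows of the form bug.id,deprecated.reason from stdin.
# Active bugs are expected to carry the placeholder "NA" as the reason.
active_ids()     { awk -F, '$2 == "NA" { print $1 }'; }
deprecated_ids() { awk -F, '$2 != "NA" { print $1 }'; }

# With Defects4J on the PATH, this could be used as:
#   defects4j query -p Collections -q "deprecated.reason" -A | deprecated_ids
```

Only the second field is inspected, so a reason that itself contains commas is still classified as deprecated.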


Test execution framework
--------------------------
The test execution framework for generated test suites (`framework/bin`)
105 changes: 105 additions & 0 deletions framework/bin/d4j/d4j-bids
@@ -0,0 +1,105 @@
#-------------------------------------------------------------------------------
# Copyright (c) 2014-2019 René Just, Darioush Jalali, and Defects4J contributors.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#-------------------------------------------------------------------------------

=pod

=head1 NAME

d4j-bids -- Lists all bug IDs for a project

=head1 SYNOPSIS

d4j-bids -p pid [-D|-A]

=head1 DESCRIPTION

This script is a high-level shortcut that gives developers access
to the list of bug IDs for a chosen project. By default, the list of
active bug IDs is returned.

=head1 OPTIONS

=over 4

=item -p C<pid>

The ID of the project for which metadata is requested. A project ID
must be provided to use this utility.

=item -D

Include only deprecated bugs. By default, only active bugs are queried.
Cannot be used in conjunction with "all bugs" (-A).

=item -A

Include both active and deprecated bugs. By default, only active bugs are
queried. Cannot be used in conjunction with "only deprecated bugs" (-D).

=back

=cut

use strict;
use warnings;

use Constants;
use Getopt::Std;
use Query;

#
# Issue usage message and quit
#
sub _usage {
    print "usage: $0 -p pid [-D|-A]\n";
    exit 1;
}

# Process command line options
my %cmd_opts;
getopts('p:DA', \%cmd_opts) or _usage();

_usage() unless defined $cmd_opts{p};

my $PID = $cmd_opts{p};
my $ONLY_DEP = defined $cmd_opts{D} ? 1 : 0;
my $ALL_BUGS = defined $cmd_opts{A} ? 1 : 0;

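# -D and -A select different sets of bugs and are mutually exclusive.
if ($ONLY_DEP and $ALL_BUGS) {
    die "Only deprecated bugs (-D) and all bugs (-A) cannot be concurrently set.";
}

# Request only the bug ID column. The second argument selects the bug set:
# "D" = deprecated only, "A" = active and deprecated, "C" = active only (default).
my %results;
my @requested = ($BUGS_CSV_BUGID);
if ($ONLY_DEP) {
    %results = Query::query_metadata($PID, "D", @requested);
} elsif ($ALL_BUGS) {
    %results = Query::query_metadata($PID, "A", @requested);
} else {
    %results = Query::query_metadata($PID, "C", @requested);
}

# Print the bug IDs in ascending numeric order, one per line.
foreach my $bug_id (sort { $a <=> $b } keys %results) {
    print "$bug_id\n";
}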

1;
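
Because the script prints one bug ID per line, its output lends itself to shell loops. The sketch below reads bug IDs from stdin and prints, without executing, a `defects4j checkout` command for the buggy version of each bug. The working-directory layout and the helper name `print_checkout_cmds` are assumptions made for illustration.

```shell
# Read bug IDs (one per line) from stdin and print the checkout command
# for the buggy version of each bug. Nothing is executed.
# $1 = project ID, $2 = base directory for working copies (hypothetical layout).
print_checkout_cmds() {
    pid=$1
    base=$2
    while IFS= read -r bid; do
        # Skip empty lines.
        [ -n "$bid" ] || continue
        printf 'defects4j checkout -p %s -v %sb -w %s/%s_%s_buggy\n' \
            "$pid" "$bid" "$base" "$pid" "$bid"
    done
}

# With Defects4J on the PATH, this could be used as:
#   defects4j bids -p Lang | print_checkout_cmds Lang /tmp
```

Printing the commands first makes it easy to review them before piping the result to `sh`.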
181 changes: 181 additions & 0 deletions framework/bin/d4j/d4j-query
@@ -0,0 +1,181 @@
#-------------------------------------------------------------------------------
# Copyright (c) 2014-2019 René Just, Darioush Jalali, and Defects4J contributors.
#
# Permission is hereby granted, free of charge, to any person obtaining a copy
# of this software and associated documentation files (the "Software"), to deal
# in the Software without restriction, including without limitation the rights
# to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
# copies of the Software, and to permit persons to whom the Software is
# furnished to do so, subject to the following conditions:
#
# The above copyright notice and this permission notice shall be included in
# all copies or substantial portions of the Software.
#
# THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
# IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
# FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
# AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
# LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
# OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
# THE SOFTWARE.
#-------------------------------------------------------------------------------

=pod

=head1 NAME

d4j-query -- query the metadata for a project to obtain CSV-formatted results.

=head1 SYNOPSIS

d4j-query -p pid [-q query] [-o output_file] [-H] [-D|-A]

=head1 DESCRIPTION

This script is intended to query the metadata for a project to obtain
information that would be useful for automation or analysis of the framework.
A user-provided list of columns determines what data is returned. By default,
the list of available active bug IDs is returned.

=head1 OPTIONS

=over 4

=item -p C<pid>

The ID of the project for which metadata is requested. A project ID
must be provided to use this utility.

=item -q C<query>

A comma-separated list of fields, enclosed in quotation marks. For example,
C<-q "bug.id,report.id"> returns the list of bug IDs and issue tracker IDs
for the requested project.

=item -o C<output_file>

The file to which the CSV output is written. By default, the output is
printed to standard output.

=item -H

List the available fields.

=item -D

Include only deprecated bugs. By default, only active bugs are queried.
Cannot be used in conjunction with "all bugs" (-A).

=item -A

Include both active and deprecated bugs. By default, only active bugs are
queried. Cannot be used in conjunction with "only deprecated bugs" (-D).

=back

=head1 EXAMPLES

=over 4

=item C<d4j-query -p Collections>

Returns the list of active bug IDs for project Collections.

=item C<d4j-query -p Collections -H>

Returns the list of available fields that can be queried.

=item C<d4j-query -p Collections -q "revision.id.buggy,classes.modified">

Returns all active bug IDs, and for each, the revision hash of the buggy
version and the list of modified classes.

=item C<d4j-query -p Collections -q "revision.id.buggy,classes.modified" -D>

Returns all deprecated bug IDs, and for each, the revision hash of the buggy
version and the list of modified classes.

=item C<d4j-query -p Collections -q "revision.id.buggy,classes.modified" -A>

Returns all bug IDs (active and deprecated), and for each, the revision hash
of the buggy version and the list of modified classes.

=item C<d4j-query -p Collections -q "deprecated.reason" -A>

Returns all bug IDs (active and deprecated) along with the reason for
deprecation. For active bugs, the deprecation reason will be "NA", as
those bugs do not have values for that field.

=back

=cut

use strict;
use warnings;

use Constants;
use Getopt::Std;
use Query;

#
# Issue usage message and quit
#
sub _usage {
    print "usage: $0 -p project_id [-q query] [-o output_file] [-H] [-D|-A]\n";
    exit 1;
}

# Process command line options
my %cmd_opts;
getopts('p:q:o:HDA', \%cmd_opts) or _usage();

_usage() unless defined $cmd_opts{p};

my $PID = $cmd_opts{p};
my $QUERY = defined $cmd_opts{q} ? $cmd_opts{q} : $BUGS_CSV_BUGID;
my @requested = split /,/, $QUERY or die "Unable to parse query: $QUERY";
my $OUTPUT_FILE = defined $cmd_opts{o} ? $cmd_opts{o} : "none";
my $ONLY_DEP = defined $cmd_opts{D} ? 1 : 0;
my $ALL_BUGS = defined $cmd_opts{A} ? 1 : 0;

if (defined $cmd_opts{H}) {
    my $joined_fields = join(", ", Query::get_fields());
    print "Available fields: $joined_fields\n";
    exit 1;
}

if ($ONLY_DEP and $ALL_BUGS) {
    die "Only deprecated bugs (-D) and all bugs (-A) cannot be concurrently set.";
}

my %results;
if ($ONLY_DEP) {
    %results = Query::query_metadata($PID, "D", @requested);
} elsif ($ALL_BUGS) {
    %results = Query::query_metadata($PID, "A", @requested);
} else {
    %results = Query::query_metadata($PID, "C", @requested);
}

# Print the results in CSV format

my $file;

if ($OUTPUT_FILE ne "none") {
    open($file, '>', $OUTPUT_FILE) or die "Could not open file '$OUTPUT_FILE' $!";
}

foreach my $bug_id (sort { $a <=> $b } keys %results) {
    my $output = $bug_id;
    foreach my $field (@requested) {
        if ($field ne $BUGS_CSV_BUGID) {
            $output .= "," . $results{$bug_id}{$field};
        }
    }
    if ($OUTPUT_FILE eq "none") {
        print "$output\n";
    } else {
        print $file "$output\n";
    }
}

if ($OUTPUT_FILE ne "none") {
    close $file;
}

1;
18 changes: 18 additions & 0 deletions framework/bin/defects4j
@@ -138,6 +138,24 @@ prints a list of all available project ids.
=cut
$cmd_descr{export}="export a version-specific property";

=pod

=item * L<B<query>|d4j::d4j-query/>

query the metadata for a particular project for automation purposes.

=cut
$cmd_descr{query}="query the metadata for a particular project for automation purposes";

=pod

=item * L<B<bids>|d4j::d4j-bids/>

print the list of active or deprecated bug IDs for a specific project.

=cut
$cmd_descr{bids}="print the list of active or deprecated bug IDs for a specific project";

#
# Issue usage message and quit
#