Prefixfix #32

Open
wants to merge 3 commits into
base: missingfix
5 changes: 3 additions & 2 deletions fastdupes.py
@@ -70,6 +70,7 @@

 if sys.version_info.major >= 3:
     basestring = str  # pylint: disable=redefined-builtin,invalid-name
+    raw_input = input  # pylint: disable=redefined-builtin,invalid-name

 def multiglob_compile(globs, prefix=False):
     """Generate a single "A or B or C" regex from a list of shell globs.
@@ -127,7 +128,7 @@ def hashFile(handle, want_hex=False, limit=None, chunk_size=CHUNK_SIZE):
     for block in iter(lambda: handle.read(chunk_size), b''):
         fhash.update(block)
         read += chunk_size
-        if 0 < limit <= read:
+        if limit and limit <= read:
Owner

Would you mind explaining why you made this no longer ensure that limit is positive?
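
For reference, a minimal sketch of how the two conditions differ. It assumes Python 3, where ordering `None` against an integer raises `TypeError`. The helper names `old_check`, `new_check`, and `combined_check` are hypothetical and exist only for this illustration; `combined_check` is one possible way to keep both properties, not something proposed in this PR.

```python
def old_check(limit, read):
    """Condition before this PR: limit must be positive and already reached."""
    return 0 < limit <= read


def new_check(limit, read):
    """Condition in this PR: any truthy limit counts, including negative ones."""
    return bool(limit and limit <= read)


def combined_check(limit, read):
    """Hypothetical variant: tolerate None while still requiring a positive limit."""
    return limit is not None and 0 < limit <= read


read = 4096
for limit in (None, 0, -1, 1024, 8192):
    try:
        old = old_check(limit, read)
    except TypeError:
        # Python 3 refuses to order None against an int.
        old = "TypeError"
    print(limit, old, new_check(limit, read), combined_check(limit, read))
```

With a negative `limit`, the old condition never triggers the `break`, while the new one triggers it after the first chunk.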

             break

     if should_close:
@@ -331,7 +332,7 @@ def sizeClassifier(path, min_size=DEFAULTS['min_size']):
     return filestat.st_size

 @groupify
-def hashClassifier(path, limit=HEAD_SIZE):
+def hashClassifier(path, limit=None):
     """Sort a file into a group based on its SHA1 hash.

     :param paths: See :func:`fastdupes.groupify`
12 changes: 12 additions & 0 deletions test_fastdupes.py
@@ -0,0 +1,12 @@
+#execute with pytest
Owner

@ssokolow ssokolow Aug 20, 2017

It's a bit rude to just unilaterally decide that the tests will depend on pytest-specific features. (And, in fact, had you submitted this a year ago, I'd have asked you to rewrite it to be compatible with Nose.)

That said, the API presented is quite appealing and I was already considering a move to pytest so I'm not going to rewrite it to the "Python stdlib only" subset I often aim for.


+
+from fastdupes import find_dupes
+
+def test_common_prefix(tmpdir):
+    files = tmpdir.mkdir("files")
+    file1 = files.join("file1")
+    file2 = files.join("file2")
+    file1.write("0"*1000000 + "1")
+    file2.write("0"*1000000 + "2")
Owner

It'd probably be better to either assert or use max() to ensure that this test will always use a value greater than HEAD_SIZE.
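
A rough sketch of the `max()` variant, for illustration only. The `HEAD_SIZE` value below is a placeholder so the snippet runs on its own; the real test would import the constant from `fastdupes`. The helper `make_contents` is a hypothetical name, not part of this PR.

```python
# Placeholder value so the sketch is self-contained; the actual constant
# is defined in fastdupes and may differ.
HEAD_SIZE = 16 * 1024


def make_contents(head_size, minimum=1000000):
    """Return two strings that share a prefix longer than head_size
    and differ only in their final character."""
    prefix_len = max(minimum, head_size + 1)
    prefix = "0" * prefix_len
    return prefix + "1", prefix + "2"


contents1, contents2 = make_contents(HEAD_SIZE)

# The shared prefix covers everything a HEAD_SIZE-limited hash would read,
# so the files can only be told apart by hashing past that point.
assert len(contents1) > HEAD_SIZE
assert contents1[:HEAD_SIZE] == contents2[:HEAD_SIZE]
assert contents1 != contents2
```

In the test itself, the two strings would be passed to `file1.write(...)` and `file2.write(...)` in place of the hard-coded `"0"*1000000` literals.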

+    groups = find_dupes([str(files)])
+    assert len(groups) == 0