Commit

minor fix according to suggestions by zhijian and yilun
yxdyc committed Nov 15, 2023
1 parent f355d00 commit b803ae3
Showing 2 changed files with 1 addition and 3 deletions.
2 changes: 1 addition & 1 deletion configs/config_all.yaml
@@ -122,7 +122,7 @@ process:
any_or_all: any # keep this sample when any/all images meet the filter condition
- image_size_filter: # filter samples according to the size of images (in bytes) within them
min_size: "0" # the min size of filter range
-    max_ratio: "1TB" # the max size of filter range
+    max_size: "1TB" # the max size of filter range
any_or_all: any # keep this sample when any/all images meet the filter condition
- language_id_score_filter: # filter text in specific language with language scores larger than a specific max value
lang: en # keep text in what language
2 changes: 0 additions & 2 deletions data_juicer/ops/filter/image_size_filter.py
@@ -4,11 +4,9 @@
from data_juicer.utils.mm_utils import get_image_size, size_to_bytes

from ..base_op import OPERATORS, Filter
-from ..op_fusion import LOADED_IMAGES


@OPERATORS.register_module('image_size_filter')
-@LOADED_IMAGES.register_module('image_size_filter')
class ImageSizeFilter(Filter):
"""Keep data samples whose image size (in bytes/kb/MB/...) within a
specific range.
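
The `min_size` / `max_size` / `any_or_all` options in the config hunk above can be illustrated with a small standalone sketch. This is not the repository's implementation: `parse_size` and `keep_sample` are hypothetical names, the 1024-based units are an assumption (the real `size_to_bytes` helper may behave differently), and keeping samples that contain no images is also an assumption.

```python
import re

# Hypothetical unit table; 1024-based multipliers are an assumption.
_UNITS = {"B": 1, "KB": 1024, "MB": 1024**2, "GB": 1024**3, "TB": 1024**4}


def parse_size(text):
    """Convert a size string such as "0" or "1TB" to a byte count."""
    match = re.fullmatch(r"\s*(\d+(?:\.\d+)?)\s*([A-Za-z]*)\s*", text)
    if match is None:
        raise ValueError(f"unrecognized size: {text!r}")
    number, unit = match.groups()
    unit = unit.upper() or "B"  # a bare number is read as bytes
    if unit not in _UNITS:
        raise ValueError(f"unknown unit in size: {text!r}")
    return int(float(number) * _UNITS[unit])


def keep_sample(image_sizes, min_size="0", max_size="1TB", any_or_all="any"):
    """Decide whether a sample is kept, given the byte sizes of its images."""
    lo, hi = parse_size(min_size), parse_size(max_size)
    in_range = [lo <= size <= hi for size in image_sizes]
    if not in_range:
        return True  # assumption: a sample without images is kept
    # "any": one image in range is enough; "all": every image must be in range.
    return any(in_range) if any_or_all == "any" else all(in_range)
```

With this sketch, a misspelled key such as `max_ratio` would simply never reach `keep_sample` as `max_size`, so the default upper bound would apply silently; that is the kind of mismatch the config change corrects.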
