Allow GeoDataset to list files in VSI path(s) #1399

Open · wants to merge 70 commits into base: main
Changes from 48 commits (70 commits total)
72702ef
Make RasterDataset accept list of files
Jun 22, 2023
b68dd5a
Fix check if str
adriantre Jun 22, 2023
e90d01c
Use isdir and isfile
adriantre Jun 23, 2023
dfad079
Add kwarg vsi to RasterDataset to support GDAL VSI
adriantre Jun 5, 2023
7291f3e
Fix formatting
adriantre Jun 5, 2023
6b41f18
Add type hints and docstring to method in utils
adriantre Jun 5, 2023
2b2be02
Fix missing import List
adriantre Jun 5, 2023
3f91e97
Fix type hints
adriantre Jun 5, 2023
f0d9475
Refactor with respect to other branch
adriantre Jun 22, 2023
7ca3cb7
Make try-catch more targeted
adriantre Jun 23, 2023
d519061
Remove redundant iglob usage
adriantre Jun 23, 2023
a841fb7
Merge main
adriantre Aug 6, 2024
2ee3c85
Remove unused imports
adriantre Aug 6, 2024
5c6e444
Add wildcard for directories
adriantre Aug 8, 2024
2da9c4a
Allow vsi files to not exist
adriantre Aug 8, 2024
e79061f
Make protected method public
adriantre Aug 8, 2024
00594de
Add zipped dataset to test vsi listdir
adriantre Aug 8, 2024
9ef7669
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 8, 2024
8a61108
Update fiona version in min-reqs.old
adriantre Aug 8, 2024
e8367ca
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 12, 2024
014e5f5
Update docstring of list_directory_recursive
adriantre Aug 12, 2024
0bf5d68
Set fiona version in min-reqs.old
adriantre Aug 12, 2024
41e3dfb
Remove redundant path exists
adriantre Aug 12, 2024
56b1c76
Remove duplicated import
adriantre Aug 12, 2024
e96cfa9
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 12, 2024
d03120d
Create temp archive for test
adriantre Aug 12, 2024
dfc571e
Add docstring to listdir_vsi_recursive
adriantre Aug 12, 2024
aaea729
Change list_directory_recursive return type
adriantre Aug 12, 2024
740dcec
Bump fiona version in pyproject.toml
adriantre Aug 12, 2024
ba014af
Introduce fixture temp_archive for reuse in tests
adriantre Aug 12, 2024
e00b356
Make GeoDataset.files warn if VSI does not exist
adriantre Aug 12, 2024
65ef978
Replace failing test due to new behaviour of vsi
adriantre Aug 12, 2024
5192acd
Fix breaking test due to os.path.join not working on zip parent dir (…
adriantre Aug 12, 2024
9f800f1
Collect test on zip-archive in test class
adriantre Aug 12, 2024
93de375
Update versionadded in dataset/utils.py
adriantre Aug 12, 2024
37c4980
Remove patch from fiona version
adriantre Aug 12, 2024
395b4bf
Add .DS_Store to gitignore
adriantre Aug 13, 2024
6f041ee
Add test for https/curl files
adriantre Aug 13, 2024
dc334d5
Check filname_glob before adding to files property
adriantre Aug 13, 2024
0b80c12
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 13, 2024
dbbcec7
Define should_warn outside if
adriantre Aug 13, 2024
0cc7e15
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 14, 2024
c990591
Merge branch 'refs/heads/main' into feature/support_gdal_virtual_file…
adriantre Aug 26, 2024
ccdbe9b
Try to support windows for vsi tests
adriantre Aug 26, 2024
a2dd0d6
Revert windows-specific path format
adriantre Aug 26, 2024
ecce011
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 26, 2024
08fde98
Skip TestVirtualFilesystems if platform is windows
adriantre Aug 27, 2024
26920ae
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 27, 2024
3baa587
Remove user-specific ignore from gitignore
adriantre Aug 27, 2024
be142aa
Properly remove changes from .gitignore
adriantre Aug 27, 2024
e7659a9
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 27, 2024
c88111e
Rename VSI to VFS where appropriate
adriantre Aug 28, 2024
cffd707
Format docstring of listdir_vfs_recursive
adriantre Aug 28, 2024
04740a2
Update torchgeo/datasets/utils.py
adriantre Aug 28, 2024
51a3d64
Update torchgeo/datasets/utils.py
adriantre Aug 28, 2024
7b33550
Don't use os.path.join within VFS
adriantre Aug 28, 2024
b81591d
Update torchgeo/datasets/utils.py
adriantre Aug 28, 2024
c786a57
String format wildcard instead of os.path.join
adriantre Aug 28, 2024
bb3ad78
Document raises
adriantre Aug 28, 2024
0824026
Dont use os.path.join for zip test
adriantre Aug 28, 2024
6a3d128
Update type of error in try except
adriantre Aug 28, 2024
8ea368f
Merge branch 'main' into feature/support_gdal_virtual_file_systems
adriantre Aug 28, 2024
0ad5db2
Simplify tests
adriantre Aug 28, 2024
87ea802
Simplify files property
adriantre Aug 28, 2024
94c3835
Remove unnecessary check in if
adriantre Aug 28, 2024
3ddee3a
Merge branch 'refs/heads/main' into feature/support_gdal_virtual_file…
adriantre Sep 6, 2024
46e2375
Temp archive into tmp_path
adriantre Sep 6, 2024
2d54680
Make utility funcitons non-public
adriantre Sep 6, 2024
d1d8a4a
Fix typo in comment
adriantre Sep 6, 2024
d22a594
Move VFS tests to TestGeoDataset
adriantre Sep 6, 2024
3 changes: 3 additions & 0 deletions .gitignore
@@ -149,3 +149,6 @@ dmypy.json

# Pyre type checker
.pyre/

# Mac
.DS_Store
3 changes: 2 additions & 1 deletion pyproject.toml
@@ -39,7 +39,8 @@ dependencies = [
# einops 0.3+ required for einops.repeat
"einops>=0.3",
# fiona 1.8.21+ required for Python 3.10 wheels
"fiona>=1.8.21",
# fiona 1.9+ required for fiona.listdir
"fiona>=1.9",
# kornia 0.7.3+ required for instance segmentation support in AugmentationSequential
"kornia>=0.7.3",
# lightly 1.4.5+ required for LARS optimizer
2 changes: 1 addition & 1 deletion requirements/min-reqs.old
@@ -3,7 +3,7 @@ setuptools==61.0.0

# install
einops==0.3.0
fiona==1.8.21
fiona==1.9.0
kornia==0.7.3
lightly==1.4.5
lightning[pytorch-extra]==2.0.0
177 changes: 158 additions & 19 deletions tests/datasets/test_geo.py
@@ -1,11 +1,11 @@
# Copyright (c) Microsoft Corporation. All rights reserved.
# Licensed under the MIT License.

import math
import os
import pickle
import shutil
import sys
from collections.abc import Iterable
from collections.abc import Generator, Iterable
from pathlib import Path
from typing import Any

@@ -30,6 +30,7 @@
UnionDataset,
VectorDataset,
)
from torchgeo.datasets.utils import list_directory_recursive


class CustomGeoDataset(GeoDataset):
@@ -84,6 +85,19 @@ def __len__(self) -> int:
return 2


@pytest.fixture(scope='module')
def temp_archive(request: SubRequest) -> Generator[tuple[str, str], None, None]:
Collaborator:

We should probably be using the tmp_path fixture here instead of creating the archive in a git-tracked location. This will require the default function scope though.

Contributor Author:

I made a module-scoped fixture instead, but I can opt for class-scoped if it will only be used for TestGeoDataset.

# Runs before tests
dir_not_zipped = request.param
dir_zipped = f'{dir_not_zipped}.zip'
_ = shutil.make_archive(dir_not_zipped, 'zip', dir_not_zipped)
# make_archive returns absolute path, while input may be relative path.
# we opt to return the (relative) path as provided.
yield dir_not_zipped, dir_zipped
# Runs after tests
os.remove(dir_zipped)
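As the fixture's comment notes, shutil.make_archive resolves a relative base name to an absolute path when root_dir is given, which is why the fixture yields the caller-provided path rather than the return value. A minimal stdlib sketch (standalone illustration, not part of the PR):

```python
import os
import shutil
import tempfile

tmp = tempfile.mkdtemp()
src = os.path.join(tmp, 'data')
os.makedirs(src)
with open(os.path.join(src, 'file.txt'), 'w') as f:
    f.write('hello')

cwd = os.getcwd()
os.chdir(tmp)
try:
    # Relative base name, but because root_dir is given, make_archive
    # resolves it against the current working directory before archiving.
    archive = shutil.make_archive('data', 'zip', 'data')
finally:
    os.chdir(cwd)

assert os.path.isabs(archive)
assert archive.endswith('data.zip')
```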


class TestGeoDataset:
@pytest.fixture(scope='class')
def dataset(self) -> GeoDataset:
@@ -178,29 +192,24 @@ def test_files_property_for_non_existing_file_or_dir(self, tmp_path: Path) -> No
with pytest.warns(UserWarning, match='Path was ignored.'):
assert len(CustomGeoDataset(paths=paths).files) == 0

def test_files_property_for_virtual_files(self) -> None:
# Tests only a subset of schemes and combinations.
paths = [
'file://directory/file.tif',
'zip://archive.zip!folder/file.tif',
'az://azure_bucket/prefix/file.tif',
'/vsiaz/azure_bucket/prefix/file.tif',
'zip+az://azure_bucket/prefix/archive.zip!folder_in_archive/file.tif',
'/vsizip//vsiaz/azure_bucket/prefix/archive.zip/folder_in_archive/file.tif',
]
assert len(CustomGeoDataset(paths=paths).files) == len(paths)

def test_files_property_ordered(self) -> None:
def test_files_property_ordered(self, tmp_path: Path) -> None:
"""Ensure that the list of files is ordered."""
paths = ['file://file3.tif', 'file://file1.tif', 'file://file2.tif']

files = ['file3.tif', 'file1.tif', 'file2.tif']
paths = [tmp_path / fake_file for fake_file in files]
for fake_file in paths:
fake_file.touch()
assert CustomGeoDataset(paths=paths).files == sorted(paths)

def test_files_property_deterministic(self) -> None:
def test_files_property_deterministic(self, tmp_path: Path) -> None:
"""Ensure that the list of files is consistent regardless of their original
order.
"""
paths1 = ['file://file3.tif', 'file://file1.tif', 'file://file2.tif']
paths2 = ['file://file2.tif', 'file://file3.tif', 'file://file1.tif']
files = ['file3.tif', 'file1.tif', 'file2.tif']
paths1 = [tmp_path / fake_file for fake_file in files]
paths2 = paths1[::-1] # reverse order
for fake_file in paths1:
fake_file.touch()
assert (
CustomGeoDataset(paths=paths1).files == CustomGeoDataset(paths=paths2).files
)
@@ -368,6 +377,136 @@ def test_no_all_bands(self) -> None:
CustomSentinelDataset(root, bands=bands, transforms=transforms, cache=cache)


@pytest.mark.skipif(
sys.platform.startswith('win'),
reason='Have not been able to work this out on windows. '
'Try again when fiona version 1.10 arrives.',
)
class TestVirtualFilesystems:
@pytest.mark.parametrize(
'temp_archive', [os.path.join('tests', 'data', 'vector')], indirect=True
)
def test_zipped_file(self, temp_archive: tuple[str, str]) -> None:
_, dir_zipped = temp_archive
filename = 'vector_2024.geojson'

# Using '!' as separator between archive and file as per Apache Commons VFS
# https://rasterio.readthedocs.io/en/latest/topics/vsi.html
specific_file_zipped = f'{dir_zipped}!{filename}'

files_found = CustomGeoDataset(paths=f'zip://{specific_file_zipped}').files
assert len(files_found) == 1
file = str(files_found[0])
assert file.endswith(filename)

@pytest.mark.parametrize(
'temp_archive',
[
os.path.join(
'tests',
'data',
'sentinel2',
'S2A_MSIL2A_20220414T110751_N0400_R108_T26EMU_20220414T165533.SAFE',
)
],
indirect=True,
)
def test_zipped_specific_file_dir(self, temp_archive: tuple[str, str]) -> None:
Collaborator:

To simplify the testing drastically, I think I would rather:

  1. Create all uncompressed (.tif) and compressed (.zip) files in tests/data/<whatever> and store them in git so we don't have to generate and delete them on the fly
  2. Have only 2 test functions (expected to pass, expected to fail)
  3. Parametrize each of these test functions with all supported inputs (.tif, zipped file, zipped dir, etc.)

This will remove a lot of the code duplication and make things easier to add new tests for (just a single line in parametrize). What do you think?

Contributor Author:

I'll take a look! I like the idea.

Might be annoying to keep the archives up-to-date with the source if we ever modify them though. The tests do not seem to be much slower when archiving on the fly, but it would reduce code complexity for the tests

dir_not_zipped, dir_zipped = temp_archive

# Listing, instead of hardcoding filepath
filename_glob = Sentinel2.filename_glob.format('B02')
specific_file_not_zipped = list_directory_recursive(
dir_not_zipped, filename_glob
)[0]
filepath_within_dir = specific_file_not_zipped.replace(dir_not_zipped, '')

files_found = CustomGeoDataset(
paths=f'zip://{dir_zipped}!{filepath_within_dir}'
).files
assert len(files_found) == 1
file = str(files_found[0])
assert file.endswith(filepath_within_dir)

@pytest.mark.parametrize(
'temp_archive',
[
os.path.join(
'tests',
'data',
'sentinel2',
'S2A_MSIL2A_20220414T110751_N0400_R108_T26EMU_20220414T165533.SAFE',
)
],
indirect=True,
)
def test_zipped_path_non_existing_file(self, temp_archive: tuple[str, str]) -> None:
dir_not_zipped, dir_zipped = temp_archive
file_zipped_not_existing = os.path.join(dir_zipped, 'non_existing_file.tif')
with pytest.warns(UserWarning, match='Path was ignored.'):
assert (
len(CustomGeoDataset(paths=f'zip://{file_zipped_not_existing}').files)
== 0
)

@pytest.mark.parametrize(
'temp_archive',
[
os.path.join(
'tests',
'data',
'sentinel2',
'S2A_MSIL2A_20220414T110751_N0400_R108_T26EMU_20220414T165533.SAFE',
)
],
indirect=True,
)
def test_zipped_directory(self, temp_archive: tuple[str, str]) -> None:
dir_not_zipped, dir_zipped = temp_archive
bands = Sentinel2.rgb_bands
transforms = nn.Identity()
cache = False

files_not_zipped = Sentinel2(
paths=dir_not_zipped, bands=bands, transforms=transforms, cache=cache
).files

files_zipped = Sentinel2(
paths=f'zip://{dir_zipped}', bands=bands, transforms=transforms, cache=cache
).files

basenames_not_zipped = [Path(path).stem for path in files_not_zipped]
basenames_zipped = [Path(path).stem for path in files_zipped]
assert basenames_zipped == basenames_not_zipped

def test_http_files(self) -> None:
url = 'https://github.com/microsoft/torchgeo/raw/main'
dir = os.path.join(
'tests',
'data',
'sentinel2',
'S2A_MSIL2A_20220414T110751_N0400_R108_T26EMU_20220414T165533.SAFE',
'GRANULE',
'L2A_T26EMU_A035569_20220414T110747',
'IMG_DATA',
'R10m',
)
files = [
'T26EMU_20220414T110751_B04_10m.jp2',
'T26EMU_20190414T110751_B03_10m.jp2',
]
paths = [os.path.join(url, dir, file) for file in files]
bands = Sentinel2.rgb_bands
transforms = nn.Identity()
cache = False

files_found = Sentinel2(
paths=paths, bands=bands, transforms=transforms, cache=cache
).files

assert len(files_found) == 1
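Under the hood, listing a 'zip://' path means enumerating archive members, much as fiona.listdir does for a '/vsizip/' directory. A rough stdlib analogue (illustration only, not the PR's implementation):

```python
import os
import shutil
import tempfile
import zipfile

# Build a small archive containing raster/file1.tif and raster/file2.tif.
tmp = tempfile.mkdtemp()
raster_dir = os.path.join(tmp, 'data', 'raster')
os.makedirs(raster_dir)
for name in ('file1.tif', 'file2.tif'):
    open(os.path.join(raster_dir, name), 'w').close()

archive = shutil.make_archive(
    os.path.join(tmp, 'data'), 'zip', os.path.join(tmp, 'data')
)

# Enumerate archive members, skipping directory entries, roughly what
# listing a '/vsizip/...' path yields.
with zipfile.ZipFile(archive) as zf:
    members = [n for n in zf.namelist() if not n.endswith('/')]

assert sorted(members) == ['raster/file1.tif', 'raster/file2.tif']
```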


class TestVectorDataset:
@pytest.fixture(scope='class')
def dataset(self) -> CustomVectorDataset:
21 changes: 11 additions & 10 deletions torchgeo/datasets/geo.py
@@ -6,7 +6,6 @@
import abc
import fnmatch
import functools
import glob
import os
import pathlib
import re
@@ -40,8 +39,8 @@
array_to_tensor,
concat_samples,
disambiguate_timestamp,
list_directory_recursive,
merge_samples,
path_is_vsi,
)


@@ -308,14 +307,16 @@ def files(self) -> list[Path]:
# Using set to remove any duplicates if directories are overlapping
files: set[Path] = set()
for path in paths:
if os.path.isdir(path):
pathname = os.path.join(path, '**', self.filename_glob)
files |= set(glob.iglob(pathname, recursive=True))
elif (os.path.isfile(path) or path_is_vsi(path)) and fnmatch.fnmatch(
str(path), f'*{self.filename_glob}'
):
files.add(path)
elif not hasattr(self, 'download'):
file_not_found = True
if os.path.isfile(path):
if fnmatch.fnmatch(str(path), os.path.join('*', self.filename_glob)):
files.add(path)
file_not_found = False
elif files_found := set(list_directory_recursive(path, self.filename_glob)):
files |= files_found
file_not_found = False

if file_not_found and not hasattr(self, 'download'):
warnings.warn(
f"Could not find any relevant files for provided path '{path}'. "
f'Path was ignored.',
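Read together, the reworked property amounts to: keep an existing file if it matches the glob, otherwise list the path recursively, and warn when a path yields nothing. A hypothetical standalone sketch of the local-filesystem branch only (names invented for illustration; the PR's version also routes VFS paths through fiona):

```python
import fnmatch
import glob
import os
import tempfile
import warnings

def resolve_files(paths: list[str], filename_glob: str) -> list[str]:
    """Sketch of the files-property logic for local paths only."""
    files: set[str] = set()
    for path in paths:
        file_not_found = True
        if os.path.isfile(path):
            # Prefix the glob with a wildcard so it matches at any parent dir.
            if fnmatch.fnmatch(str(path), os.path.join('*', filename_glob)):
                files.add(str(path))
                file_not_found = False
        else:
            pathname = os.path.join(str(path), '**', filename_glob)
            found = set(glob.iglob(pathname, recursive=True))
            if found:
                files |= found
                file_not_found = False
        if file_not_found:
            warnings.warn(
                f"Could not find any relevant files for provided path '{path}'. "
                'Path was ignored.',
                UserWarning,
            )
    return sorted(files)

tmp = tempfile.mkdtemp()
open(os.path.join(tmp, 'a.tif'), 'w').close()
open(os.path.join(tmp, 'b.txt'), 'w').close()
assert resolve_files([tmp], '*.tif') == [os.path.join(tmp, 'a.tif')]
```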
72 changes: 72 additions & 0 deletions torchgeo/datasets/utils.py
@@ -8,6 +8,8 @@

import collections
import contextlib
import fnmatch
import glob
import importlib
import os
import pathlib
@@ -19,9 +21,11 @@
from datetime import datetime, timedelta
from typing import Any, TypeAlias, cast, overload

import fiona
import numpy as np
import rasterio
import torch
from fiona.errors import FionaValueError
from torch import Tensor
from torchvision.datasets.utils import (
check_integrity,
@@ -631,6 +635,74 @@ def path_is_vsi(path: Path) -> bool:
return '://' in str(path) or str(path).startswith('/vsi')
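The check above is a plain string test; copied standalone (same one-liner as in the diff), it behaves like this on the schemes exercised in the tests:

```python
def path_is_vsi(path) -> bool:
    """True for GDAL VSI paths such as 'zip://...' or '/vsizip/...'."""
    return '://' in str(path) or str(path).startswith('/vsi')

assert path_is_vsi('zip://archive.zip!folder/file.tif')
assert path_is_vsi('az://azure_bucket/prefix/file.tif')
assert path_is_vsi('/vsiaz/azure_bucket/prefix/file.tif')
assert path_is_vsi('/vsizip//vsiaz/azure_bucket/prefix/archive.zip/file.tif')
assert not path_is_vsi('/home/user/file.tif')
assert not path_is_vsi('file.tif')
```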


def listdir_vsi_recursive(root: Path) -> list[str]:
"""Lists all files in Virtual File Systems (VSI) recursively.

Args:
root: directory to list. It must contain the VSI prefix,
e.g. '/vsiaz/' or 'az://' for Azure blob storage,
'/vsizip/' or 'zip://' for zipped archives

Returns:
A list of all file paths in the root VSI directory or its
subdirectories.

.. versionadded:: 0.6
"""
dirs = [str(root)]
files = []
while dirs:
dir = dirs.pop()
try:
subdirs = fiona.listdir(dir)
dirs.extend([os.path.join(dir, subdir) for subdir in subdirs])
except FionaValueError as e:
# fiona.listdir can throw FionaValueError for only two reasons:
# 1. 'is not a directory', in which case we assume dir is a file
# 2. 'does not exist', which we re-raise
if 'is not a directory' in str(e):
files.append(dir)
else:
raise
Collaborator:

We should be more specific about what we raise here. Is this try-except really the only way to do this? Seems dirty.

Contributor Author:

Would checking the error message and raising our own error be better?

Here is the error that might occur:
https://github.com/Toblerity/Fiona/blob/7490643d145ce0be506a210abd2802fd0fff63f4/fiona/ogrext.pyx#L2171

Contributor Author:

If Fiona accepts this, would this be better? (Would still need some workaround until then.)

if not fiona.dir_exists(dir):
    continue
elif fiona.isdir(dir):
    subdirs = fiona.listdir(dir)
    dirs.extend([f'{dir}/{subdir}' for subdir in subdirs])
else:
    files.append(dir)

Contributor Author:

So the feedback from fiona is that "torchgeo use a dedicated package like fsspec or the Azure Python module." But then we don't get the silver bullet for all of them, and it seems like additional libraries for each file system will have to be installed.

Collaborator:

We're definitely trying to minimize our dependencies when possible (0.6 actually removes more dependencies than it adds) so I would prefer not to do this. Worst case, users can always use these libraries themselves to get a list of files, then pass these to a GeoDataset (like in 0.5).

Contributor Author:

So what do you think: merge this, or go with my first proposal and make it easier to handle "non-local-files" by adding a method that the users can override?

if os.path.isdir(path):
    pathname = os.path.join(path, '**', self.filename_glob)
    files |= set(glob.iglob(pathname, recursive=True))
elif (os.path.isfile(path) or path_is_vsi(path)) and fnmatch.fnmatch(
    str(path), f'*{self.filename_glob}'
):
    files.add(path)
elif not hasattr(self, 'download'):
    self.handle_non_local_file()
    
def handle_non_local_file(self):
    warnings.warn(
        f"Could not find any relevant files for provided path '{path}'. "
        f'Path was ignored.',
        UserWarning,
    )

return files


def list_directory_recursive(root: Path, filename_glob: str) -> list[str]:
Collaborator:

Should we return a set instead of a list? The files method is the only place they are used and uses a set.

Contributor Author:

That's ok by me. The only reason I did not is that similar methods return a list (iglob, fnmatch, etc.).

Contributor Author:

We could change this method to remove duplicates (set) and sort it, then return a list as well. Or is it better to be explicit about that in the files-property?

"""Lists files in directory recursively matching the given glob expression.

Also supports GDAL Virtual File Systems (VSI).

Args:
root: directory to list. For VSI these will have prefix
e.g. /vsiaz/ or az:// for azure blob storage
filename_glob: filename pattern to filter filenames

Returns:
A list of all file paths matching filename_glob in the root directory or its
subdirectories.

.. versionadded:: 0.6
"""
files: list[str]
if path_is_vsi(root):
# Change type to match expected input to filter
all_files: list[str] = []
try:
all_files = listdir_vsi_recursive(root)
except FionaValueError:
# To match the behaviour of glob.iglob we silently return empty list
# for non-existing root.
pass
# Prefix glob with wildcard to ignore directories
files = fnmatch.filter(all_files, os.path.join('*', filename_glob))
else:
pathname = os.path.join(root, '**', filename_glob)
files = list(glob.iglob(pathname, recursive=True))
return files
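For local paths, the else-branch above reduces to a recursive glob; a quick stdlib illustration of the `'**'` pattern with `recursive=True` (directory names are made up for the example):

```python
import glob
import os
import tempfile

# Create a small tree: top.tif, a/mid.tif, a/b/deep.tif, and one non-match.
tmp = tempfile.mkdtemp()
os.makedirs(os.path.join(tmp, 'a', 'b'))
for rel in ('top.tif',
            os.path.join('a', 'mid.tif'),
            os.path.join('a', 'b', 'deep.tif'),
            'skip.txt'):
    open(os.path.join(tmp, rel), 'w').close()

# '**' with recursive=True matches zero or more directory levels,
# so files at the root and at any depth are found; skip.txt is not.
pathname = os.path.join(tmp, '**', '*.tif')
found = sorted(glob.iglob(pathname, recursive=True))
assert [os.path.basename(f) for f in found] == ['deep.tif', 'mid.tif', 'top.tif']
```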


def array_to_tensor(array: np.typing.NDArray[Any]) -> Tensor:
"""Converts a :class:`numpy.ndarray` to :class:`torch.Tensor`.
