CDL: cannot redownload additional years #2007
Comments
I'm unable to reproduce this issue. I tried both 2017 and 2022 and both downloaded fine on my system. What version of torchvision are you using? Can you try upgrading to the newest version?
Is it possible that you already have some CDL data somewhere in that folder recursively?
Your screenshot doesn't contain the full stack trace, and I also can't copy-n-paste error messages from screenshots...
Never seen this error before, interesting... We still need to figure out how to reproduce this. Are you able to reproduce this in Google Colab or some other shared computing resource I can access? That will make it easier to debug.
If you create an account on https://lightning.ai/ I can grant you access!
One thing I am noticing is that the bounds shown in the output of your
Is there anything else in the
I cannot reproduce the issue either. The dataset can be downloaded immediately. I did find that the other years can't be downloaded after downloading some years. For example:
This can download the corresponding year without issues. But if I restart the terminal and run
It won't download anything. It seems that the download function only works the first time, when the data directory doesn't contain any downloaded CDL files. This issue is not tied to particular years; I tried different combinations of years.
One bug here is that if I do: and the
the second download of the 2023 layer does not happen. Edit: It seems @yichiac and I discovered this at the same time 🙂 in
The problem is actually higher up:

    # Check if the extracted files already exist
    if self.files:
        return

If any CDL files are found, the method exits, even if the specific years you requested aren't there. This broke in #1442. The fix would be to check for the specific years requested. However, this is difficult if you can't know whether
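A minimal sketch of the year-specific check suggested above, not torchgeo's actual implementation. The helper name `missing_years` is hypothetical, and the sketch assumes extracted rasters are named like `2022_30m_cdls.tif`, mirroring the zip name in the download URL.

```python
import re
from collections.abc import Iterable
from pathlib import Path

# Assumed naming scheme for extracted rasters, e.g. "2022_30m_cdls.tif".
_YEAR_PATTERN = re.compile(r"^(\d{4})_30m_cdls\.tif$")


def missing_years(files: Iterable[str], years: Iterable[int]) -> list[int]:
    """Return the requested years that have no matching file in `files`."""
    found = set()
    for path in files:
        match = _YEAR_PATTERN.match(Path(path).name)
        if match:
            found.add(int(match.group(1)))
    return sorted(set(years) - found)
```

With something like this, the early return could fire only when `missing_years(...)` comes back empty, rather than whenever any CDL file is present.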
Yes, just discovered this as well
I found that if I run the command in a terminal (rather than Jupyter), I get a warning. I pointed it to a fresh directory (data2):
It appears to be ignoring the path and hanging. If I interrupt and rerun the command, I do not get the warning.
Hey, I can reproduce the same issue in a Studio on Lightning.ai. The hanging seems to be coming from torchvision. Here is a minimal repro:

    import urllib
    import urllib.error
    import urllib.request

    USER_AGENT = "pytorch/vision"


    def _get_redirect_url(url: str, max_hops: int = 3) -> str:
        initial_url = url
        headers = {"Method": "HEAD", "User-Agent": USER_AGENT}

        for _ in range(max_hops + 1):
            with urllib.request.urlopen(urllib.request.Request(url, headers=headers)) as response:
                if response.url == url or response.url is None:
                    return url

                url = response.url
        else:
            raise RecursionError(
                f"Request to {initial_url} exceeded {max_hops} redirects. The last redirect points to {url}."
            )


    url = "https://www.nass.usda.gov/Research_and_Science/Cropland/Release/datasets/2022_30m_cdls.zip"
    url = _get_redirect_url(url)
    assert url == url
    print(url)
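To narrow down whether the hang depends on the User-Agent, the repro could be run with a timeout so that a stalled request raises instead of blocking. This is only a sketch: `build_probe` and `probe` are hypothetical helper names, and the 10-second timeout is an arbitrary choice.

```python
import urllib.request
from typing import Optional


def build_probe(url: str, user_agent: Optional[str] = None) -> urllib.request.Request:
    """Build a request like the repro's, with an optional User-Agent header."""
    # Mirrors the repro: "Method" is sent as a plain header, so the HTTP verb stays GET.
    headers = {"Method": "HEAD"}
    if user_agent is not None:
        headers["User-Agent"] = user_agent
    return urllib.request.Request(url, headers=headers)


def probe(url: str, user_agent: Optional[str] = None, timeout: float = 10.0) -> str:
    """Open the URL and return the final URL; a stalled connection raises after `timeout` seconds."""
    request = build_probe(url, user_agent)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.url
```

Calling `probe(url, "pytorch/vision")` and then `probe(url)` would show whether only the `pytorch/vision` User-Agent stalls. When no User-Agent is given, urllib sends its default `Python-urllib/x.y` value.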
A temporary workaround on lightning.ai, thanks to @tchaton:

    from torchgeo.datasets import CDL

    # Apply patch to pop User-Agent until we figure out why it hangs
    from torchvision.datasets.utils import urllib

    original_request = urllib.request.Request


    def Request(*args, headers, **kwargs):
        if "User-Agent" in headers:
            headers.pop("User-Agent")
        return original_request(*args, headers=headers, **kwargs)


    urllib.request.Request = Request

    dataset = CDL(years=[2022], download=True, paths="./data")
    print(dataset)

However when I go to plot a sample I get the error
I suspect this error is due to setting a CRS that differs from the dataset's native CRS; when I don't do this, there is no error.
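The User-Agent workaround above replaces `urllib.request.Request` for the rest of the process. A scoped variant could restore the original afterwards. The sketch below is an illustration under assumptions: `without_user_agent` is a hypothetical name, and it only filters headers passed via the `headers` keyword, which is how the repro calls `Request`.

```python
import contextlib
import urllib.request


@contextlib.contextmanager
def without_user_agent():
    """Temporarily make urllib.request.Request drop any User-Agent header."""
    original_request = urllib.request.Request

    def request_without_user_agent(*args, headers=None, **kwargs):
        # Copy the headers so the caller's dict is not mutated.
        cleaned = {k: v for k, v in (headers or {}).items() if k.lower() != "user-agent"}
        return original_request(*args, headers=cleaned, **kwargs)

    urllib.request.Request = request_without_user_agent
    try:
        yield
    finally:
        # Always restore the real class, even if the download raises.
        urllib.request.Request = original_request
```

The download would then be wrapped as `with without_user_agent(): dataset = CDL(years=[2022], download=True, paths="./data")`, leaving other code in the same process untouched.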
Description
The data should be downloading, but over an hour in, nothing has happened.
Steps to reproduce
Version
0.6.0.dev0