Reproducing mmseg/ade20k results #73
Comments
Hello, for RADIO the crop size was set to 518 as this was the nearest multiple of the RADIOv1 patch size of 14.
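As a quick check of the arithmetic above, the sketch below rounds a target crop size to the nearest multiple of a patch size and tests whether a crop fits the patch grid. The helper names are hypothetical, and the patch size of 16 used for comparison is an assumption for illustration; the thread only states the RADIOv1 patch size of 14.

```python
def nearest_multiple(size: int, patch: int) -> int:
    """Return the multiple of `patch` closest to `size` (ties round down)."""
    lower = (size // patch) * patch
    upper = lower + patch
    return lower if size - lower <= upper - size else upper


def fits_patch_grid(size: int, patch: int) -> bool:
    """True when `size` splits into a whole number of patches."""
    return size % patch == 0


# Patch size 14 is stated in the thread; 16 is an assumed value for comparison.
print(nearest_multiple(512, 14))  # nearest multiple of 14 to 512
print(fits_patch_grid(512, 14))   # does a 512 crop fit a 14-pixel grid?
print(fits_patch_grid(518, 16))   # does a 518 crop fit a 16-pixel grid?
```

Under these assumptions, a crop size that suits one patch size will not generally suit another, which is consistent with the failure described in the original report.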
Yes, shortly after posting that I noticed the RADIOv1 model had a different patch size. Maybe the better approach here is to choose a fixed crop size for all models, e.g. (512, 512) then use the
Btw, have you seen that we just released RADIOv2.5 ViT-B and L models? If not, check out the tech report. They may very well work better for your use case.
@mranzinger Yes I did see that, very excited to try them out, thanks!
@mranzinger Is there a way to specify what version I want from HF? Specifically, what do I pass as the
Or do I have to use torchhub to specify these versions (and therefore modify Edit: Sounds like I might be able to specify the
@gheinrich is the expert on the HFHub code. But I believe you want https://huggingface.co/nvidia/RADIO-L, so yes, I think you're right with your I may be ignorant about all of the features of HFHub, but I will say that it's been easier to deal with model versions and releases using TorchHub.
Okay great, thanks. I'll give this a shot and let you know if it works.
The mapping below appears to be correct (the same one discussed above). It would be nice if this is in
HF_RESOURCE_MAP = {  # Version key in RESOURCE_MAP to HuggingFace repo id
    # RADIOv2.5
    "radio_v2.5-b": "NVIDIA/RADIO-B",
    "radio_v2.5-l": "NVIDIA/RADIO-L",
    # RADIO
    "radio_v2.1": "NVIDIA/RADIO",
    "radio_v2": None,
    "radio_v1": None,
    # E-RADIO
    "e-radio_v2": "NVIDIA/E-RADIO",
}
However, it looks like the HF vs. torchhub models for RADIO v2.1 and E-RADIO v2.0 have some slight differences. Not sure if this is expected or not, or if it's a big deal for anyone and worth fixing, but it's failing some tests that I have (as well as For RADIO v2.1, the HF model at For E-RADIO v2.0, the HF model at I'm not sure if these things were this way before the v2.5 push either; I wasn't testing things in this way before.
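For illustration, a mapping like the one above could be wrapped in a small lookup that turns a version key into a HuggingFace repo id and fails clearly for versions with no HF release. This is a minimal sketch: `resolve_hf_repo` is a hypothetical helper and is not part of the RADIO codebase.

```python
from typing import Dict, Optional

# Mapping copied from the comment above: version key -> HuggingFace repo id.
HF_RESOURCE_MAP: Dict[str, Optional[str]] = {
    "radio_v2.5-b": "NVIDIA/RADIO-B",
    "radio_v2.5-l": "NVIDIA/RADIO-L",
    "radio_v2.1": "NVIDIA/RADIO",
    "radio_v2": None,
    "radio_v1": None,
    "e-radio_v2": "NVIDIA/E-RADIO",
}


def resolve_hf_repo(version: str) -> str:
    """Hypothetical helper: return the HF repo id for a version key."""
    if version not in HF_RESOURCE_MAP:
        raise KeyError(f"Unknown version key: {version!r}")
    repo_id = HF_RESOURCE_MAP[version]
    if repo_id is None:
        # Versions mapped to None have no HF repo in this mapping.
        raise ValueError(f"No HuggingFace repo listed for {version!r}")
    return repo_id


print(resolve_hf_repo("radio_v2.5-l"))
```

The returned string would then be what gets passed to `AutoModel.from_pretrained(..., trust_remote_code=True)`, as in the snippet later in this thread.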
The How are you using the For E-RADIO, how different are your ADE20k results versus the ones we reported?
For the torchhub models I'm directly calling I'm using the My best results on ADE20K for E-RADIO were 22.7 mIoU. I don't have 8 GPUs so I'm using 2 GPUs w/ BS=8. I tried forcing
I forgot, I made two other changes to the E-RADIO config. Using
Hi Greg, very interesting, thanks for sharing! I'll have to look into the "slide" implementation in more detail; I just assumed it would crop/stitch the output to give exactly the same results. I'm not sure why it was causing issues for me. After I figured out that "whole" fixed the issue, I just moved on.
For all of the results I've discussed so far, I'm using the exact same mmseg setup as in this codebase. All I've changed is the config file I posted above to fix the issue, and switched to using 2x GPUs with a batch size of 8. I've attached the config I get from the following
import pprint
from transformers import AutoModel

hf_model = AutoModel.from_pretrained("NVIDIA/E-RADIO", trust_remote_code=True)
print(pprint.pformat(hf_model.config, sort_dicts=False))
It looks like the config has I'll try training some models with my torchhub-based API while passing in vitdet_window_size=16 and see if that fixes it. While I'm doing that, is it possible for you or Mike to try to fine-tune E-RADIO and see if it works with the current codebase? If you don't have time no worries, I'll try to get it working for a bit longer and then use RADIO v2.5-B instead.
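On the "slide" versus "whole" question raised above: the sketch below computes the crop windows a sliding-window inference pass would visit, modeled on my understanding of how mmseg's slide mode lays out its grid. The function name `slide_windows`, the image size, and the stride of 341 are assumptions for illustration and are not taken from this thread's configs.

```python
from typing import List, Tuple


def slide_windows(
    h_img: int, w_img: int, crop: Tuple[int, int], stride: Tuple[int, int]
) -> List[Tuple[int, int, int, int]]:
    """Return (y1, x1, y2, x2) windows for sliding-window inference."""
    h_crop, w_crop = crop
    h_stride, w_stride = stride
    # Number of window positions needed to cover each axis.
    h_grids = max(h_img - h_crop + h_stride - 1, 0) // h_stride + 1
    w_grids = max(w_img - w_crop + w_stride - 1, 0) // w_stride + 1
    windows = []
    for h_idx in range(h_grids):
        for w_idx in range(w_grids):
            y2 = min(h_idx * h_stride + h_crop, h_img)
            x2 = min(w_idx * w_stride + w_crop, w_img)
            # Shift the last window back so it stays full-size inside the image.
            y1 = max(y2 - h_crop, 0)
            x1 = max(x2 - w_crop, 0)
            windows.append((y1, x1, y2, x2))
    return windows


# Assumed example: a 512x683 image, 512x512 crop, stride 341.
for window in slide_windows(512, 683, (512, 512), (341, 341)):
    print(window)
```

Windows produced this way overlap, and predictions in the overlapping region are typically averaged. Each window also sees only part of the image, so a model whose output depends on global context need not give the same result in "slide" mode as in "whole" mode.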
Sorry, I see what's happening here with One more hint as to what might be going on. When calling Edit: The issue only appears to be during inference/validation. The training loss and So I'm not sure what the issue is with E-RADIO and mmseg. Sorry I can't narrow down the issue more. I'll stick with RADIO v2.5-B as my "efficient" model for now, and if you want me to submit a PR just so you can see the changes I've made so far to try and get it to work, I'm happy to do that. Thanks.
PRs are always welcome. The ViT-B model indeed seems to be really high quality, so I hope you can make some progress with it. I'm actually generally interested in what you're up to, if you'd be willing to shoot me and/or Greg an email with details.
Sounds great, I'll email you and Greg to sync up on how I'm using this and we can discuss the PR then. Thanks!
Hello,
Looks like the crop size is incorrect in the RADIO ADE20K config.
It's listed as (518, 518) here but it should be (512, 512) like it is in E-RADIO and the ADE20K base config. Using (518, 518) causes the patch projection to fail.
Thanks,
-Collin