Error in specific samples #9
Comments
Trying to install on another machine, I got other errors.
Hi Lucas-Maciel, did you find a solution for the second error that you got in March (KEGG has forbidden request after 10 attempts)? I am trying to run AMON and I get the same error message.
@Tim-Sto, to tell the truth, I still don't really understand why, but I was able to make it work. If you run the same input about 10 times, one of the runs may work. Another thing that sometimes works, and I also don't know why, is this: if you have a file with 1000 lines and it is not working, you can use
@Lucas-Maciel, thank you very much for your answer. I tried running the same input 10 times and I also tried the head command, but neither worked. However, I think I figured out what the problem is. I used a KO list based on the human genome (more than 10,000 KOs) as input for the host. As mentioned in the code, the KEGG API has download limits for users without a subscription, and a list of 10,000 KOs probably reaches them. @kthurimella, have you found a workaround for this problem?
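For anyone who wants to try smaller inputs in a repeatable way, here is a minimal sketch that splits a KO list into fixed-size parts. It assumes the input file holds one KO identifier per line; the function names, the output file naming, and the chunk size of 500 are all hypothetical choices, not anything AMON provides. It only writes the smaller files. Whether results from separate runs can be combined meaningfully is a separate question that this sketch does not address.

```python
from pathlib import Path
from typing import Iterator, List


def chunk_ids(ids: List[str], chunk_size: int) -> Iterator[List[str]]:
    """Yield consecutive slices of ids, each at most chunk_size long."""
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    for start in range(0, len(ids), chunk_size):
        yield ids[start:start + chunk_size]


def split_ko_file(ko_path: Path, out_dir: Path, chunk_size: int = 500) -> List[Path]:
    """Split a one-ID-per-line KO file (assumed format) into smaller files."""
    # Blank lines are skipped so they do not become empty identifiers.
    ids = [line.strip() for line in ko_path.read_text().splitlines() if line.strip()]
    out_dir.mkdir(parents=True, exist_ok=True)
    written = []
    for index, chunk in enumerate(chunk_ids(ids, chunk_size)):
        # Hypothetical naming scheme: <original stem>_part000.txt, _part001.txt, ...
        chunk_path = out_dir / f"{ko_path.stem}_part{index:03d}.txt"
        chunk_path.write_text("\n".join(chunk) + "\n")
        written.append(chunk_path)
    return written
```

Each resulting file can then be passed to a separate run, ideally with a pause between runs.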
Hey @Lucas-Maciel and @Tim-Sto, we have looked for ways around this, but we have never been able to find out what the limits of the KEGG API are. Some documentation mentions a rate limit (e.g. no more than 1000 requests per minute), but it never says what the actual rate is. My recommendation is to run subsets, like @Lucas-Maciel said. If we knew what the KEGG API limits were, we could set up AMON to poll their servers only within those limits, but since they don't say, all we can do is guess. We haven't found any better parameters than the ones set as default in AMON to get around it. You can also try using the

Sorry about the lack of an answer, but it seems surprisingly hard to find info in this area.

Mike
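To illustrate what polling within a limit could look like if the limit were known, here is a rough sketch of a sequential downloader that spaces out requests. This is not how AMON or KEGG_parser work internally. The `fetch` callable, the function name, and the 0.5 second interval are placeholders, and the interval is a guess, since the real KEGG limit is not documented anywhere we could find.

```python
import time
from typing import Callable, Iterable, List


def fetch_throttled(
    ids: Iterable[str],
    fetch: Callable[[str], str],
    min_interval: float = 0.5,  # guessed spacing in seconds, not a known KEGG limit
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Call fetch once per id, starting requests at least min_interval apart."""
    results = []
    last_start = None
    for kegg_id in ids:
        if last_start is not None:
            # Wait only for whatever part of the interval has not already passed.
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.append(fetch(kegg_id))
    return results
```

The `clock` and `sleep` parameters exist only so the spacing logic can be checked without real waiting or network access.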
Same problem. Is there a way to use --ko_file_loc KO_FILE_LOC in order to avoid requesting KEGG? Thanks in advance.
Hi all, I believe I've fixed this issue with the latest release of KEGG_Parser (which is now bumped to 0.0.7 to fix pip compatibility issues). If the asynchronous downloads are forbidden (due to the request rate being too high), it will download each URL from the KEGG API sequentially. This is quite a bit slower, but it does get around the issue.
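For readers curious about the general pattern, here is a minimal sketch of "try everything concurrently, then fall back to one at a time". It is not the actual KEGG_Parser code. `ForbiddenError`, `download_with_fallback`, and the two fetch callables are hypothetical stand-ins, with `ForbiddenError` playing the role of an HTTP 403 response.

```python
import asyncio
from typing import Awaitable, Callable, List, Sequence


class ForbiddenError(Exception):
    """Hypothetical stand-in for an HTTP 403 (forbidden) response."""


async def _gather_all(
    ids: Sequence[str], fetch_async: Callable[[str], Awaitable[str]]
) -> List[str]:
    # Start all downloads at once; gather re-raises the first failure it sees.
    return list(await asyncio.gather(*(fetch_async(i) for i in ids)))


def download_with_fallback(
    ids: Sequence[str],
    fetch_async: Callable[[str], Awaitable[str]],
    fetch_sync: Callable[[str], str],
) -> List[str]:
    """Try concurrent downloads first, then fall back to sequential ones."""
    try:
        return asyncio.run(_gather_all(ids, fetch_async))
    except ForbiddenError:
        # Slower path: one request at a time, in the original order.
        return [fetch_sync(i) for i in ids]
```

One design caveat: if the server keeps refusing requests for a while after the first refusal, the sequential pass can fail as well, so skipping the concurrent attempt entirely may be the safer choice.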
Hello @sterrettJD thank you for updating KEGG_Parser! I am using version 0.0.7 but unfortunately I am getting the same error as @Lucas-Maciel. I am thinking of downloading the KEGG FTP files. Where can I find those? @vindarbot were you able to locate them? Thanks! :)
Hey @raeshrode, that's weird - I'll look into it! In the meantime, can you post the error from your computer + all the versions for your packages (output of

Regarding the KEGG FTP, those files can be accessed here, but unfortunately you need to be a KEGG subscriber to download them :/ which is why we have to download things from KEGG individually.
Thank you for the quick response @sterrettJD ! Bummer on the KEGG subscription, but thank you for the link to that too. My AMON environment packages and versions:
My error:
Thank you!
Hey @raeshrode, it looks like KEGG_parser is requesting a weird URL... In that last line,
(htto -> http; kega -> kegg; ip -> jp; aet -> get)

I haven't seen this before, and I'm not sure how this string is getting corrupted. Would you be able to email me the command/input data you're using for AMON ([email protected])? I can see if I get the same error on my end.

I could be wrong, but I think that this may be a different issue from what Lucas was dealing with. In this case, AMON is attempting to download the KEGG data in parallel, then when that fails, it's attempting to download the data sequentially. Lucas's error was due to hitting limits on the number of requests allowed per minute by KEGG, but this error seems to be related to some corruption of the requested URL string...
I tested with @raeshrode's data and was getting the 403 error but no weird URL. I think KEGG now "forbids" requests for longer once a requester is "banned"... That means that the strategy of

Anyway, I've updated KEGG_parser to have an option to not try the parallel downloading that seems to be causing the issue, and I've changed the default behavior of AMON to skip the parallel download attempt. Parallel downloading in AMON can be re-enabled using

Rachel, can you try updating AMON -> v1.0.1 and kegg_parser -> v0.0.8, and see if that fixes things? It does on my end (with your data). There's an error downstream when calculating enrichment, but that may be because you're only using one species for the microbial side. I'm hoping/assuming that'll go away once you add more taxa into the mix.
Hi,
I'm using AMON on my metagenomic data. I have 79 MAGs, and for 70 of them it ran without problems. But for 9 of them I get the following error. I believe it may be due to an unrecognized KO annotation, but I don't know how to figure out which ones are responsible.
amon.py -i ko_list.txt -o ../teste
Traceback (most recent call last):
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/bin/amon.py", line 74, in <module>
    main(kos_loc, output_dir, other_kos_loc, detected_compounds, name1, name2, keep_separated, samples_are_columns,
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/AMON/predict_metabolites.py", line 283, in main
    ko_dict = get_kegg_record_dict(set(all_kos), parse_ko, ko_file_loc)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 55, in get_kegg_record_dict
    records = get_from_kegg_api(loop, list_of_ids, parser)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 49, in get_from_kegg_api
    return [parser(raw_record) for raw_record in loop.run_until_complete(kegg_download_manager(loop, list_of_ids))]
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/asyncio/base_events.py", line 616, in run_until_complete
    return future.result()
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 43, in kegg_download_manager
    results = await asyncio.gather(*tasks)
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/KEGG_parser/downloader.py", line 30, in download_coroutine
    return await response.text()
  File "/home/ABTLUS/lucas.maciel/anaconda3/envs/AMON/lib/python3.8/site-packages/aiohttp/client_reqrep.py", line 1014, in text
    return self._body.decode(encoding, errors=errors)  # type: ignore
UnicodeDecodeError: 'utf-8' codec can't decode byte 0xa0 in position 80011: invalid start byte
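The traceback shows that a downloaded response contains a byte (0xa0) that is not valid at that position in UTF-8. In Latin-1 that byte is a non-breaking space, so one plausible explanation is that a record contains Latin-1 text. Here is a small sketch, assuming you can get at the raw bytes of each response, that reports where decoding fails and decodes with a fallback. Both function names are hypothetical and neither is part of AMON or KEGG_parser.

```python
from typing import Optional, Tuple


def describe_decode_error(raw: bytes, context: int = 20) -> Optional[Tuple[int, bytes]]:
    """Return (offset, surrounding bytes) for the first UTF-8 failure, or None."""
    try:
        raw.decode("utf-8")
    except UnicodeDecodeError as err:
        # err.start is the index of the first byte that could not be decoded.
        lo = max(0, err.start - context)
        return err.start, raw[lo:err.start + context]
    return None


def decode_record(raw: bytes) -> str:
    """Decode as UTF-8, falling back to Latin-1, which accepts every byte value."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        # Assumption: the stray bytes are Latin-1; this may mis-render other encodings.
        return raw.decode("latin-1")
```

Running `describe_decode_error` on each response separately would show which record contains the offending byte, and the surrounding bytes usually make it clear which entry the record belongs to.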