UnboundLocalError: local variable 'sequence' referenced before assignment #6
Hi, thanks for using GNNome! I suspect that the problem lies in different versions of Raven creating slightly different GFA files, which makes parsing tricky.
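To illustrate why small differences in GFA output can trip up a parser, here is a minimal sketch of reading an S-line defensively, so that the sequence is either always bound or a clear error is raised. This is not GNNome's actual code: the helper name `parse_segment_line` is hypothetical, and the sketch assumes the standard tab-separated GFA1 layout (`S`, segment name, sequence, then optional tags).

```python
def parse_segment_line(line):
    """Parse one GFA1 S-line into (name, sequence, tags).

    Hypothetical helper for illustration only; not part of GNNome.
    """
    fields = line.rstrip("\n").split("\t")
    # Fail loudly instead of leaving `sequence` unassigned further down.
    if len(fields) < 3 or fields[0] != "S":
        raise ValueError(f"Not a valid S-line: {line[:60]!r}")
    name, sequence = fields[1], fields[2]
    # Optional tags have the form TAG:TYPE:VALUE, e.g. LN:i:50.
    tags = {}
    for tag in fields[3:]:
        parts = tag.split(":", 2)
        if len(parts) == 3:
            tags[parts[0]] = parts[2]
    return name, sequence, tags
```

With this shape, an S-line that carries extra or missing optional tags still yields a sequence, and a line with too few fields raises a `ValueError` naming the offending line rather than an `UnboundLocalError` later on.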
Greetings Ivrcek! I was using Raven 1.8.3 in conjunction with the DragonFlye pipeline. Happy to provide any additional information you might need to resolve the issue! Thanks again!
I fixed it in the latest commit, it should be fine now. Also, if something breaks again during the parsing of the L-lines in the GFA, you can send me the last few lines of your GFA and I will fix that as well. Best,
Hey Lovro! Thanks for the update! The git pull went fine, but now it has a different error.
$ python create_inference_graphs.py --reads All+RatQ3.fastq --gfa raven-unpolished.gfa --asm raven --out Assembly
Attaching the last 50 lines of .gfa
Sorry for all the bother,
Hey Jeff, No worries at all :) I suspect I know where the error comes from, so I pushed another commit which fixes that. If it persists, now at least it should print out a message for which exactly reads this error happens. It should say:
Just substitute
Best,
Thanks Lovro! That did it! I was able to run both create_inference_graphs.py and the subsequent inference.py with no errors! The resulting assembly is about 8x smaller than it should be, so I need to ponder that a bit. But, CHEERS! I really appreciate your diligence getting things running! Jeff
No problem, glad you got it to work! Hmm ok, I will try to take a look at why this happens and see if it's something about the parameters we use during the inference. Can I ask which genome you are trying to reconstruct? Lovro
Sure! It's C. horridus (Timber rattlesnake). We have a pretty solid assembly that's around 1.5Gb in size. In contrast, GNNome output was 214Mb...
I agree that's a lot shorter than it should be. I will take a look at what's going on.
Hey, I'm trying to debug this and would appreciate your help. Could you pull the code again and try to assemble the genome? Also, if you save the output of running GNNome to, e.g., grep "Zero division error" output.log Thanks!
Greetings Lovro, Did a git pull, then repeated the process of both create_inference_graphs and inference. Oddly enough, while the first step shows Zero division errors, the inference step shows no such errors.
$ python create_inference_graphs.py --reads All+RatQ3.fastq --gfa raven-unpolished.gfa --asm raven --out Assembly
Resulting assembly after inference is now 221M (should be ~1.5G).
It seems like this is because Raven produces a GFA in which some edges have length 0, hence the zero division error when computing edge similarities. However, this seems to happen only for self-loops. I will leave this issue open for now and try to figure out why Raven produces such edges and whether this is the cause of the short assembly. Thank you for your help, Jeff.
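For anyone who wants to check their own GFA for such edges, here is a rough sketch that lists self-loop L-lines, i.e. links whose source and target segment are the same. It assumes the standard tab-separated GFA1 link layout (`L`, from, from orientation, to, to orientation, overlap); Raven's output may carry additional fields, and the function name `find_self_loops` is hypothetical, not something from GNNome.

```python
def find_self_loops(gfa_lines):
    """Return (line_number, segment, from_orient, to_orient, overlap)
    for every L-line whose source and target segment are identical.

    Hypothetical helper for illustration only; not part of GNNome.
    """
    loops = []
    for lineno, line in enumerate(gfa_lines, start=1):
        fields = line.rstrip("\n").split("\t")
        # Skip anything that is not a complete link line.
        if len(fields) < 6 or fields[0] != "L":
            continue
        src, src_orient, dst, dst_orient, overlap = fields[1:6]
        if src == dst:
            loops.append((lineno, src, src_orient, dst_orient, overlap))
    return loops
```

Since the function only iterates over lines, an open file handle can be passed directly, for example `with open("raven-unpolished.gfa") as fh: print(find_self_loops(fh))`.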
Sure thing, Lovro! I have 3-4 new projects being sequenced now as well. I'll give them a try once I'm a ways along and see if they behave similarly. Let me know if I can be of further assistance in the future! --jeff
Have run the test packaged with the software successfully. But when I try to run my own data, I get the following error:
python create_inference_graphs.py --reads All+RatQ3.fastq --gfa raven-unpolished.gfa --asm raven --out Assembly
Starting to parse assembler output
Starting to loop over GFA
Traceback (most recent call last):
  File "create_inference_graphs.py", line 50, in <module>
    create_inference_graph(gfa, reads, out, asm)
  File "create_inference_graphs.py", line 13, in create_inference_graph
    graph, pred, succ, reads, edges, read_to_node, _ = graph_parser.only_from_gfa(gfa_path, training=False, reads_path=reads_path, get_similarities=True)
  File "/home/jpummil/Applications/GNNome/graph_parser.py", line 165, in only_from_gfa
    if sequence == '*':
UnboundLocalError: local variable 'sequence' referenced before assignment
The referenced .fastq file assembles fine using Raven. Line 165 as referenced begins as follows:
S f890dea9-4546-4e77-aaee-6d7924f1a07d CCGAGTGCCGCCTCTGGCACACGTGCCGTAGGTTCGCCACCACTGCTATA
Something obvious I'm doing incorrectly, or perhaps just an early code issue?