generated from rochacbruno/python-project-template
Metaloci tools #5
Merged
The code has been reorganized so that METALoci can be invoked from the console. An entry point has been created in setup.py to run the code in __main__.py. The main() function parses the arguments and calls each program (prep, layout, lm and figure) in its own module. METALoci should now be installable from PyPI. All help messages have been reviewed and reformatted. Minor bug fixes. Old files from the template have been deleted. Version control has been established.
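As a rough illustration of the dispatch described above, here is a minimal sketch of a main() that builds one subparser per program and calls the matching handler. It is not the actual METALoci code: the handler functions, the `-w/--work-dir` option and the `metaloci` program name are assumptions made up for the example.

```python
import argparse


# Hypothetical handlers: in the real package each program lives in its own module.
def run_prep(args):
    return f"prep called with work_dir={args.work_dir}"


def run_layout(args):
    return f"layout called with work_dir={args.work_dir}"


def run_lm(args):
    return f"lm called with work_dir={args.work_dir}"


def run_figure(args):
    return f"figure called with work_dir={args.work_dir}"


def build_parser():
    # "metaloci" as the console command name is an assumption.
    parser = argparse.ArgumentParser(prog="metaloci")
    subparsers = parser.add_subparsers(dest="command", required=True)
    handlers = {
        "prep": run_prep,
        "layout": run_layout,
        "lm": run_lm,
        "figure": run_figure,
    }
    for name, handler in handlers.items():
        sub = subparsers.add_parser(name, help=f"run the {name} step")
        # Illustrative option only; the real programs define their own arguments.
        sub.add_argument("-w", "--work-dir", default=".")
        # Store the handler so main() can dispatch without an if/elif chain.
        sub.set_defaults(func=handler)
    return parser


def main(argv=None):
    args = build_parser().parse_args(argv)
    return args.func(args)
```

With setuptools, a `console_scripts` entry point that points at a function like `main` is what makes the command available from any directory once the package is installed.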
Created a shared variable to keep track of time and iterations, in order to show progress, time spent and estimated time remaining. Added error handling for a case where figure.py would crash when trying to calculate a regression when all values were identical (more likely to happen at the beginning of the chromosome). Added a check to recalculate the LMI when the user adds another signal after computing KK and LMI with fewer signals. This way, the user does not need to create a new working directory and compute everything again.
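The regression crash mentioned above is the usual degenerate case: when every x value is identical the slope is undefined. A minimal sketch of such a guard follows, in pure Python. The function name and the choice to return None are assumptions for illustration, not the way figure.py actually handles it.

```python
def safe_linear_fit(x, y):
    """Least-squares fit y = slope * x + intercept.

    Returns None when the fit is undefined (fewer than two points,
    mismatched lengths, or all x values identical).
    """
    if len(x) != len(y) or len(x) < 2:
        return None
    # Compare min and max directly: checking the variance against zero
    # can fail for identical floats because of rounding in the mean.
    if min(x) == max(x):
        return None
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

A caller can then skip the regression line for that region instead of raising.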
Fixed the log appending redundant information when executed more than once. Also, information about significant quadrants has been added to the log.
The log stored in MetalociObject.lmi_info can now be exported to TSV with the -l/--log option. This works even if the LMI for the regions has already been computed without the -l option enabled.
layout.py now has a log of bad regions. It analyzes parameters of the Hi-C data, determines which regions are bad, and writes a log. ml.py now has the -l option, to write bin-based LMI information. figure.py now has the -l option to write a bed file with the metalocis found. Fixed a -h option issue in __main__.py.
- prep.py now checks chromosome nomenclature for both cooler and hic formats. It also checks that the resolution given for binning the signal is present in the Hi-C file.
- lmi.py now checks whether the signals you are computing are among the signals previously binned with prep.py. If not, it warns the user and reports which signals are missing.
- Added a fix for missing data in the signal. Until now, missing values were converted to 0 and the missing signals were clustered at the end of the region. Now, missing values are set to the median of the signal in that region.
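The median fill described in the last item can be sketched as follows. This is an illustration only, assuming missing values are represented as NaN or None in a plain list; the function name is hypothetical and the real code may operate on other data structures.

```python
import math
from statistics import median


def fill_missing_with_median(signal):
    """Replace missing values (None or NaN) with the median of the observed values."""

    def is_missing(value):
        return value is None or math.isnan(value)

    observed = [v for v in signal if not is_missing(v)]
    if not observed:
        # Nothing to compute a median from; return the data unchanged.
        return list(signal)
    fill = median(observed)
    return [fill if is_missing(v) else v for v in signal]
```

Unlike filling with 0, the median keeps missing bins close to the typical signal level of the region, so they are less likely to stand out as an artificial group of low values.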
- The logic of the script has changed, so if the user computes some KK and later on wants to plot them, the script does not need to compute the KK again.
- The -f and -p options are handled differently so the progress bar keeps working while using multiprocessing.
- Signal plots with large numbers were not properly aligned with the Hi-C matrix. A fix was added to align signal values with a maximum of up to 9999.
- Fixed an issue where the -p option would not plot the KK layout in layout.py.
- get_kk_plot() now has an argument to draw a circle with the neighbourhood radius. signal_plot() is now working. The -m argument is a flag that selects how signal plots are highlighted. If True, only the bins neighbouring the point of interest are highlighted (independently of the quadrant and significance of those bins, but only if the point of interest is significant). If False, all significant regions that correspond to the quadrant selected with -q are highlighted (default: False).
- Fixed MetalociObject.kk_coords(), which was saving moran_index instead of bin_index. This required an awkward workaround in the plot code, which is no longer needed.
- Fixed the signal not being parsed correctly in load_region_signal() when a signal string was provided as an argument instead of a file with signal names.
- Changed the file format for bed output in figure.py.
Summary 📝
METALoci tools. You can now install METALoci in an environment and call the different scripts from anywhere.
Details
Created a setup.py with the information needed for installation. Created an entry point in __main__.py, which creates an argument parser with a subparser for every program, parses the arguments, passes them to the corresponding script and runs that script's main code.
Several improvements to the program were made; they are described in the commits.
Bugfixes 🐛