Metaloci tools #5

Merged
merged 11 commits into from
Jul 21, 2023

leozuber
Collaborator

Summary 📝

METALoci tools. You can now install METALoci in an environment and call the different scripts from anywhere.

Details

Created a setup.py with the information needed for installation. Created an entry point in __main__.py that builds an argument parser with a subparser for every program, reads the arguments defined in each script, parses them, passes them to the script, and runs the script's main code.
Several improvements to the program were made, as described in the commits.

Bugfixes 🐛

  • Several bug fixes, described in the commits.

Leo Zuber and others added 11 commits July 11, 2023 16:38
The code has been reorganized so that METALoci can be invoked
from the console.

An entry point has been created in setup.py to run the code in
 __main__.py. The main() function will parse the arguments and call
 each program (prep, layout, lm and figure) in its own module.
 METALoci should now be installable from PyPI.

 All help messages have been reviewed and reformatted.
 Minor bug fixes.
 Old files from templates have been deleted.
 A version control has been established.
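
A minimal sketch of the dispatch pattern described in this commit: one top-level parser with a subparser per program, each wired to its own handler. The handler bodies and the `-w`/`--work-dir` option are placeholders invented for illustration; they are not METALoci's actual arguments.

```python
import argparse

# The four program names come from the commit message; everything else
# here (handlers, the -w option) is a hypothetical stand-in.
PROGRAMS = ("prep", "layout", "lm", "figure")


def make_handler(name):
    """Return a placeholder handler standing in for a program's main code."""
    def handler(args):
        return f"{name}: work_dir={args.work_dir}"
    return handler


def build_parser():
    parser = argparse.ArgumentParser(prog="metaloci")
    subparsers = parser.add_subparsers(dest="command", required=True)
    for name in PROGRAMS:
        sub = subparsers.add_parser(name, help=f"run the {name} step")
        # Hypothetical option, present only so the handlers have something to read.
        sub.add_argument("-w", "--work-dir", required=True)
        # Attach the handler so main() can dispatch without an if/elif chain.
        sub.set_defaults(handler=make_handler(name))
    return parser


def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:], as a console script would.
    args = build_parser().parse_args(argv)
    return args.handler(args)
```

With a layout like this, a `console_scripts` entry in setup.py would point at `main`, so that a call such as `metaloci prep -w out` reaches the `prep` handler. The exact entry-point string used by the project is not shown in this PR text.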
Created a shared variable to keep track of time and iterations,
in order to show progress, time spent, and estimated
time remaining.
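
As a rough illustration of how progress, time spent, and remaining time can be derived from an iteration counter and a start time, here is a small sketch. The function name and return format are assumptions, and sharing the counter between worker processes is left out.

```python
import time


def progress_report(done, total, start_time, now=None):
    """Summarise progress from a completed-iteration count and a start time.

    `now` can be passed explicitly to make the function deterministic;
    by default the current time is used.
    """
    now = time.time() if now is None else now
    elapsed = now - start_time
    if done > 0:
        # Assume the remaining iterations take as long, on average,
        # as the ones already completed.
        remaining = (elapsed / done) * (total - done)
    else:
        remaining = None  # no basis for an estimate yet
    return {
        "fraction": done / total if total else 0.0,
        "elapsed": elapsed,
        "remaining": remaining,
    }
```

The estimate is a plain linear extrapolation, so it will drift when regions differ a lot in size or cost.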
Added error handling for a case where figure.py would crash when trying
to calculate a regression when all values were identical (more likely
to happen at the beginning of the chromosome).
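
One way to guard against this kind of failure is to test for constant input before fitting. The sketch below uses a plain least-squares fit written out by hand; it is not the code in figure.py, and the function name is hypothetical.

```python
def safe_linear_fit(x, y):
    """Least-squares fit of y = slope * x + intercept.

    Returns None instead of raising when the fit is undefined, i.e. when
    there are fewer than two points or every x value is identical.
    """
    n = len(x)
    if n < 2 or n != len(y):
        return None
    if max(x) == min(x):
        # All x values identical: the slope's denominator would be zero.
        return None
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept
```

Callers can then skip drawing the regression line when the function returns None.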
Added a check to recalculate the LMI when the user adds another signal
after computing KK and LMI with fewer signals. This way, the
user does not need to create a new working directory and compute
everything again.
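
The check described here amounts to comparing the requested signals with those already computed. A minimal sketch, with a hypothetical function name and example signal names:

```python
def signals_needing_lmi(requested, computed):
    """Return the requested signals that have no LMI computed yet, in order."""
    already_done = set(computed)
    return [signal for signal in requested if signal not in already_done]
```

An empty result means nothing has to be recomputed; otherwise only the listed signals need a new LMI run.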
The log would append redundant information if executed more than once.
Also, information about significant quadrants has been added to the log.
The log that was stored in MetalociObject.lmi_info can now be
exported to tsv with the -l --log option. This will work even if the
LMI for the regions has already been computed but the -l option
was not enabled.
layout.py now has a log of bad regions. It analyzes parameters of
hic, determines which regions are bad, and writes a log.
ml.py now has the -l option, to write bin-based information of LMI.
figure.py now has the -l option to write a bed file with the metalocis
found.
Fixed a -h option issue in __main__.py.
- prep.py now checks chromosome nomenclature for both cooler and
hic formats. It now also checks that the resolution given for binning
the signal is present in the Hi-C file.
- lmi.py now checks whether the signals you are computing are among the
signals previously binned with prep.py. If not, it warns the user and
reports which signals are missing.
- Added a fix for missing data in the signal. Until now, missing values
were converted to 0 and the missing signals were clustered at the end of
the region. Now, missing values are set to the median of the signal of
that region.
- The logic of the script has changed, so if the user computes some KK
and later on wants to plot them, the script does not need to compute
the KK again.
- The -f and -p options are handled differently so the progress bar
while using multiprocessing keeps working.
- Signal plots with big numbers were not properly aligned with the Hi-C
matrix. A fix was added to align signal values up to a maximum of 9999.
- Fixed issue where the option -p would not plot the KK layout in layout.py.
- get_kk_plot() now has an argument to draw a circle whose radius is the neighbourhood.
- signal_plot() is now working. Argument -m is a flag to select the highlighting
mode of the signal plots. If True, only the neighbouring bins of the point of
interest will be highlighted (independently of the quadrant and significance
of those bins, but only if the point of interest is significant). If False,
all significant regions that correspond to the quadrant selected with -q will
be highlighted (default: False).
- Fixed MetalociObject.kk_coords() that was saving moran_index instead of
bin_index. That was requiring some stupid workaround in the plot code,
but not anymore.
- Fixed signal not parsing correctly in load_region_signal() when provided
a signal string as an argument instead of a file with signal names.
- Changed file format for bed output in figure.py.
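
The missing-data fix mentioned in the list above (filling gaps with the region's median instead of 0) can be sketched as follows. This assumes missing values appear as `None` or NaN in a plain list; the real code may represent the signal differently, and the function name is hypothetical.

```python
import math
from statistics import median


def is_missing(value):
    """Treat None and NaN as missing."""
    return value is None or math.isnan(value)


def fill_missing_with_median(signal):
    """Replace missing entries with the median of the observed values."""
    observed = [v for v in signal if not is_missing(v)]
    if not observed:
        # Nothing to compute a median from; return the input unchanged.
        return list(signal)
    fill = median(observed)
    return [fill if is_missing(v) else v for v in signal]
```

Filling with the median keeps the imputed bins in the middle of the signal distribution, rather than grouping them with the lowest values as a fill of 0 would.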
@leozuber leozuber merged commit b871224 into main Jul 21, 2023
4 of 5 checks passed