Total hit counts added #15

Open. Wants to merge 4 commits into base: rust-bit-packed

Conversation

sarahsummerfield
Contributor

No description provided.

To align with the Rust files, I had to re-add the missing RNAs.fasta file to the fixtures directory and change the filename in the Python count script from RNAseqs.fasta to RNAs.fasta.
This adds a very naive implementation of total hit counts. For each search result reported, it now also prints the running total of hits for each DCE. It does this by adding a "hits" field to the CompressedSeq struct, making the needle mutable, and incrementing its hits count every time there is a match while searching.
More changes are still required to match the output of the Python version, i.e. writing a results file that contains the name of each DCE that had >= 1 hit and the total number of hits it had.
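For anyone reading along, the counting approach described above can be sketched roughly like this. It is a simplified stand-in, not the code in this PR: the Vec<u8> sequence field, the count_hits function, and the plain byte comparison are all assumptions made for illustration, since the real CompressedSeq is bit-packed.

```rust
/// Hypothetical, simplified stand-in for the PR's CompressedSeq.
/// The real struct is bit-packed; plain bytes are assumed here.
#[derive(Debug)]
pub struct CompressedSeq {
    pub identifier: String,
    pub sequence: Vec<u8>,
    pub hits: u64,
}

/// Slide a window of the needle's length over the haystack and
/// increment `needle.hits` on every exact match.
/// Returns the start position of each match.
/// (`count_hits` is an illustrative name, not a function from the PR.)
pub fn count_hits(needle: &mut CompressedSeq, haystack: &[u8]) -> Vec<usize> {
    let mut positions = Vec::new();
    // `windows` panics on a zero length, so bail out early for empty needles.
    if needle.sequence.is_empty() {
        return positions;
    }
    for (start, window) in haystack.windows(needle.sequence.len()).enumerate() {
        if window == needle.sequence.as_slice() {
            needle.hits += 1;
            positions.push(start);
        }
    }
    positions
}
```

With a needle of ACG and a haystack of ACGTACGACG, count_hits returns positions 0, 4 and 7 and leaves needle.hits at 3. The needle has to be passed as &mut because the counter lives on the struct itself.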
if needle.sequence == self.haystack_window {
needle.hits = needle.hits+1;
Contributor

Suggested change
- needle.hits = needle.hits+1;
+ needle.hits += 1;

@@ -7,6 +7,7 @@ pub struct Seq {
pub identifier: String,
pub length: usize,
pub sequence: Vec<char>,
pub hits: u64,
Contributor

Does Seq actually need this? The DNA sequences are all stored as CompressedSeqs.

This is a "functioning" version with the input arguments and print-to-file features added, but it is incredibly slow, and the total hit count doesn't work properly with the way main iterates through the sequences.
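One possible way to make the totals independent of how main iterates would be to accumulate them in a map keyed by DCE identifier and write the file once at the end. This is only a sketch under assumptions: record_hits and write_totals are hypothetical names, and the tab-separated output format is a guess, not necessarily what the Python version produces.

```rust
use std::collections::BTreeMap;
use std::io::{self, Write};

/// Add the hits from one search pass to the running totals,
/// keyed by DCE identifier. Zero-hit passes are not recorded,
/// so only DCEs with >= 1 hit ever appear in the map.
/// (Hypothetical helper, not part of the PR.)
pub fn record_hits(totals: &mut BTreeMap<String, u64>, identifier: &str, hits: u64) {
    if hits > 0 {
        *totals.entry(identifier.to_string()).or_insert(0) += hits;
    }
}

/// Write one "identifier<TAB>total" line per DCE, sorted by identifier.
/// The tab-separated layout is an assumption for illustration.
pub fn write_totals<W: Write>(mut out: W, totals: &BTreeMap<String, u64>) -> io::Result<()> {
    for (identifier, total) in totals {
        writeln!(out, "{}\t{}", identifier, total)?;
    }
    Ok(())
}
```

Because write_totals accepts any writer, it can be pointed at a std::fs::File for the results file or at a Vec<u8> in tests. A BTreeMap is used rather than a HashMap only so that the output order is deterministic.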

2 participants