Command being timed: "sourmash scripts manysearch sublineages.sig.zip allthebacteria-r0.2-k21.zip -o sublineages.x.atb.csv.2 -k 21 -c 4"
User time (seconds): 10967.28
System time (seconds): 98.50
Percent of CPU this job got: 385%
Elapsed (wall clock) time (h:mm:ss or m:ss): 47:51.17
Average shared text size (kbytes): 0
Average unshared data size (kbytes): 0
Average stack size (kbytes): 0
Average total size (kbytes): 0
Maximum resident set size (kbytes): 20568304
Average resident set size (kbytes): 0
Major (requiring I/O) page faults: 506284
Minor (reclaiming a frame) page faults: 4419875
Voluntary context switches: 63584
Involuntary context switches: 120076
Swaps: 0
File system inputs: 129128808
File system outputs: 21096
Socket messages sent: 0
Socket messages received: 0
Signals delivered: 0
Page size (bytes): 4096
Exit status: 0
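As a quick sanity check on the timing report above, the figures can be cross-checked with a little arithmetic. This is only a sketch: the numbers are copied from the report, and it assumes the "kbytes" reported by `time` are units of 1024 bytes.

```python
# Values copied from the timing report above.
user_s = 10967.28          # User time (seconds)
system_s = 98.50           # System time (seconds)
wall_s = 47 * 60 + 51.17   # Elapsed wall clock time, 47:51.17, in seconds
max_rss_kb = 20568304      # Maximum resident set size (kbytes)

# CPU utilisation is total CPU time divided by wall clock time.
cpu_ratio = (user_s + system_s) / wall_s

# Assumption: 1 kbyte = 1024 bytes, so kbytes / 1024**2 gives GiB.
max_rss_gib = max_rss_kb / 1024**2

print(f"{cpu_ratio:.2f} cores busy on average")  # 3.85
print(f"{max_rss_gib:.1f} GiB peak memory")      # 19.6
```

So the reported 385% CPU matches a run that kept close to all 4 requested cores busy, with peak memory a bit under 20 GiB.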
ATB issue: AllTheBacteria/AllTheBacteria#54 (comment)
hackmd for this issue: https://hackmd.io/-7KJTe14TZGooSW3TAhlgw?both
AllTheBacteria Buchnera and Sulcia search w/sourmash
ATB download: #3247
GTDB files: #3183 (comment)
Build a picklist for the GTDB genomes we're interested in (Sulcia and Buchnera):
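The original command for this step did not survive on this page, so here is a minimal sketch of one way to build such a picklist. It assumes a lineages-style CSV with `ident` and `genus` columns and GTDB-style genus strings such as `g__Buchnera`. The column names, the substring match, and the accessions in the demo are illustrative assumptions, not the author's actual inputs.

```python
import csv
import io

TARGET_GENERA = ("Buchnera", "Sulcia")  # the two genera named in this issue


def build_picklist(lineages_csv, out_csv, targets=TARGET_GENERA):
    """Write the 'ident' of every row whose genus mentions a target name.

    Assumes 'ident' and 'genus' columns exist (hypothetical layout).
    Returns the number of identifiers written.
    """
    reader = csv.DictReader(lineages_csv)
    writer = csv.writer(out_csv, lineterminator="\n")
    writer.writerow(["ident"])
    kept = 0
    for row in reader:
        genus = row.get("genus", "")
        if any(target in genus for target in targets):
            writer.writerow([row["ident"]])
            kept += 1
    return kept


# Tiny in-memory demo; the accessions below are made up.
demo = io.StringIO(
    "ident,genus,species\n"
    "GCA_000000001.1,g__Buchnera,s__Buchnera aphidicola\n"
    "GCA_000000002.1,g__Escherichia,s__Escherichia coli\n"
    "GCA_000000003.1,g__Sulcia,s__Sulcia muelleri\n"
)
out = io.StringIO()
n_kept = build_picklist(demo, out)
print(n_kept)          # 2
print(out.getvalue())
```

A one-column CSV like this is the kind of file sourmash accepts through its `--picklist` option, for example as `picklist.csv:ident:ident`, though the exact arguments used in this issue are not shown here.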
Extract just those genomes from GTDB RS220:
Search ATB v0.2 for overlaps:
(Takes about 10-15 minutes.)
Parse results:
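The parsing code is missing from this page, so what follows is a rough sketch of loading the search CSV and ranking the matches. Only `max_containment_ani` is confirmed as a column name by this issue. `query_name`, `match_name`, and `containment` are assumed here, and the rows in the demo are made up.

```python
import csv
import io


def load_matches(csv_file):
    """Read search results and return rows sorted by containment, highest first.

    Column names other than 'max_containment_ani' are assumptions.
    """
    rows = []
    for row in csv.DictReader(csv_file):
        row["containment"] = float(row["containment"])
        ani = row.get("max_containment_ani", "")
        # An empty ANI field is kept as None rather than guessed.
        row["max_containment_ani"] = float(ani) if ani else None
        rows.append(row)
    rows.sort(key=lambda r: r["containment"], reverse=True)
    return rows


# In-memory demo with made-up names and values.
demo_csv = io.StringIO(
    "query_name,match_name,containment,max_containment_ani\n"
    "query_A,match_1,0.010,0.80\n"
    "query_A,match_2,0.045,0.86\n"
    "query_B,match_3,0.002,\n"
)
matches = load_matches(demo_csv)
for m in matches:
    print(m["query_name"], m["match_name"], m["containment"], m["max_containment_ani"])
```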
Eliminate all ATB genomes already labeled as Buchnera:
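The filtering code is also missing here, so this is a minimal sketch of the idea: drop every match whose ATB genome already carries a Buchnera label and see what is left. It assumes the labeled genome names come from some separate ATB metadata source and that results carry a `match_name` field. Both are hypothetical, as are the names in the demo.

```python
def drop_labeled(rows, labeled_names):
    """Return only the rows whose match is NOT in the already-labeled set."""
    labeled = set(labeled_names)
    return [row for row in rows if row["match_name"] not in labeled]


# Made-up demo: two matches are already labeled Buchnera, one is not.
demo_rows = [
    {"query_name": "query_A", "match_name": "sample_1", "containment": 0.98},
    {"query_name": "query_A", "match_name": "sample_2", "containment": 0.045},
    {"query_name": "query_B", "match_name": "sample_3", "containment": 0.97},
]
already_buchnera = {"sample_1", "sample_3"}  # hypothetical label set

unlabeled = drop_labeled(demo_rows, already_buchnera)
print([row["match_name"] for row in unlabeled])  # ['sample_2']
```

Whatever survives this filter is a candidate unlabeled match and needs a closer look at its containment and ANI.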
So, there's a 4.5% overlap in k-mers with some Buchnera queries for JAGGDU010000001.1 and NZ_JPOS01000001.1, which is probably not enough to call it a Buchnera genome... The estimated ANI is in the 80% range (column max_containment_ani).
tl;dr no matches to Sulcia, only one match to Buchnera, in all the ATB genomes.
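To see why a 4.5% k-mer overlap lands in the 80% ANI range, here is a back-of-the-envelope sketch using the simple point estimate ANI ≈ containment^(1/k). This is a simplification and may differ in detail from the estimator sourmash actually uses. It also assumes the 4.5% figure is a containment value at k=21.

```python
def containment_to_ani(containment, ksize):
    """Simplified point estimate: a k-mer is shared only if all k bases match,
    so containment ~ ANI**k and ANI ~ containment**(1/k)."""
    if not 0.0 < containment <= 1.0:
        raise ValueError("containment must be in (0, 1]")
    return containment ** (1.0 / ksize)


ani = containment_to_ani(0.045, 21)
print(f"{ani:.3f}")  # 0.863
```

Because the estimate takes a 21st root, even a small containment maps to a fairly high ANI, so ANI in the 80% range still means a distant relationship.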
Download the CSV file here for more exploration: https://farm.cse.ucdavis.edu/~ctbrown/buchnera-sulcia.manysearch.zip
benchmark info
with 4 CPUs: