Feature: concatenate bases with sequences #284

Closed
BioTurboNick opened this issue Aug 21, 2023 · 4 comments
Labels
feature request A request for some desired or missing functionality wontfix

Comments

@BioTurboNick

dna"GTAC" * DNA_A
MethodError: no method matching *(::BioSequences.LongSequence{BioSequences.DNAAlphabet{4}}, ::BioSymbols.DNA)

This operation works with strings and chars in Julia, so it seems like the analogous operation should be possible here.

@jakobnissen
Member

You can use push!(dna"GTAC", DNA_A) for this.

More broadly, Base Julia tends to conflate containers of elements with elements:

julia> iterate(5) # scalars are iterable
(5, nothing)

julia> hcat([1], 1) # scalars are equivalent to 0-dimensional tensors
1×2 Matrix{Int64}:
 1  1

julia> eltype('a') # Chars apparently contain chars???
Char

I think this is a design mistake, and I'm skeptical of bringing it into BioJulia. That's why we have:

julia> iterate(DNA_A)
ERROR: MethodError: no method matching iterate(::DNA)

julia> append!(dna"TAG", DNA_A)
ERROR: MethodError: no method matching append!(::LongSequence{DNAAlphabet{4}}, ::DNA)

julia> dna"TAG" * DNA_A
ERROR: MethodError: no method matching *(::LongSequence{DNAAlphabet{4}}, ::DNA)

@kescobo
Member

kescobo commented Aug 21, 2023

I agree that it's a design mistake, but I'm slightly less skeptical of replicating it. I think I'm on @jakobnissen's side, but could be persuaded. I do think there's some utility in matching the semantics of Base, even when they're probably not great.

That said, I think the desire to use append!() over push!(), at least for me, is just a holdover from Python. I definitely used append!() with scalars for ages thinking it was correct. So if this is an opportunity to educate users about the right functions to use, maybe that's a good thing.

@kescobo kescobo added feature request A request for some desired or missing functionality wontfix labels Aug 21, 2023
@camilogarciabotero
Member

Maybe we could have an operator that mimics the behavior. Currently it is possible to do:

LongDNA{2}("ACGT") * LongDNA{2}("TGCA")
8nt DNA Sequence:
ACGTTGCA

What if we had an operator that converts DNA_X to LongDNA{T}([DNA_X]) and thereby enables concatenation with single symbols? Would this also be bad design?

@jakobnissen
Member

You can certainly convert a biosymbol to a sequence - you just have to be explicit about it:

julia> dna"S" * LongDNA{4}([DNA_W])
2nt DNA Sequence:
SW

4 participants