diff --git a/.gitignore b/.gitignore index c494de9..a6df95b 100644 --- a/.gitignore +++ b/.gitignore @@ -6,3 +6,4 @@ docs/site/ .DS_Store benchmark/tune.json Manifest.toml +*.swp diff --git a/docs/src/documentation.md b/docs/src/documentation.md index 5f32eb8..30cd1f1 100644 --- a/docs/src/documentation.md +++ b/docs/src/documentation.md @@ -20,7 +20,7 @@ downloadpdb("1EN2") To parse a PDB file into a Structure-Model-Chain-Residue-Atom framework: -```julia +```julia-repl julia> struc = read("/path/to/pdb/file.pdb", PDB) ProteinStructure 1EN2.pdb with 1 models, 1 chains (A), 85 residues, 754 atoms ``` @@ -29,7 +29,7 @@ mmCIF files can be read into the same data structure with `read("/path/to/cif/fi The keyword argument `gzip`, default `false`, determines if the file is gzipped. If you want to read an mmCIF file into a dictionary to query yourself (e.g. to access metadata fields), use [`MMCIFDict`](@ref): -```julia +```julia-repl julia> mmcif_dict = MMCIFDict("/path/to/cif/file.cif") mmCIF dictionary with 716 fields @@ -144,6 +144,7 @@ This can be changed by setting `expand_disordered` to `true` in [`collectatoms`] Selectors are functions passed as additional arguments to these functions. Only elements that return `true` when passed to all the selector are retained. +See also the selection syntax [described below](@ref String-selection-syntax). For example: | Command | Action | Return type | @@ -199,7 +200,7 @@ collectatoms(struc, at -> x(at) < 0) [`countatoms`](@ref), [`countresidues`](@ref), [`countchains`](@ref) and [`countmodels`](@ref) can be used to count elements with the same selector API. For example: -```julia +```julia-repl julia> countatoms(struc) 754 @@ -215,7 +216,7 @@ julia> countatoms(struc, expand_disordered=true) The amino acid sequence of a protein can be retrieved by passing an element to [`LongAA`](@ref) with optional residue selectors: -```julia +```julia-repl julia> LongAA(struc['A'], standardselector) 85aa Amino Acid Sequence: RCGSQGGGSTCPGLRCCSIWGWCGDSEPYCGRTCENKCW…RCGAAVGNPPCGQDRCCSVHGWCGGGNDYCSGGNCQYRC @@ -227,7 +228,7 @@ See [BioSequences.jl](https://github.com/BioJulia/BioSequences.jl) and [BioAlign [`LongAA`](@ref) is an alias for `LongSequence{AminoAcidAlphabet}`. For example, to see the alignment of CDK1 and CDK2 (this example also makes use of Julia's [broadcasting](https://docs.julialang.org/en/v1/manual/arrays/#Broadcasting-1)): -```julia +```julia-repl julia> struc1, struc2 = retrievepdb.(["4YC6", "1HCL"]) 2-element Array{ProteinStructure,1}: ProteinStructure 4YC6.pdb with 1 models, 8 chains (A,B,C,D,E,F,G,H), 1420 residues, 12271 atoms @@ -270,6 +271,99 @@ PairwiseAlignmentResult{Int64, LongAA, LongAA}: In fact, [`pairalign`](@ref) is extended to carry out the above steps and return the alignment by calling `pairalign(struc1["A"], struc2["A"], standardselector)` in this case. `scoremodel` and `aligntype` are keyword arguments with the defaults shown above. +## String selection syntax + +!!! compat + The string-selection syntax was introduced in version 3.13. + +`BioStructures.jl` exports the `sel` macro that provides a practical way to collect atoms from a structure +using a natural selection syntax. It must be used as: +```julia +collectatoms(struc, sel"selection string") +``` +where `struc` is the input structure and `selection string` defines the atoms to be selected, with the +operators and keyword defined below. + +For example: + +```julia-repl +julia> struc = retrievepdb("4YC6") +[ Info: Downloading file from PDB: 4YC6 +ProteinStructure 4YC6.pdb with 1 models, 8 chains (A,B,C,D,E,F,G,H), 1420 residues, 12271 atoms + +julia> collectatoms(struc, sel"name CA and resnumber <= 5") +24-element Vector{AbstractAtom}: + Atom CA with serial 2, coordinates [17.363, 31.409, -27.535] + Atom CA with serial 10, coordinates [20.769, 32.605, -28.801] + ⋮ + Atom CA with serial 11096, coordinates [-8.996, 6.094, -29.097] +``` + +There are also macro-keywords to select groups of atoms with specific properties. For example: + +```julia-repl +julia> ats = collectatoms(struc, sel"acidic and name N") +188-element Vector{AbstractAtom}: + Atom N with serial 9, coordinates [19.33, 32.429, -28.593] + Atom N with serial 18, coordinates [21.056, 33.428, -26.564] + ⋮ + Atom N with serial 11603, coordinates [-0.069, 21.516, -32.604] + +julia> resname.(ats) +188-element Vector{SubString{String}}: + "GLU" + "ASP" + ⋮ + "GLU" +``` + +The current supported operators are: + +| Operators | Acts on | +| :-------------------------- | :---------------------------------- | +| `=`, `>`, `<` `>=`, `<=` | Properties with real values. | +| `and`, `or`, `not` | Selections subsets. | + +The keywords supported are: + + +| Keyword | Input type | Selects for | +| :-------------------------- | :-----------:|:---------------------| +| `index` | `Int` | `serial` | +| `serial` | `Int` | `serial` | +| `resnumber` | `Int` | `resnumber` | +| `resnum` | `Int` | `resnumber` | +| `resid` | `Int` | `resid` | +| `occupancy` | `Real` | `occupancy` | +| `beta` | `Real` | `tempfactor` | +| `tempfactor` | `Real` | `tempfactor` | +| `model` | `Int` | `modelnumber` | +| `modelnumber` | `Int` | `modelnumber` | +| `name` | `String` | `atomname` | +| `atomname` | `String` | `atomname` | +| `segname` | `String` | `segname` | +| `resname` | `String` | `resname` | +| `chain` | `String` | `chainid` | +| `chainid` | `String` | `chainid` | +| `element` | `String` | `element` | +| `water` | | Water molecules | +| `protein` | | Protein atoms | +| `polar` | | Polar residues | +| `nonpolar` | | Non-polar residues | +| `basic` | | Basic residues | +| `acidic` | | Acidic residues | +| `charged` | | Charged residues | +| `aliphatic` | | Aliphatic residues | +| `aromatic` | | Aromatic residues | +| `hydrophobic` | | Hydrophobic residues | +| `neutral` | | Neutral residues | +| `backbone` | | Backbone atoms | +| `heavyatom` | | Heavy atoms | +| `disordered` | | Disordered atoms | +| `sidechain` | | Side-chain atoms | +| `all` | | all atoms | + + ## Spatial calculations Various functions are provided to calculate spatial quantities for proteins: @@ -304,7 +398,7 @@ The [`omegaangle`](@ref) and [`phiangle`](@ref) functions measure the angle betw The [`psiangle`](@ref) function measures between the given index and the one after. For example: -```julia +```julia-repl julia> distance(struc['A'][10], struc['A'][20]) 10.782158874733762 @@ -325,7 +419,7 @@ julia> rad2deg(psiangle(struc['A'], 50)) [`ContactMap`](@ref) can also be given two structural elements as arguments, in which case a non-symmetrical 2D array is returned showing contacts between the elements. The underlying `BitArray{2}` for [`ContactMap`](@ref) `contacts` can be accessed with `contacts.data` if required. -```julia +```julia-repl julia> contacts = ContactMap(collectatoms(struc['A'], cbetaselector), 8.0) Contact map of size (85, 85) ``` @@ -387,7 +481,7 @@ The contacting elements in a molecular structure form a graph, and this can be r This extends `MetaGraph` from [MetaGraphs.jl](https://github.com/JuliaGraphs/MetaGraphs.jl), allowing you to use all the graph analysis tools in [Graphs.jl](https://github.com/JuliaGraphs/Graphs.jl). For example: -```julia +```julia-repl julia> using Graphs, MetaGraphs julia> mg = MetaGraph(collectatoms(struc["A"], cbetaselector), 8.0) @@ -488,7 +582,7 @@ In this case download the mmCIF file or MMTF file instead. To parse an existing PDB file into a Structure-Model-Chain-Residue-Atom framework: -```julia +```julia-repl julia> struc = read("/path/to/pdb/file.pdb", PDB) ProteinStructure 1EN2.pdb with 1 models, 1 chains (A), 85 residues, 754 atoms ``` @@ -506,7 +600,7 @@ Various options can be set through optional keyword arguments when parsing PDB/m Use [`retrievepdb`](@ref) to download and parse a PDB file into a Structure-Model-Chain-Residue-Atom framework in a single line: -```julia +```julia-repl julia> struc = retrievepdb("1ALW", dir="path/to/pdb/directory") INFO: Downloading PDB: 1ALW ProteinStructure 1ALW.pdb with 1 models, 2 chains (A,B), 346 residues, 2928 atoms @@ -514,7 +608,7 @@ ProteinStructure 1ALW.pdb with 1 models, 2 chains (A,B), 346 residues, 2928 atom If you prefer to work with data frames rather than the data structures in BioStructures, the `DataFrame` constructor from [DataFrames.jl](https://github.com/JuliaData/DataFrames.jl) has been extended to construct relevant data frames from lists of atoms or residues: -```julia +```julia-repl julia> using DataFrames julia> df = DataFrame(collectatoms(struc)); @@ -547,7 +641,7 @@ As with file writing disordered entities are expanded by default but this can be You can read and write files containing multiple mmCIF data blocks (equivalent to a `MMCIFDict` in this package) with the [`readmultimmcif`](@ref) and [`writemultimmcif`](@ref) functions. An example of such a file is the PDB's [chemical component dictionary](https://www.wwpdb.org/data/ccd). -```julia +```julia-repl julia> ccd = readmultimmcif("components.cif.gz"; gzip=true); julia> ccd["2W4"] @@ -590,14 +684,14 @@ Multi-character chain IDs can be written to mmCIF and MMTF files but will throw If you want the PDB record line for an [`AbstractAtom`](@ref), use [`pdbline`](@ref). For example: -```julia +```julia-repl julia> pdbline(at) "HETATM 101 C A X B 20 10.500 20.123 -5.123 0.50 50.13 C1+" ``` If you want to generate a PDB record line from values directly, do so using an [`AtomRecord`](@ref): -```julia +```julia-repl julia> pdbline(AtomRecord(false, 669, "CA", ' ', "ILE", "A", 90, ' ', [31.743, 33.11, 31.221], 1.00, 25.76, "C", "")) "ATOM 669 CA ILE A 90 31.743 33.110 31.221 1.00 25.76 C " ``` diff --git a/src/BioStructures.jl b/src/BioStructures.jl index 78fa978..b00e80e 100644 --- a/src/BioStructures.jl +++ b/src/BioStructures.jl @@ -31,6 +31,7 @@ include("mmcif.jl") include("mmtf.jl") include("spatial.jl") include("secondary.jl") +include("select.jl") if !isdefined(Base, :get_extension) include("../ext/BioStructuresDataFramesExt.jl") diff --git a/src/select.jl b/src/select.jl new file mode 100644 index 0000000..55ad637 --- /dev/null +++ b/src/select.jl @@ -0,0 +1,247 @@ +# +# Function to perform the most important selections that are used in solute-solvent +# analysis +# +export @sel_str + +# +# Dictionaries of residue, atom, and element types, as originally written +# in PDBTools.jl (at first sight, seems more general that the current) +# +# +# Data for natural protein residues +# +Base.@kwdef struct AminoAcidResidue + name::String + three_letter_code::String + one_letter_code::String + type::String + polar::Bool + hydrophobic::Bool + mono_isotopic_mass::Float64 + mass::Float64 + charge::Int + custom::Bool = true +end +#! format: off +const protein_residues = Dict{String,AminoAcidResidue}( + "ALA" => AminoAcidResidue("Alanine", "ALA", "A", "Aliphatic", false, false, 71.037114, 71.0779, 0, false), + "ARG" => AminoAcidResidue("Arginine", "ARG", "R", "Basic", true, false, 156.101111, 156.1857, 1, false), + "ASN" => AminoAcidResidue("Asparagine", "ASN", "N", "Amide", true, false, 114.042927, 114.1026, 0, false), + "ASP" => AminoAcidResidue("Aspartic acid", "ASP", "D", "Acidic", true, false, 115.026943, 115.0874, -1, false), + "CYS" => AminoAcidResidue("Cysteine", "CYS", "C", "Sulfuric", false, false, 103.009185, 103.1429, 0, false), + "GLN" => AminoAcidResidue("Glutamine", "GLN", "Q", "Amide", true, false, 128.058578, 128.1292, 0, false), + "GLU" => AminoAcidResidue("Glutamic acid", "GLU", "E", "Acidic", true, false, 129.042593, 129.1140, -1, false), + "GLY" => AminoAcidResidue("Glycine", "GLY", "G", "Aliphatic", false, false, 57.021464, 57.0513, 0, false), + "HIS" => AminoAcidResidue("Histidine", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 0, false), + "ILE" => AminoAcidResidue("Isoleucine", "ILE", "I", "Aliphatic", false, true, 113.084064, 113.1576, 0, false), + "LEU" => AminoAcidResidue("Leucine", "LEU", "L", "Aliphatic", false, true, 113.084064, 113.1576, 0, false), + "LYS" => AminoAcidResidue("Lysine", "LYS", "K", "Basic", true, false, 128.094963, 128.1723, 1, false), + "MET" => AminoAcidResidue("Methionine", "MET", "M", "Sulfuric", false, false, 131.040485, 131.1961, 0, false), + "PHE" => AminoAcidResidue("Phenylalanine", "PHE", "F", "Aromatic", false, true, 147.068414, 147.1739, 0, false), + "PRO" => AminoAcidResidue("Proline", "PRO", "P", "Cyclic", false, false, 97.052764, 97.1152, 0, false), + "SER" => AminoAcidResidue("Serine", "SER", "S", "Hydroxylic", true, false, 87.032028, 87.07730, 0, false), + "THR" => AminoAcidResidue("Threonine", "THR", "T", "Hydroxylic", true, false, 101.047679, 101.1039, 0, false), + "TRP" => AminoAcidResidue("Tryptophan", "TRP", "W", "Aromatic", false, true, 186.079313, 186.2099, 0, false), + "TYR" => AminoAcidResidue("Tyrosine", "TYR", "Y", "Aromatic", true, false, 163.063320, 163.1733, 0, false), + "VAL" => AminoAcidResidue("Valine", "VAL", "V", "Aliphatic", false, true, 99.068414, 99.1311, 0, false), + # Alternate protonation states for CHARMM and AMBER + "ASPP" => AminoAcidResidue("Aspartic acid (protonated)", "ASP", "D", "Acidic", true, false, 115.026943, 115.0874, 0, false), + "GLUP" => AminoAcidResidue("Glutamic acid (protonated)", "GLU", "E", "Acidic", true, false, 129.042593, 129.1140, 0, false), + "HSD" => AminoAcidResidue("Histidine (D)", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 0, false), + "HSE" => AminoAcidResidue("Histidine (E)", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 0, false), + "HSP" => AminoAcidResidue("Histidine (doubly protonated)", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 1, false), + "HID" => AminoAcidResidue("Histidine (D)", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 0, false), + "HIE" => AminoAcidResidue("Histidine (E)", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 0, false), + "HIP" => AminoAcidResidue("Histidine (doubly protonated)", "HIS", "H", "Aromatic", true, false, 137.058912, 137.1393, 1, false), +) +#! format: on +isprotein(atom::AbstractAtom) = haskey(protein_residues, resname(atom)) +isacidic(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].type == "Acidic" +isaliphatic(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].type == "Aliphatic" +isaromatic(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].type == "Aromatic" +isbasic(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].type == "Basic" +ischarged(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].charge != 0 +isneutral(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].charge == 0 +ishydrophobic(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].hydrophobic +ispolar(atom::AbstractAtom) = isprotein(atom) && protein_residues[resname(atom)].polar +isnonpolar(atom::AbstractAtom) = isprotein(atom) && !ispolar(atom) +const backbone_atoms = ["N", "CA", "C", "O"] +isbackbone(atom::AbstractAtom; backbone_atoms=backbone_atoms) = isprotein(atom) && atomname(atom) in backbone_atoms +const not_side_chain_atoms = ["N", "CA", "C", "O", "HN", "H", "HA", "HT1", "HT2", "HT3"] +issidechain(atom::AbstractAtom; not_side_chain_atoms=not_side_chain_atoms) = isprotein(atom) && !(atomname(atom) in not_side_chain_atoms) +const water_residues = ["HOH", "OH2", "TIP", "TIP3", "TIP3P", "TIP4P", "TIP5P", "TIP7P", "SPC", "SPCE"] +iswater(atom::AbstractAtom) = strip(resname(atom)) in water_residues + +#= + Select + +This structure acts a function when used within typical julia filtering functions, +by converting a string selection into a call to query call. + +=# +struct Select <: Function + sel::String +end +(s::Select)(at) = apply_query(parse_query(s.sel), at) +macro sel_str(str) + Select(str) +end + +# collect atoms function +function collectatoms(struc::StructuralElementOrList, sel::Select) + atoms = collectatoms(struc) + return filter!(sel, atoms) +end + +# Comparison operators +const operators = ( + "=" => (x, y) -> isequal(x, y), + "<" => (x, y) -> isless(x, y), + ">" => (x, y) -> isless(y, x), + "<=" => (x, y) -> (!isless(y, x)), + ">=" => (x, y) -> (!isless(x, y)), +) + +# +# Keywords +# +struct Keyword{F<:Function} + ValueType::Type + name::String + getter::F + operators::Tuple +end + +# +# Macro keywords (functions without parameters) +# +struct MacroKeyword{F<:Function} + name::String + getter::F +end +(key::MacroKeyword)(::AbstractVector{<:AbstractString}) = key.getter + +function (key::Keyword)(s::AbstractVector{<:AbstractString}) + # if 1.6 compatibility is not needed, this can be replaced by + # (; getter, operators) = key + getter, operators = key.getter, key.operators + for op in operators + if (i = findfirst(==(op.first), s)) !== nothing + return el -> op.second(getter(el), parse_to_type(key, s[i+1])) + end + end + # If no operator was found, assume that `=` was intended + return el -> isequal(getter(el), parse_to_type(key, s[1])) +end + +#= + parse_to_type(key::Keyword, val::String) + +Tries to parse `val` into the type of value expected by `key.ValueType`. + +=# +function parse_to_type(key::Keyword, val) + if key.ValueType == String + return val + end + try + val = parse(key.ValueType, val) + return val + catch + throw(ArgumentError("Could not parse $val for keyword $(key.name), expected $(key.ValueType)")) + end +end + +# +# Keywords +# +keywords = [ + Keyword(Int, "index", serial, operators), + Keyword(Int, "serial", serial, operators), + Keyword(Int, "resnumber", resnumber, operators), + Keyword(Int, "resnum", resnumber, operators), + Keyword(Int, "resid", resid, operators), + Keyword(Float64, "occupancy", occupancy, operators), + Keyword(Float64, "beta", tempfactor, operators), + Keyword(Float64, "tempfactor", tempfactor, operators), + Keyword(Int, "model", modelnumber, operators), + Keyword(Int, "modelnumber", modelnumber, operators), + Keyword(String, "name", atomname, operators), + Keyword(String, "atomname", atomname, operators), + #Keyword(String, "segname", segname, operators), + Keyword(String, "resname", resname, operators), + Keyword(String, "chain", chainid, operators), + Keyword(String, "chainid", chainid, operators), + Keyword(String, "element", element, operators), + MacroKeyword("water", iswater), + MacroKeyword("protein", isprotein), + MacroKeyword("polar", ispolar), + MacroKeyword("nonpolar", isnonpolar), + MacroKeyword("basic", isbasic), + MacroKeyword("acidic", isacidic), + MacroKeyword("charged", ischarged), + MacroKeyword("aliphatic", isaliphatic), + MacroKeyword("aromatic", isaromatic), + MacroKeyword("hydrophobic", ishydrophobic), + MacroKeyword("neutral", isneutral), + MacroKeyword("backbone", isbackbone), + MacroKeyword("heavyatom", heavyatomselector), + MacroKeyword("disordered", isdisorderedatom), + MacroKeyword("sidechain", issidechain), + MacroKeyword("all", el -> true), +] + +# +# parse_query and apply_query are a very gentle contribution given by +# CameronBieganek in https://discourse.julialang.org/t/parsing-selection-syntax/43632/9 +# while explaining to me how to creat a syntex interpreter +# + +#= + parse_query(selection:String) + +Calls `parse_query_vector` after splitting the selection string. + +=# +parse_query(selection::String) = parse_query_vector(split(selection)) + +#= + + parse_query_vector(s::AbstractVector{<:AbstractString}) + +=# +function parse_query_vector(s) + # or, and, not + if (i = findfirst(==("or"), s)) !== nothing + deleteat!(s, i) + (|, parse_query_vector.((s[1:i-1], s[i:end]))...) + elseif (i = findfirst(==("and"), s)) !== nothing + deleteat!(s, i) + (&, parse_query_vector.((s[1:i-1], s[i:end]))...) + elseif (i = findfirst(==("not"), s)) !== nothing + deleteat!(s, i) + (!, parse_query_vector(s[i:end])) + # keywords + else + for key in keywords + if (i = findfirst(==(key.name), s)) !== nothing + deleteat!(s, i) + return key(s) + end + end + throw(ArgumentError(("Unable to parse selection string."))) + end +end + +function apply_query(q, a) + if !(q isa Tuple) + q(a) + else + f, args = Iterators.peel(q) + f(apply_query.(args, Ref(a))...) + end +end + + + diff --git a/test/runtests.jl b/test/runtests.jl index 7105a14..4d14858 100644 --- a/test/runtests.jl +++ b/test/runtests.jl @@ -3294,4 +3294,50 @@ end rm(temp_filename, force=true) rm(temp_dir, recursive=true, force=true) +@testset "Selection syntax" begin + + struc = retrievepdb("4YC6") + + @test length(collectatoms(struc, sel"all")) == 12271 + @test length(collectatoms(struc, sel"name CA")) == 1420 + sel = collectatoms(struc, sel"index = 13") + @test length(sel) == 1 + @test serial(sel[1]) == 13 + + @test length(collectatoms(struc, sel"index > 1 and index < 13")) == 11 + @test length(collectatoms(struc, sel"index >= 1 and index <= 13")) == 13 + @test length(collectatoms(struc, sel"protein")) == 11632 + @test length(collectatoms(struc, sel"water")) == 639 + @test length(collectatoms(struc, sel"resname GLY")) == 320 + @test length(collectatoms(struc, sel"protein and resnum = 2")) == 36 + @test length(collectatoms(struc, sel"neutral")) == 8300 + @test length(collectatoms(struc, sel"charged")) == 3332 + @test length(collectatoms(struc, sel"sidechain")) == 5952 + @test length(collectatoms(struc, sel"acidic")) == 1604 + @test length(collectatoms(struc, sel"basic")) == 1728 + @test length(collectatoms(struc, sel"hydrophobic")) == 3604 + @test length(collectatoms(struc, sel"not hydrophobic")) == 8667 + @test length(collectatoms(struc, sel"aliphatic")) == 3276 + @test length(collectatoms(struc, sel"aromatic")) == 2340 + @test length(collectatoms(struc, sel"polar")) == 6544 + @test length(collectatoms(struc, sel"nonpolar")) == 5088 + @test length(collectatoms(struc, sel"backbone")) == 5680 + @test length(collectatoms(struc, sel"element H")) == 0 + @test length(collectatoms(struc, sel"name CA or element S")) == 1464 + @test length(collectatoms(struc, sel"disordered")) == 68 + + # Missing: + # segment + #@test length(filter(sel"segname PROT", struc)) == 1463 + # residue should be the incremental residue number + #@test length(filter(sel"residue = 2", struc)) == 11 + + # input errors: invalid selection syntax + @test_throws ArgumentError collectatoms(struc, sel"abc") + # input errors: invalid value type + @test_throws ArgumentError collectatoms(struc, sel"index = A") + @test_throws ArgumentError collectatoms(struc, sel"resnum C") + +end + end # TestBioStructures