diff --git a/dev/.documenter-siteinfo.json b/dev/.documenter-siteinfo.json index bbccada..02af203 100644 --- a/dev/.documenter-siteinfo.json +++ b/dev/.documenter-siteinfo.json @@ -1 +1 @@ -{"documenter":{"julia_version":"1.10.3","generation_timestamp":"2024-05-20T04:14:52","documenter_version":"1.4.0"}} \ No newline at end of file +{"documenter":{"julia_version":"1.10.3","generation_timestamp":"2024-05-23T12:58:09","documenter_version":"1.4.0"}} \ No newline at end of file diff --git a/dev/api/index.html b/dev/api/index.html index 179b023..b4780b1 100644 --- a/dev/api/index.html +++ b/dev/api/index.html @@ -1,5 +1,5 @@ -API · BioStructures.jl

BioStructures API

On Julia 1.9 and later the DataFrame and MetaGraph constructors are in package extensions in order to reduce the number of dependencies. Call using DataFrames and using Graphs, MetaGraphs respectively to access these functions.

BioSequences.LongAAMethod
LongAA(el)

Return the amino acid sequence of a structural element.

Additional arguments are residue selector functions - only residues that return true from all the functions are retained. The gaps keyword argument determines whether to add gaps to the sequence based on missing residue numbers (default true). See BioSequences.jl for more on how to use sequences. LongAA is an alias for LongSequence{AminoAcidAlphabet}.

source
BioStructures.ContactMapType
ContactMap(element, contact_distance)
+API · BioStructures.jl

BioStructures API

On Julia 1.9 and later the DataFrame and MetaGraph constructors are in package extensions in order to reduce the number of dependencies. Call using DataFrames and using Graphs, MetaGraphs respectively to access these functions.

BioSequences.LongAAMethod
LongAA(el)

Return the amino acid sequence of a structural element.

Additional arguments are residue selector functions - only residues that return true from all the functions are retained. The gaps keyword argument determines whether to add gaps to the sequence based on missing residue numbers (default true). See BioSequences.jl for more on how to use sequences. LongAA is an alias for LongSequence{AminoAcidAlphabet}.

source
BioStructures.ContactMapType
ContactMap(element, contact_distance)
 ContactMap(element_one, element_two, contact_distance)
 ContactMap(bit_array_2D)

Calculate the contact map for a StructuralElementOrList, or between two StructuralElementOrLists.

This returns a ContactMap type containing a BitArray{2} with true where the sub-elements are no further than the contact distance and false otherwise. When one element is given as input this returns a symmetric square matrix. To directly access the underlying data of ContactMap cm, use cm.data.

Examples

cbetas_A = collectatoms(struc["A"], cbetaselector)
 cbetas_B = collectatoms(struc["B"], cbetaselector)
@@ -15,7 +15,7 @@
 
 # Write the contact map to file
 using DelimitedFiles
-writedlm("contacts.out", Int64.(cm.data), " ")
source
BioStructures.DistanceMapType
DistanceMap(element)
 DistanceMap(element_one, element_two)
 DistanceMap(float_array_2D)

Calculate the distance map for a StructuralElementOrList, or between two StructuralElementOrLists.

This returns a DistanceMap type containing a Array{Float64, 2} with minimum distances between the sub-elements. When one element is given as input this returns a symmetric square matrix. To directly access the underlying data of DistanceMap dm, use dm.data.

Examples

cbetas_A = collectatoms(struc["A"], cbetaselector)
 cbetas_B = collectatoms(struc["B"], cbetaselector)
@@ -31,55 +31,55 @@
 
 # Write the distance map to file
 using DelimitedFiles
-writedlm("distances.out", dm.data, " ")
source
BioStructures.MMCIFDictType
MMCIFDict(filepath; gzip=false)
 MMCIFDict(io; gzip=false)
-MMCIFDict()

A macromolecular Crystallographic Information File (mmCIF) dictionary.

Can be accessed using similar functions to a standard Dict. Keys are field names as a String and values are always Vector{String}, even for multiple components or numerical data. To directly access the underlying dictionary of MMCIFDict d, use d.dict. Call MMCIFDict with a filepath or stream to read the dictionary from that source. The keyword argument gzip (default false) determines if the input is gzipped.

source
BioStructures.MMTFDictType
MMTFDict(filepath; gzip=false)
+MMCIFDict()

A macromolecular Crystallographic Information File (mmCIF) dictionary.

Can be accessed using similar functions to a standard Dict. Keys are field names as a String and values are always Vector{String}, even for multiple components or numerical data. To directly access the underlying dictionary of MMCIFDict d, use d.dict. Call MMCIFDict with a filepath or stream to read the dictionary from that source. The keyword argument gzip (default false) determines if the input is gzipped.

source
BioStructures.MMTFDictType
MMTFDict(filepath; gzip=false)
 MMTFDict(io; gzip=false)
-MMTFDict()

A Macromolecular Transmission Format (MMTF) dictionary.

Can be accessed using similar functions to a standard Dict. Keys are field names as a String and values are various types. To directly access the underlying dictionary of MMTFDict d, use d.dict. Call MMTFDict with a filepath or stream to read the dictionary from that source. The keyword argument gzip (default false) determines if the file is gzipped.

source
BioStructures.TransformationType
Transformation(el1, el2, residue_selectors...)
+MMTFDict()

A Macromolecular Transmission Format (MMTF) dictionary.

Can be accessed using similar functions to a standard Dict. Keys are field names as a String and values are various types. To directly access the underlying dictionary of MMTFDict d, use d.dict. Call MMTFDict with a filepath or stream to read the dictionary from that source. The keyword argument gzip (default false) determines if the file is gzipped.

source
BioStructures.TransformationType
Transformation(el1, el2, residue_selectors...)
 Transformation(coords1, coords2)
-Transformation(trans1, trans2, rot)

A 3D transformation to map one set of coordinates onto another, found using the Kabsch algorithm.

When called with structural elements, carries out a pairwise alignment and superimposes on atoms from aligned residues. In this case, keyword arguments for pairwise alignment can be given, see pairalign. The residue selectors determine which residues to do the pairwise alignment on. The keyword argument alignatoms is an atom selector that selects the atoms to calculate the superimposition on (default calphaselector). Can also be called with two sets of coordinates of the same size, with the number of dimensions in the first axis and the number of points in the second axis.

The returned Transformation object consists of the mean coordinates of the first set, the mean coordinates of the second set, the rotation to map the first centred set onto the second centred set, and the indices of the aligned residues in the first and second elements if relevant.

source
Base.collectMethod
collect(el)

Returns a Vector of the sub-elements in a StructuralElementOrList, e.g. AbstractAtoms in a Residue or AbstractResidues in a Chain.

source
Base.readMethod
read(filepath::AbstractString, format::Type; <keyword arguments>)
-read(input::IO, format::Type; <keyword arguments>)

Read a Protein Data Bank (PDB) file and return a ProteinStructure.

Arguments

  • format::Type: the format of the PDB file; options are PDB, MMCIF and MMTF.
  • structure_name::AbstractString: the name given to the returned ProteinStructure; defaults to the file name.
  • remove_disorder::Bool=false: whether to remove atoms with alt loc ID not ' ' or 'A'.
  • read_std_atoms::Bool=true: whether to read standard ATOM records.
  • read_het_atoms::Bool=true: whether to read HETATOM records.
  • run_dssp::Bool=false: whether to run DSSP to assign secondary structure.
  • run_stride::Bool=false: whether to run STRIDE to assign secondary structure.
  • gzip::Bool=false: whether the input is gzipped, not available for PDB format.
source
BioAlignments.pairalignMethod
pairalign(el1, el2, residue_selectors...)

Carries out a pairwise sequence alignment between the sequences of two structural elements.

Additional arguments are residue selector functions - only residues that return true from all the functions are retained. The keyword arguments scoremodel (default AffineGapScoreModel(BLOSUM62, gap_open=-10, gap_extend=-1)) and aligntype (default GlobalAlignment()) determine the properties of the alignment. See BioAlignments.jl for more on how to use alignments.

source
BioGenerics.distanceMethod
distance(element_one, element_two, atom_selectors...)

Get the minimum distance in Å between two StructuralElementOrLists.

Additional arguments are atom selector functions - only atoms that return true from the functions are retained.

source
BioStructures.allselectorMethod
allselector(at)
-allselector(res)

Trivial selector that returns true for any AbstractAtom or AbstractResidue. Use it to select all atoms or residues.

source
BioStructures.altlocidsMethod
altlocids(dis_at)

Get the list of alternative location IDs in an AbstractAtom as a Vector{Char}, sorted by atom serial.

source
BioStructures.applyselectors!Method
applyselectors!(els, selectors...)

Removes from a Vector of StructuralElements all elements that do not return true from all the selector functions.

source
BioStructures.applyselectorsMethod
applyselectors(els, selectors...)

Returns a copy of a Vector of StructuralElements with all elements that do not return true from all the selector functions removed.

source
BioStructures.atomidMethod
atomid(at)

Get a descriptive atom ID for an AbstractAtom as a Tuple of the form (full residue ID, residue name, atom name).

source
BioStructures.atomnameMethod
atomname(at; strip=true)

Get the atom name of an AbstractAtom as a String. strip determines whether surrounding whitespace is stripped.

source
BioStructures.atomnamesMethod
atomnames(res; strip=true)

Get the sorted list of AbstractAtoms in an AbstractResidue. strip determines whether surrounding whitespace is stripped.

source
BioStructures.atomnameselectorMethod
atomnameselector(at, atom_names; strip=true)

Determines if an AbstractAtom has its atom name in a list of names. strip determines whether surrounding whitespace is stripped from the atom name before it is checked in the list.

source
BioStructures.bondangleMethod
bondangle(atom_a, atom_b, atom_c)
-bondangle(vec_ba, vec_bc)

Calculate the bond or pseudo-bond angle in radians between three AbstractAtoms or two vectors.

The angle between B→A and B→C is returned in the range 0 to π.

source
BioStructures.cbetaselectorMethod
cbetaselector(at)

Determines if an AbstractAtom is not a hetero-atom and corresponds to a Cβ atom, or a Cα atom in glycine.

source
BioStructures.chainMethod
chain(at)
-chain(res)

Return the Chain that an AbstractAtom or AbstractResidue belongs to.

source
BioStructures.chainid!Method
chainid!(ch, id)
-chainid!(res, id)

Set the chain ID of a Chain or an AbstractResidue to a new String.

If a chain with this ID already exists, it will be removed from its current chain and added to that chain. If a chain with this ID does not exist, a new chain will be added to the model and this residue will be added to it. If moving this residue from a chain to a new chain leaves the old chain without residues, the old chain will be removed from the Model.

source
BioStructures.chainidsMethod
chainids(mod)
-chainids(struc)

Get the sorted chain IDs of the chains in a Model, or the default Model of a ProteinStructure, as a Vector{String}.

source
BioStructures.chainsMethod
chains(mod)
-chains(struc)

Return the dictionary of Chains in a Model, or the default Model of a ProteinStructure.

source
BioStructures.chargeMethod
charge(at; strip=true)

Get the charge on an AbstractAtom as a String. The charge is set to " " if not specified during atom creation. strip determines whether surrounding whitespace is stripped.

source
BioStructures.choosedefaultaltlocidMethod
choosedefaultaltlocid(at_one, at_two)

Determine which of two Atoms representing a disorered atom better qualifies as the default location.

The Atom with the highest occupancy is chosen; in the case of ties the Atom with the lowest alternative location ID in alphabetical order is chosen.

source
BioStructures.coilselectorMethod
coilselector(res)
-coilselector(at)

Determines if an AbstractResidue or AbstractAtom is part of a coil, i.e. whether the secondary structure code is in coilsscodes.

source
BioStructures.collectatomsMethod
collectatoms(el)

Returns a Vector of the atoms in a StructuralElementOrList.

Additional arguments are atom selector functions - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered atoms separately.

source
BioStructures.collectchainsMethod
collectchains(el)

Returns a Vector of the chains in a StructuralElementOrList.

Additional arguments are chain selector functions - only chains that return true from all the functions are retained.

source
BioStructures.collectmodelsMethod
collectmodels(el)

Returns a Vector of the models in a StructuralElementOrList.

Additional arguments are model selector functions - only models that return true from all the functions are retained.

source
BioStructures.collectresiduesMethod
collectresidues(el)

Returns a Vector of the residues in a StructuralElementOrList.

Additional arguments are residue selector functions - only residues that return true from all the functions are retained. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered residues separately.

source
BioStructures.coordarrayMethod
coordarray(element, atom_selectors...)

Get the atomic coordinates in Å of a StructuralElementOrList as a 2D Array.

Each column corresponds to one atom, so the size is (3, natoms). Additional arguments are atom selector functions - only atoms that return true from all the functions are retained. The keyword argument `expanddisordered(defaultfalse`) determines whether to return coordinates for all copies of disordered atoms separately.

source
BioStructures.coords!Method
coords!(at, new_coords)

Set the coordinates of an AbstractAtom to a Vector of 3 numbers.

For DisorderedAtoms only the default atom is updated.

source
BioStructures.countatomsMethod
countatoms(el)

Return the number of atoms in a StructuralElementOrList as an Int.

Additional arguments are atom selector functions - only atoms that return true from all the functions are counted. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered atoms separately.

source
BioStructures.countchainsMethod
countchains(el)

Return the number of Chains in a StructuralElementOrList as an Int.

Additional arguments are chain selector functions - only chains that return true from all the functions are counted.

source
BioStructures.countmodelsMethod
countmodels(el)

Return the number of Models in a StructuralElementOrList as an Int.

Additional arguments are model selector functions - only models that return true from all the functions are counted.

source
BioStructures.countresiduesMethod
countresidues(el)

Return the number of residues in a StructuralElementOrList as an Int.

Additional arguments are residue selector functions - only residues that return true from all the functions are counted. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered residues separately.

source
BioStructures.defaultaltlocidMethod
defaultaltlocid(dis_at)

Get the alternative location ID of the default Atom in a DisorderedAtom as a Char.

The default is the highest occupancy, or lowest character alternative location ID for ties (i.e. 'A' beats 'B').

source
BioStructures.defaultatomMethod
defaultatom(dis_at)

Return the default Atom in a DisorderedAtom.

The default is the highest occupancy, or lowest character alternative location ID for ties (i.e. 'A' beats 'B').

source
BioStructures.defaultresnameMethod
defaultresname(dis_res)

Get the name of the default Residue in a DisorderedResidue as a String.

The default is the first name read in.

source
BioStructures.dihedralangleMethod
dihedralangle(atom_a, atom_b, atom_c, atom_d)
-dihedralangle(vec_ab, vec_bc, vec_cd)

Calculate the dihedral angle in radians defined by four AbstractAtoms or three vectors.

The angle between the planes defined by atoms (A, B, C) and (B, C, D) is returned in the range -π to π.

source
BioStructures.disorderselectorMethod
disorderselector(at)
-disorderselector(res)

Determines whether an AbstractAtom or AbstractResidue is disordered, i.e. has multiple locations in the case of atoms or multiple residue names (point mutants) in the case of residues.

source
BioStructures.displacementsMethod
displacements(element_one, element_two, residue_selectors...)
+Transformation(trans1, trans2, rot)

A 3D transformation to map one set of coordinates onto another, found using the Kabsch algorithm.

When called with structural elements, carries out a pairwise alignment and superimposes on atoms from aligned residues. In this case, keyword arguments for pairwise alignment can be given, see pairalign. The residue selectors determine which residues to do the pairwise alignment on. The keyword argument alignatoms is an atom selector that selects the atoms to calculate the superimposition on (default calphaselector). Can also be called with two sets of coordinates of the same size, with the number of dimensions in the first axis and the number of points in the second axis.

The returned Transformation object consists of the mean coordinates of the first set, the mean coordinates of the second set, the rotation to map the first centred set onto the second centred set, and the indices of the aligned residues in the first and second elements if relevant.

source
Base.collectMethod
collect(el)

Returns a Vector of the sub-elements in a StructuralElementOrList, e.g. AbstractAtoms in a Residue or AbstractResidues in a Chain.

source
Base.readMethod
read(filepath::AbstractString, format::Type; <keyword arguments>)
+read(input::IO, format::Type; <keyword arguments>)

Read a Protein Data Bank (PDB) file and return a ProteinStructure.

Arguments

  • format::Type: the format of the PDB file; options are PDB, MMCIF and MMTF.
  • structure_name::AbstractString: the name given to the returned ProteinStructure; defaults to the file name.
  • remove_disorder::Bool=false: whether to remove atoms with alt loc ID not ' ' or 'A'.
  • read_std_atoms::Bool=true: whether to read standard ATOM records.
  • read_het_atoms::Bool=true: whether to read HETATOM records.
  • run_dssp::Bool=false: whether to run DSSP to assign secondary structure.
  • run_stride::Bool=false: whether to run STRIDE to assign secondary structure.
  • gzip::Bool=false: whether the input is gzipped, not available for PDB format.
source
BioAlignments.pairalignMethod
pairalign(el1, el2, residue_selectors...)

Carries out a pairwise sequence alignment between the sequences of two structural elements.

Additional arguments are residue selector functions - only residues that return true from all the functions are retained. The keyword arguments scoremodel (default AffineGapScoreModel(BLOSUM62, gap_open=-10, gap_extend=-1)) and aligntype (default GlobalAlignment()) determine the properties of the alignment. See BioAlignments.jl for more on how to use alignments.

source
BioGenerics.distanceMethod
distance(element_one, element_two, atom_selectors...)

Get the minimum distance in Å between two StructuralElementOrLists.

Additional arguments are atom selector functions - only atoms that return true from the functions are retained.

source
BioStructures.allselectorMethod
allselector(at)
+allselector(res)

Trivial selector that returns true for any AbstractAtom or AbstractResidue. Use it to select all atoms or residues.

source
BioStructures.altlocidsMethod
altlocids(dis_at)

Get the list of alternative location IDs in an AbstractAtom as a Vector{Char}, sorted by atom serial.

source
BioStructures.applyselectors!Method
applyselectors!(els, selectors...)

Removes from a Vector of StructuralElements all elements that do not return true from all the selector functions.

source
BioStructures.applyselectorsMethod
applyselectors(els, selectors...)

Returns a copy of a Vector of StructuralElements with all elements that do not return true from all the selector functions removed.

source
BioStructures.atomidMethod
atomid(at)

Get a descriptive atom ID for an AbstractAtom as a Tuple of the form (full residue ID, residue name, atom name).

source
BioStructures.atomnameMethod
atomname(at; strip=true)

Get the atom name of an AbstractAtom as a String. strip determines whether surrounding whitespace is stripped.

source
BioStructures.atomnamesMethod
atomnames(res; strip=true)

Get the sorted list of AbstractAtoms in an AbstractResidue. strip determines whether surrounding whitespace is stripped.

source
BioStructures.atomnameselectorMethod
atomnameselector(at, atom_names; strip=true)

Determines if an AbstractAtom has its atom name in a list of names. strip determines whether surrounding whitespace is stripped from the atom name before it is checked in the list.

source
BioStructures.bondangleMethod
bondangle(atom_a, atom_b, atom_c)
+bondangle(vec_ba, vec_bc)

Calculate the bond or pseudo-bond angle in radians between three AbstractAtoms or two vectors.

The angle between B→A and B→C is returned in the range 0 to π.

source
BioStructures.cbetaselectorMethod
cbetaselector(at)

Determines if an AbstractAtom is not a hetero-atom and corresponds to a Cβ atom, or a Cα atom in glycine.

source
BioStructures.chainMethod
chain(at)
+chain(res)

Return the Chain that an AbstractAtom or AbstractResidue belongs to.

source
BioStructures.chainid!Method
chainid!(ch, id)
+chainid!(res, id)

Set the chain ID of a Chain or an AbstractResidue to a new String.

If a chain with this ID already exists, it will be removed from its current chain and added to that chain. If a chain with this ID does not exist, a new chain will be added to the model and this residue will be added to it. If moving this residue from a chain to a new chain leaves the old chain without residues, the old chain will be removed from the Model.

source
BioStructures.chainidsMethod
chainids(mod)
+chainids(struc)

Get the sorted chain IDs of the chains in a Model, or the default Model of a ProteinStructure, as a Vector{String}.

source
BioStructures.chainsMethod
chains(mod)
+chains(struc)

Return the dictionary of Chains in a Model, or the default Model of a ProteinStructure.

source
BioStructures.chargeMethod
charge(at; strip=true)

Get the charge on an AbstractAtom as a String. The charge is set to " " if not specified during atom creation. strip determines whether surrounding whitespace is stripped.

source
BioStructures.choosedefaultaltlocidMethod
choosedefaultaltlocid(at_one, at_two)

Determine which of two Atoms representing a disorered atom better qualifies as the default location.

The Atom with the highest occupancy is chosen; in the case of ties the Atom with the lowest alternative location ID in alphabetical order is chosen.

source
BioStructures.coilselectorMethod
coilselector(res)
+coilselector(at)

Determines if an AbstractResidue or AbstractAtom is part of a coil, i.e. whether the secondary structure code is in coilsscodes.

source
BioStructures.collectatomsMethod
collectatoms(el)

Returns a Vector of the atoms in a StructuralElementOrList.

Additional arguments are atom selector functions - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered atoms separately.

source
BioStructures.collectchainsMethod
collectchains(el)

Returns a Vector of the chains in a StructuralElementOrList.

Additional arguments are chain selector functions - only chains that return true from all the functions are retained.

source
BioStructures.collectmodelsMethod
collectmodels(el)

Returns a Vector of the models in a StructuralElementOrList.

Additional arguments are model selector functions - only models that return true from all the functions are retained.

source
BioStructures.collectresiduesMethod
collectresidues(el)

Returns a Vector of the residues in a StructuralElementOrList.

Additional arguments are residue selector functions - only residues that return true from all the functions are retained. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered residues separately.

source
BioStructures.coordarrayMethod
coordarray(element, atom_selectors...)

Get the atomic coordinates in Å of a StructuralElementOrList as a 2D Array.

Each column corresponds to one atom, so the size is (3, natoms). Additional arguments are atom selector functions - only atoms that return true from all the functions are retained. The keyword argument `expanddisordered(defaultfalse`) determines whether to return coordinates for all copies of disordered atoms separately.

source
BioStructures.coords!Method
coords!(at, new_coords)

Set the coordinates of an AbstractAtom to a Vector of 3 numbers.

For DisorderedAtoms only the default atom is updated.

source
BioStructures.countatomsMethod
countatoms(el)

Return the number of atoms in a StructuralElementOrList as an Int.

Additional arguments are atom selector functions - only atoms that return true from all the functions are counted. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered atoms separately.

source
BioStructures.countchainsMethod
countchains(el)

Return the number of Chains in a StructuralElementOrList as an Int.

Additional arguments are chain selector functions - only chains that return true from all the functions are counted.

source
BioStructures.countmodelsMethod
countmodels(el)

Return the number of Models in a StructuralElementOrList as an Int.

Additional arguments are model selector functions - only models that return true from all the functions are counted.

source
BioStructures.countresiduesMethod
countresidues(el)

Return the number of residues in a StructuralElementOrList as an Int.

Additional arguments are residue selector functions - only residues that return true from all the functions are counted. The keyword argument expand_disordered (default false) determines whether to return all copies of disordered residues separately.

source
BioStructures.defaultaltlocidMethod
defaultaltlocid(dis_at)

Get the alternative location ID of the default Atom in a DisorderedAtom as a Char.

The default is the highest occupancy, or lowest character alternative location ID for ties (i.e. 'A' beats 'B').

source
BioStructures.defaultatomMethod
defaultatom(dis_at)

Return the default Atom in a DisorderedAtom.

The default is the highest occupancy, or lowest character alternative location ID for ties (i.e. 'A' beats 'B').

source
BioStructures.defaultresnameMethod
defaultresname(dis_res)

Get the name of the default Residue in a DisorderedResidue as a String.

The default is the first name read in.

source
BioStructures.dihedralangleMethod
dihedralangle(atom_a, atom_b, atom_c, atom_d)
+dihedralangle(vec_ab, vec_bc, vec_cd)

Calculate the dihedral angle in radians defined by four AbstractAtoms or three vectors.

The angle between the planes defined by atoms (A, B, C) and (B, C, D) is returned in the range -π to π.

source
BioStructures.disorderselectorMethod
disorderselector(at)
+disorderselector(res)

Determines whether an AbstractAtom or AbstractResidue is disordered, i.e. has multiple locations in the case of atoms or multiple residue names (point mutants) in the case of residues.

source
BioStructures.displacementsMethod
displacements(element_one, element_two, residue_selectors...)
 displacements(element_one, element_two, superimpose=false)
-displacements(coords_one, coords_two)

Get the displacements in Å between atomic coordinates from two StructuralElementOrLists or two coordinate Arrays.

If superimpose is true (the default), the elements are superimposed before calculation and the displacements are calculated on the superimposed residues. See Transformation for keyword arguments. If superimpose is false the elements are assumed to be superimposed and must be of the same length. The keyword argument dispatoms is an atom selector that selects the atoms to calculate displacements on (default calphaselector).

source
BioStructures.downloadallobsoletepdbMethod
downloadallobsoletepdb(; <keyword arguments>)

Download all obsolete Protein Data Bank (PDB) files from the RCSB server.

Returns the list of PDB IDs downloaded. Requires an internet connection.

Arguments

  • obsolete_dir::AbstractString=pwd(): the directory where the PDB files are downloaded; defaults to the current working directory.
  • format::Type=PDB: the format of the PDB file; options are PDB, PDBXML, MMCIF and MMTF.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
source
BioStructures.downloadentirepdbMethod
downloadentirepdb(; <keyword arguments>)

Download the entire Protein Data Bank (PDB) from the RCSB server.

Returns the list of PDB IDs downloaded. Ensure you have enough disk space and time before running. The function can be stopped any time and called again to resume downloading. Requires an internet connection.

Arguments

  • dir::AbstractString=pwd(): the directory to which the PDB files are downloaded; defaults to the current working directory.
  • format::Type=PDB: the format of the PDB file; options are PDB, PDBXML, MMCIF and MMTF.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
source
BioStructures.downloadpdbMethod
downloadpdb(pdbid::AbstractString; <keyword arguments>)
+displacements(coords_one, coords_two)

Get the displacements in Å between atomic coordinates from two StructuralElementOrLists or two coordinate Arrays.

If superimpose is true (the default), the elements are superimposed before calculation and the displacements are calculated on the superimposed residues. See Transformation for keyword arguments. If superimpose is false the elements are assumed to be superimposed and must be of the same length. The keyword argument dispatoms is an atom selector that selects the atoms to calculate displacements on (default calphaselector).

source
BioStructures.downloadallobsoletepdbMethod
downloadallobsoletepdb(; <keyword arguments>)

Download all obsolete Protein Data Bank (PDB) files from the RCSB server.

Returns the list of PDB IDs downloaded. Requires an internet connection.

Arguments

  • obsolete_dir::AbstractString=pwd(): the directory where the PDB files are downloaded; defaults to the current working directory.
  • format::Type=PDB: the format of the PDB file; options are PDB, PDBXML, MMCIF and MMTF.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
source
BioStructures.downloadentirepdbMethod
downloadentirepdb(; <keyword arguments>)

Download the entire Protein Data Bank (PDB) from the RCSB server.

Returns the list of PDB IDs downloaded. Ensure you have enough disk space and time before running. The function can be stopped any time and called again to resume downloading. Requires an internet connection.

Arguments

  • dir::AbstractString=pwd(): the directory to which the PDB files are downloaded; defaults to the current working directory.
  • format::Type=PDB: the format of the PDB file; options are PDB, PDBXML, MMCIF and MMTF.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
source
BioStructures.downloadpdbMethod
downloadpdb(pdbid::AbstractString; <keyword arguments>)
 downloadpdb(pdbid::Array{<:AbstractString, 1}; <keyword arguments>)
-downloadpdb(f::Function, args...)

Download files from the Protein Data Bank (PDB) via RCSB.

When given an AbstractString, e.g. "1AKE", downloads the PDB file and returns the path to the file. When given an Array{<:AbstractString, 1}, downloads the PDB files in the array and returns an array of the paths to the files. When given a function as the first argument, runs the function with the downloaded filepath(s) as an argument then removes the file(s). Requires an internet connection.

Arguments

  • dir::AbstractString=pwd(): the directory to which the PDB file is downloaded; defaults to the current working directory.
  • format::Type=PDB: the format of the PDB file; options are PDB, PDBXML, MMCIF and MMTF.
  • obsolete::Bool=false: if set true, the PDB file is downloaded in the auto-generated "obsolete" directory inside the specified dir.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
  • ba_number::Integer=0: if set > 0 downloads the respective biological assembly; by default downloads the PDB file.
source
BioStructures.elementMethod
element(at; strip=true)

Get the element of an AbstractAtom as a String.

The element is set to " " if not specified during atom creation. strip determines whether surrounding whitespace is stripped.

source
BioStructures.generatechainidMethod
generatechainid(entity_id)

Convert a positive Integer into a chain ID.

Goes A to Z, then AA to ZA, AB to ZB etc. This is in line with Protein Data Bank (PDB) conventions.

source
BioStructures.helixselectorMethod
helixselector(res)
-helixselector(at)

Determines if an AbstractResidue or AbstractAtom is part of an α-helix, i.e. whether the secondary structure code is in helixsscodes.

source
BioStructures.heteroselectorMethod
heteroselector(at)
-heteroselector(res)

Determines if an AbstractAtom represents a hetero atom, e.g. came from a HETATM record in a Protein Data Bank (PDB) file, or if an AbstractResidue represents a hetero molecule, e.g. consists of HETATM records from a PDB file.

source
BioStructures.hydrogenselectorMethod
hydrogenselector(at)

Determines if an AbstractAtom represents hydrogen.

Uses the element field where possible, otherwise uses the atom name.

source
BioStructures.isdisorderedresMethod
isdisorderedres(res)

Determine if an AbstractResidue is a DisorderedResidue, i.e. there are multiple residue names with the same residue ID.

source
BioStructures.isheteroMethod
ishetero(at)
-ishetero(res)

Determines if an AbstractAtom represents a hetero atom, e.g. came from a HETATM record in a Protein Data Bank (PDB) file, or if an AbstractResidue represents a hetero molecule, e.g. consists of HETATM records from a PDB file.

source
BioStructures.notwaterselectorMethod
notwaterselector(res)
-notwaterselector(at)

Determines if an AbstractResidue or AbstractAtom does not represent a water molecule, i.e. whether the residue name is not in waterresnames.

source
BioStructures.occupancyMethod
occupancy(at)

Get the occupancy of an AbstractAtom as a Float64.

The occupancy is set to 1.0 if not specified during atom creation.

source
BioStructures.omegaangleMethod
omegaangle(res, res_previous)
-omegaangle(chain, res_id)

Calculate the omega angle in radians for an AbstractResidue.

Arguments can either be a residue and the previous residue or a chain and the position as a residue ID. The first residue (or one at the given index) requires the atoms "N" and "CA" and the previous residue requires the atoms "CA" and "C". The angle is in the range -π to π.

source
BioStructures.omegaanglesMethod
omegaangles(element, residue_selectors...)

Calculate the Vector of omega angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angle is in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.pdbentrylistMethod
pdbentrylist()

Obtain the list of all Protein Data Bank (PDB) entries from the RCSB server.

Requires an internet connection.

source
BioStructures.pdblineMethod
pdbline(at::Atom)
+downloadpdb(f::Function, args...)

Download files from the Protein Data Bank (PDB) via RCSB.

When given an AbstractString, e.g. "1AKE", downloads the PDB file and returns the path to the file. When given an Array{<:AbstractString, 1}, downloads the PDB files in the array and returns an array of the paths to the files. When given a function as the first argument, runs the function with the downloaded filepath(s) as an argument then removes the file(s). Requires an internet connection.

Arguments

  • dir::AbstractString=pwd(): the directory to which the PDB file is downloaded; defaults to the current working directory.
  • format::Type=PDB: the format of the PDB file; options are PDB, PDBXML, MMCIF and MMTF.
  • obsolete::Bool=false: if set true, the PDB file is downloaded in the auto-generated "obsolete" directory inside the specified dir.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
  • ba_number::Integer=0: if set > 0 downloads the respective biological assembly; by default downloads the PDB file.
source
BioStructures.elementMethod
element(at; strip=true)

Get the element of an AbstractAtom as a String.

The element is set to " " if not specified during atom creation. strip determines whether surrounding whitespace is stripped.

source
BioStructures.generatechainidMethod
generatechainid(entity_id)

Convert a positive Integer into a chain ID.

Goes A to Z, then AA to ZA, AB to ZB etc. This is in line with Protein Data Bank (PDB) conventions.

source
BioStructures.helixselectorMethod
helixselector(res)
+helixselector(at)

Determines if an AbstractResidue or AbstractAtom is part of an α-helix, i.e. whether the secondary structure code is in helixsscodes.

source
BioStructures.heteroselectorMethod
heteroselector(at)
+heteroselector(res)

Determines if an AbstractAtom represents a hetero atom, e.g. came from a HETATM record in a Protein Data Bank (PDB) file, or if an AbstractResidue represents a hetero molecule, e.g. consists of HETATM records from a PDB file.

source
BioStructures.hydrogenselectorMethod
hydrogenselector(at)

Determines if an AbstractAtom represents hydrogen.

Uses the element field where possible, otherwise uses the atom name.

source
BioStructures.isdisorderedresMethod
isdisorderedres(res)

Determine if an AbstractResidue is a DisorderedResidue, i.e. there are multiple residue names with the same residue ID.

source
BioStructures.isheteroMethod
ishetero(at)
+ishetero(res)

Determines if an AbstractAtom represents a hetero atom, e.g. came from a HETATM record in a Protein Data Bank (PDB) file, or if an AbstractResidue represents a hetero molecule, e.g. consists of HETATM records from a PDB file.

source
BioStructures.notwaterselectorMethod
notwaterselector(res)
+notwaterselector(at)

Determines if an AbstractResidue or AbstractAtom does not represent a water molecule, i.e. whether the residue name is not in waterresnames.

source
BioStructures.occupancyMethod
occupancy(at)

Get the occupancy of an AbstractAtom as a Float64.

The occupancy is set to 1.0 if not specified during atom creation.

source
BioStructures.omegaangleMethod
omegaangle(res, res_previous)
+omegaangle(chain, res_id)

Calculate the omega angle in radians for an AbstractResidue.

Arguments can either be a residue and the previous residue or a chain and the position as a residue ID. The first residue (or one at the given index) requires the atoms "N" and "CA" and the previous residue requires the atoms "CA" and "C". The angle is in the range -π to π.

source
BioStructures.omegaanglesMethod
omegaangles(element, residue_selectors...)

Calculate the Vector of omega angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angle is in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.pdbentrylistMethod
pdbentrylist()

Obtain the list of all Protein Data Bank (PDB) entries from the RCSB server.

Requires an internet connection.

source
BioStructures.pdblineMethod
pdbline(at::Atom)
 pdbline(at::DisorderedAtom)
-pdbline(at::AtomRecord)

Form a Protein Data Bank (PDB) format ATOM or HETATM record as a String from an Atom, DisorderedAtom or AtomRecord.

This will throw an ArgumentError if a value cannot fit into the allocated space, e.g. the chain ID is longer than one character or the atom serial is greater than 99999. In this case consider using writemmcif or writemmtf to write a mmCIF file or a MMTF file.

source
BioStructures.pdbobsoletelistMethod
pdbobsoletelist()

Obtain the list of all obsolete Protein Data Bank (PDB) entries from the RCSB server.

Requires an internet connection.

source
BioStructures.pdbrecentchangesMethod
pdbrecentchanges()

Obtain three lists giving the added, modified and obsolete Protein Data Bank (PDB) entries from the recent RCSB weekly status files.

Requires an internet connection.

source
BioStructures.pdbstatuslistMethod
pdbstatuslist(url::AbstractString)

Obtain the list of Protein Data Bank (PDB) entries from a RCSB weekly status file by specifying its URL.

An example URL is ftp://ftp.wwpdb.org/pub/pdb/data/status/latest/added.pdb. Requires an internet connection.

source
BioStructures.phiangleMethod
phiangle(res, res_previous)
-phiangle(chain, res_id)

Calculate the phi angle in radians for an AbstractResidue.

Arguments can either be a residue and the previous residue or a chain and the position as a residue ID. The first residue (or one at the given index) requires the atoms "N", "CA" and "C" and the previous residue requires the atom "C". The angle is in the range -π to π.

source
BioStructures.phianglesMethod
phiangles(element, residue_selectors...)

Calculate the Vector of phi angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angle is in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.psiangleMethod
psiangle(res, res_next)
-psiangle(chain, res_id)

Calculate the psi angle in radians for an AbstractResidue.

Arguments can either be a residue and the next residue or a chain and the position as a residue ID. The first residue (or one at the given index) requires the atoms "N", "CA" and "C" and the next residue requires the atom "N". The angle is in the range -π to π.

source
BioStructures.psianglesMethod
psiangles(element, residue_selectors...)

Calculate the Vector of psi angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angle is in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.ramachandrananglesMethod
ramachandranangles(element, residue_selectors...)

Calculate the Vectors of phi and psi angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angles are in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.readmultimmcifMethod
readmultimmcif(filepath; gzip=false)
-readmultimmcif(io; gzip=false)

Read multiple MMCIFDicts from a filepath or stream. Each MMCIFDict in the returned Dict{String, MMCIFDict} corresponds to an mmCIF data block from the input. An example of such a file is the chemical component dictionary from the Protein Data Bank. The keyword argument gzip (default false) determines if the input is gzipped.

source
BioStructures.residMethod
resid(res; full=true)

Get a descriptive residue ID String for an AbstractAtom or AbstractResidue.

Format is residue number then insertion code with "H" in front for hetero residues. If full equals true the chain ID is also added after a colon. Examples are "50A", "H20" and "10:A".

source
BioStructures.resnameMethod
resname(at; strip=true)
-resname(res; strip=true)

Get the residue name of an AbstractAtom or AbstractResidue as a String.

strip determines whether surrounding whitespace is stripped.

source
BioStructures.resnamesMethod
resnames(dis_res)

Get the residue names in an AbstractResidue as a Vector{String}.

For a DisorderedResidue there will be multiple residue names - in this case the default residue name is placed first, then the others are ordered alphabetically.

source
BioStructures.resnameselectorMethod
resnameselector(res, res_names)
-resnameselector(at, res_names)

Determines if an AbstractResidue or AbstractAtom has its residue name in a list of names.

source
BioStructures.retrievepdbMethod
retrievepdb(pdbid::AbstractString; <keyword arguments>)

Download and read a Protein Data Bank (PDB) file or biological assembly from the RCSB server, returning a ProteinStructure.

Requires an internet connection.

Arguments

  • pdbid::AbstractString: the PDB ID to be downloaded and read.
  • dir::AbstractString=pwd(): the directory to which the PDB file is downloaded; defaults to the current working directory.
  • obsolete::Bool=false: if set true, the PDB file is downloaded in the auto-generated "obsolete" directory inside the specified dir.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
  • ba_number::Integer=0: if set > 0 downloads the respective biological assembly; by default downloads the PDB file.
  • structure_name::AbstractString="$pdbid.pdb": the name given to the returned ProteinStructure; defaults to the PDB ID.
  • remove_disorder::Bool=false: whether to remove atoms with alt loc ID not ' ' or 'A'.
  • read_std_atoms::Bool=true: whether to read standard ATOM records.
  • read_het_atoms::Bool=true: whether to read HETATOM records.
  • run_dssp::Bool=false: whether to run DSSP to assign secondary structure.
  • run_stride::Bool=false: whether to run STRIDE to assign secondary structure.
source
BioStructures.rmsdMethod
rmsd(element_one, element_two, residue_selectors...)
+pdbline(at::AtomRecord)

Form a Protein Data Bank (PDB) format ATOM or HETATM record as a String from an Atom, DisorderedAtom or AtomRecord.

This will throw an ArgumentError if a value cannot fit into the allocated space, e.g. the chain ID is longer than one character or the atom serial is greater than 99999. In this case consider using writemmcif or writemmtf to write a mmCIF file or a MMTF file.

source
BioStructures.pdbobsoletelistMethod
pdbobsoletelist()

Obtain the list of all obsolete Protein Data Bank (PDB) entries from the RCSB server.

Requires an internet connection.

source
BioStructures.pdbrecentchangesMethod
pdbrecentchanges()

Obtain three lists giving the added, modified and obsolete Protein Data Bank (PDB) entries from the recent RCSB weekly status files.

Requires an internet connection.

source
BioStructures.pdbstatuslistMethod
pdbstatuslist(url::AbstractString)

Obtain the list of Protein Data Bank (PDB) entries from a RCSB weekly status file by specifying its URL.

An example URL is ftp://ftp.wwpdb.org/pub/pdb/data/status/latest/added.pdb. Requires an internet connection.

source
BioStructures.phiangleMethod
phiangle(res, res_previous)
+phiangle(chain, res_id)

Calculate the phi angle in radians for an AbstractResidue.

Arguments can either be a residue and the previous residue or a chain and the position as a residue ID. The first residue (or one at the given index) requires the atoms "N", "CA" and "C" and the previous residue requires the atom "C". The angle is in the range -π to π.

source
BioStructures.phianglesMethod
phiangles(element, residue_selectors...)

Calculate the Vector of phi angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angle is in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.psiangleMethod
psiangle(res, res_next)
+psiangle(chain, res_id)

Calculate the psi angle in radians for an AbstractResidue.

Arguments can either be a residue and the next residue or a chain and the position as a residue ID. The first residue (or one at the given index) requires the atoms "N", "CA" and "C" and the next residue requires the atom "N". The angle is in the range -π to π.

source
BioStructures.psianglesMethod
psiangles(element, residue_selectors...)

Calculate the Vector of psi angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angle is in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.ramachandrananglesMethod
ramachandranangles(element, residue_selectors...)

Calculate the Vectors of phi and psi angles of a StructuralElementOrList.

The vectors have NaN for residues where an angle cannot be calculated, e.g. due to missing atoms or lack of an adjacent residue. The angles are in the range -π to π. Additional arguments are residue selector functions - only residues that return true from the functions are retained.

source
BioStructures.readmultimmcifMethod
readmultimmcif(filepath; gzip=false)
+readmultimmcif(io; gzip=false)

Read multiple MMCIFDicts from a filepath or stream. Each MMCIFDict in the returned Dict{String, MMCIFDict} corresponds to an mmCIF data block from the input. An example of such a file is the chemical component dictionary from the Protein Data Bank. The keyword argument gzip (default false) determines if the input is gzipped.

source
BioStructures.residMethod
resid(res; full=true)

Get a descriptive residue ID String for an AbstractAtom or AbstractResidue.

Format is residue number then insertion code with "H" in front for hetero residues. If full equals true the chain ID is also added after a colon. Examples are "50A", "H20" and "10:A".

source
BioStructures.resnameMethod
resname(at; strip=true)
+resname(res; strip=true)

Get the residue name of an AbstractAtom or AbstractResidue as a String.

strip determines whether surrounding whitespace is stripped.

source
BioStructures.resnamesMethod
resnames(dis_res)

Get the residue names in an AbstractResidue as a Vector{String}.

For a DisorderedResidue there will be multiple residue names - in this case the default residue name is placed first, then the others are ordered alphabetically.

source
BioStructures.resnameselectorMethod
resnameselector(res, res_names)
+resnameselector(at, res_names)

Determines if an AbstractResidue or AbstractAtom has its residue name in a list of names.

source
BioStructures.retrievepdbMethod
retrievepdb(pdbid::AbstractString; <keyword arguments>)

Download and read a Protein Data Bank (PDB) file or biological assembly from the RCSB server, returning a ProteinStructure.

Requires an internet connection.

Arguments

  • pdbid::AbstractString: the PDB ID to be downloaded and read.
  • dir::AbstractString=pwd(): the directory to which the PDB file is downloaded; defaults to the current working directory.
  • obsolete::Bool=false: if set true, the PDB file is downloaded in the auto-generated "obsolete" directory inside the specified dir.
  • overwrite::Bool=false: if set true, overwrites the PDB file if it exists in dir; by default skips downloading the PDB file if it exists.
  • ba_number::Integer=0: if set > 0 downloads the respective biological assembly; by default downloads the PDB file.
  • structure_name::AbstractString="$pdbid.pdb": the name given to the returned ProteinStructure; defaults to the PDB ID.
  • remove_disorder::Bool=false: whether to remove atoms with alt loc ID not ' ' or 'A'.
  • read_std_atoms::Bool=true: whether to read standard ATOM records.
  • read_het_atoms::Bool=true: whether to read HETATOM records.
  • run_dssp::Bool=false: whether to run DSSP to assign secondary structure.
  • run_stride::Bool=false: whether to run STRIDE to assign secondary structure.
source
BioStructures.rmsdMethod
rmsd(element_one, element_two, residue_selectors...)
 rmsd(element_one, element_two, superimpose=false)
-rmsd(coords_one, coords_two)

Get the root-mean-square deviation (RMSD) in Å between two StructuralElementOrLists or two coordinate Arrays.

If superimpose is true (the default), the elements are superimposed before RMSD calculation and the RMSD is calculated on the superimposed residues. See Transformation for keyword arguments. If superimpose is false the elements are assumed to be superimposed and must be of the same length. The keyword argument rmsdatoms is an atom selector that selects the atoms to calculate RMSD on (default calphaselector).

source
BioStructures.rundssp!Method
rundssp!(struc)
-rundssp!(mod)

Run DSSP (Define Secondary Structure of Proteins) on the given structural element to assign secondary structure.

A temporary PDB file is written, so this will fail if the structural element cannot be written to a PDB file, for example if there are two-letter chain IDs.

source
BioStructures.rundsspMethod
rundssp(struc)
+rmsd(coords_one, coords_two)

Get the root-mean-square deviation (RMSD) in Å between two StructuralElementOrLists or two coordinate Arrays.

If superimpose is true (the default), the elements are superimposed before RMSD calculation and the RMSD is calculated on the superimposed residues. See Transformation for keyword arguments. If superimpose is false the elements are assumed to be superimposed and must be of the same length. The keyword argument rmsdatoms is an atom selector that selects the atoms to calculate RMSD on (default calphaselector).

source
BioStructures.rundssp!Method
rundssp!(struc)
+rundssp!(mod)

Run DSSP (Define Secondary Structure of Proteins) on the given structural element to assign secondary structure.

A temporary PDB file is written, so this will fail if the structural element cannot be written to a PDB file, for example if there are two-letter chain IDs.

source
BioStructures.rundsspMethod
rundssp(struc)
 rundssp(mod)
-rundssp(filepath_in, dssp_filepath_out)

Return a copy of the structural element with DSSP (Define Secondary Structure of Proteins) run to assign secondary structure, or run DSSP directly on a PDB or mmCIF file.

source
BioStructures.runstride!Method
runstride!(struc)
-runstride!(mod)

Run STRIDE on the given structural element to assign secondary structure.

A temporary PDB file is written, so this will fail if the structural element cannot be written to a PDB file, for example if there are two-letter chain IDs.

source
BioStructures.runstrideMethod
runstride(struc)
+rundssp(filepath_in, dssp_filepath_out)

Return a copy of the structural element with DSSP (Define Secondary Structure of Proteins) run to assign secondary structure, or run DSSP directly on a PDB or mmCIF file.

source
BioStructures.runstride!Method
runstride!(struc)
+runstride!(mod)

Run STRIDE on the given structural element to assign secondary structure.

A temporary PDB file is written, so this will fail if the structural element cannot be written to a PDB file, for example if there are two-letter chain IDs.

source
BioStructures.runstrideMethod
runstride(struc)
 runstride(mod)
-runstride(filepath_in, stride_filepath_out)

Return a copy of the structural element with STRIDE run to assign secondary structure, or run STRIDE directly on a PDB file.

source
BioStructures.sequentialresiduesMethod
sequentialresidues(res_first, res_second)

Determine if the second residue follows the first in sequence.

For this to be true the residues need to have the same chain ID, both need to be standard/hetero residues and the residue number of the second needs to be one greater than that of the first (or the residue numbers the same and the insertion code of the second greater than the first).

source
BioStructures.sheetselectorMethod
sheetselector(res)
-sheetselector(at)

Determines if an AbstractResidue or AbstractAtom is part of a β-sheet, i.e. whether the secondary structure code is in sheetsscodes.

source
BioStructures.showcontactmapMethod
showcontactmap(contact_map)
-showcontactmap(io, contact_map)

Print a representation of a ContactMap to stdout, or a specified IO instance.

A fully plotted version can be obtained with plot(contact_map) but that requires Plots.jl; showcontactmap works without that dependency.

source
BioStructures.spaceatomnameMethod
spaceatomname(at::Atom)

Space an Atom name such that the last element letter (generally) appears in the second column.

If the element property of the Atom is set it is used to get the element, otherwise the name starts from the second column where possible. This function is generally not required as spacing is recorded when atom names are read in from a Protein Data Bank (PDB) file. However this spacing can be important, for example distinguising between Cα and calcium atoms.

source
BioStructures.sqdistanceMethod
sqdistance(element_one, element_two, atom_selectors...)

Get the minimum square distance in Å between two StructuralElementOrLists.

Additional arguments are atom selector functions - only atoms that return true from the functions are retained.

source
BioStructures.sscodeMethod
sscode(res)
-sscode(at)

Get the secondary structure code of an AbstractResidue or AbstractAtom as a Char.

'-' represents unassigned secondary structure. Secondary structure can be assigned using rundssp! or runstride!.

source
BioStructures.sscodeselectorMethod
sscodeselector(res, ss_codes)
-sscodeselector(at, ss_codes)

Determines if an AbstractResidue or AbstractAtom has its secondary structure code in a list of secondary structure codes.

source
BioStructures.standardselectorMethod
standardselector(at)
-standardselector(res)

Determines if an AbstractAtom represents a standard atom, e.g. came from a ATOM record in a Protein Data Bank (PDB) file, or if an AbstractResidue represents a standard molecule, e.g. consists of ATOM records from a PDB file.

source
BioStructures.structureMethod
structure(el)

Return the ProteinStructure that an AbstractAtom, AbstractResidue, Chain or Model belongs to.

source
BioStructures.superimpose!Method
superimpose!(el1, el2, residue_selectors...)

Calculate the Transformation that maps the first element onto the second, and modify all coordinates in the first element according to the transformation.

See Transformation for keyword arguments.

source
BioStructures.tempfactorMethod
tempfactor(at)

Get the temperature factor of an AbstractAtom as a Float64.

The temperature factor is set to 0.0 if not specified during atom creation.

source
BioStructures.updatelocalpdbMethod
updatelocalpdb(; dir::AbstractString=pwd(), format::Type=PDB)

Update a local copy of the Protein Data Bank (PDB).

Obtains the recent weekly lists of new, modified and obsolete PDB entries and automatically updates the PDB files of the given format inside the local dir directory. Requires an internet connection.

source
BioStructures.waterselectorMethod
waterselector(res)
-waterselector(at)

Determines if an AbstractResidue or AbstractAtom represents a water molecule, i.e. whether the residue name is in waterresnames.

source
BioStructures.writemmcifMethod
writemmcif(output, element, atom_selectors...; gzip=false)
-writemmcif(output, mmcif_dict; gzip=false)

Write a StructuralElementOrList or a MMCIFDict to a mmCIF format file or output stream.

Atom selector functions can be given as additional arguments - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default true) determines whether to return all copies of disordered residues and atoms. The keyword argument gzip (default false) determines if the output is gzipped.

source
BioStructures.writemultimmcifMethod
writemultimmcif(filepath, cifs; gzip=false)
-writemultimmcif(io, cifs; gzip=false)

Write multiple MMCIFDicts as a Dict{String, MMCIFDict} to a filepath or stream. The keyword argument gzip (default false) determines if the output is gzipped.

source
BioStructures.writepdbMethod
writepdb(output, element, atom_selectors...)

Write a StructuralElementOrList to a Protein Data Bank (PDB) format file or output stream.

Only ATOM, HETATM, MODEL and ENDMDL records are written - there is no header and there are no TER records. Atom selector functions can be given as additional arguments - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default true) determines whether to return all copies of disordered residues and atoms.

source
BioStructures.x!Method
x!(at, val)

Set the x coordinate of an AbstractAtom to val.

For DisorderedAtoms only the default atom is updated.

source
BioStructures.y!Method
y!(at, val)

Set the y coordinate of an AbstractAtom to val.

For DisorderedAtoms only the default atom is updated.

source
BioStructures.z!Method
z!(at, val)

Set the z coordinate of an AbstractAtom to val.

For DisorderedAtoms only the default atom is updated.

source
MMTF.writemmtfMethod
writemmtf(output, element, atom_selectors...; gzip=false)
-writemmtf(output, mmtf_dict; gzip=false)

Write a StructuralElementOrList or a MMTFDict to a MMTF file or output stream.

Atom selector functions can be given as additional arguments - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default true) determines whether to return all copies of disordered residues and atoms. The keyword argument gzip (default false) determines if the file should be gzipped.

source
+runstride(filepath_in, stride_filepath_out)

Return a copy of the structural element with STRIDE run to assign secondary structure, or run STRIDE directly on a PDB file.

source
BioStructures.sequentialresiduesMethod
sequentialresidues(res_first, res_second)

Determine if the second residue follows the first in sequence.

For this to be true the residues need to have the same chain ID, both need to be standard/hetero residues and the residue number of the second needs to be one greater than that of the first (or the residue numbers the same and the insertion code of the second greater than the first).

source
BioStructures.sheetselectorMethod
sheetselector(res)
+sheetselector(at)

Determines if an AbstractResidue or AbstractAtom is part of a β-sheet, i.e. whether the secondary structure code is in sheetsscodes.

source
BioStructures.showcontactmapMethod
showcontactmap(contact_map)
+showcontactmap(io, contact_map)

Print a representation of a ContactMap to stdout, or a specified IO instance.

A fully plotted version can be obtained with plot(contact_map) but that requires Plots.jl; showcontactmap works without that dependency.

source
BioStructures.spaceatomnameMethod
spaceatomname(at::Atom)

Space an Atom name such that the last element letter (generally) appears in the second column.

If the element property of the Atom is set it is used to get the element, otherwise the name starts from the second column where possible. This function is generally not required as spacing is recorded when atom names are read in from a Protein Data Bank (PDB) file. However this spacing can be important, for example distinguising between Cα and calcium atoms.

source
BioStructures.sqdistanceMethod
sqdistance(element_one, element_two, atom_selectors...)

Get the minimum square distance in Å between two StructuralElementOrLists.

Additional arguments are atom selector functions - only atoms that return true from the functions are retained.

source
BioStructures.sscodeMethod
sscode(res)
+sscode(at)

Get the secondary structure code of an AbstractResidue or AbstractAtom as a Char.

'-' represents unassigned secondary structure. Secondary structure can be assigned using rundssp! or runstride!.

source
BioStructures.sscodeselectorMethod
sscodeselector(res, ss_codes)
+sscodeselector(at, ss_codes)

Determines if an AbstractResidue or AbstractAtom has its secondary structure code in a list of secondary structure codes.

source
BioStructures.standardselectorMethod
standardselector(at)
+standardselector(res)

Determines if an AbstractAtom represents a standard atom, e.g. came from a ATOM record in a Protein Data Bank (PDB) file, or if an AbstractResidue represents a standard molecule, e.g. consists of ATOM records from a PDB file.

source
BioStructures.structureMethod
structure(el)

Return the ProteinStructure that an AbstractAtom, AbstractResidue, Chain or Model belongs to.

source
BioStructures.superimpose!Method
superimpose!(el1, el2, residue_selectors...)

Calculate the Transformation that maps the first element onto the second, and modify all coordinates in the first element according to the transformation.

See Transformation for keyword arguments.

source
BioStructures.tempfactorMethod
tempfactor(at)

Get the temperature factor of an AbstractAtom as a Float64.

The temperature factor is set to 0.0 if not specified during atom creation.

source
BioStructures.updatelocalpdbMethod
updatelocalpdb(; dir::AbstractString=pwd(), format::Type=PDB)

Update a local copy of the Protein Data Bank (PDB).

Obtains the recent weekly lists of new, modified and obsolete PDB entries and automatically updates the PDB files of the given format inside the local dir directory. Requires an internet connection.

source
BioStructures.waterselectorMethod
waterselector(res)
+waterselector(at)

Determines if an AbstractResidue or AbstractAtom represents a water molecule, i.e. whether the residue name is in waterresnames.

source
BioStructures.writemmcifMethod
writemmcif(output, element, atom_selectors...; gzip=false)
+writemmcif(output, mmcif_dict; gzip=false)

Write a StructuralElementOrList or a MMCIFDict to a mmCIF format file or output stream.

Atom selector functions can be given as additional arguments - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default true) determines whether to return all copies of disordered residues and atoms. The keyword argument gzip (default false) determines if the output is gzipped.

source
BioStructures.writemultimmcifMethod
writemultimmcif(filepath, cifs; gzip=false)
+writemultimmcif(io, cifs; gzip=false)

Write multiple MMCIFDicts as a Dict{String, MMCIFDict} to a filepath or stream. The keyword argument gzip (default false) determines if the output is gzipped.

source
BioStructures.writepdbMethod
writepdb(output, element, atom_selectors...)

Write a StructuralElementOrList to a Protein Data Bank (PDB) format file or output stream.

Only ATOM, HETATM, MODEL and ENDMDL records are written - there is no header and there are no TER records. Atom selector functions can be given as additional arguments - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default true) determines whether to return all copies of disordered residues and atoms.

source
BioStructures.x!Method
x!(at, val)

Set the x coordinate of an AbstractAtom to val.

For DisorderedAtoms only the default atom is updated.

source
BioStructures.y!Method
y!(at, val)

Set the y coordinate of an AbstractAtom to val.

For DisorderedAtoms only the default atom is updated.

source
BioStructures.z!Method
z!(at, val)

Set the z coordinate of an AbstractAtom to val.

For DisorderedAtoms only the default atom is updated.

source
MMTF.writemmtfMethod
writemmtf(output, element, atom_selectors...; gzip=false)
+writemmtf(output, mmtf_dict; gzip=false)

Write a StructuralElementOrList or a MMTFDict to a MMTF file or output stream.

Atom selector functions can be given as additional arguments - only atoms that return true from all the functions are retained. The keyword argument expand_disordered (default true) determines whether to return all copies of disordered residues and atoms. The keyword argument gzip (default false) determines if the file should be gzipped.

source
diff --git a/dev/documentation/index.html b/dev/documentation/index.html index edeff31..2f0cf36 100644 --- a/dev/documentation/index.html +++ b/dev/documentation/index.html @@ -3424,4 +3424,4 @@ var viewer = $3Dmol.viewers['3dmol_viewer_2']; viewer.render(); }); -

See the Bio3DView.jl tutorial for more information. BioMakie.jl can also be used to visualise BioStructures objects.

Other packages in the Julia ecosystem that deal with structural bioinformatics or related fields include:

+

See the Bio3DView.jl tutorial for more information. BioMakie.jl can also be used to visualise BioStructures objects.

Other packages in the Julia ecosystem that deal with structural bioinformatics or related fields include:

diff --git a/dev/examples/index.html b/dev/examples/index.html index bfa3da4..393e24b 100644 --- a/dev/examples/index.html +++ b/dev/examples/index.html @@ -73,4 +73,4 @@ describe(df) # Show summary mean(df.tempfactor) # Column-wise operations sort(df, :x) # Sorting -CSV.write("1ALW.csv", df) # CSV file writing +CSV.write("1ALW.csv", df) # CSV file writing diff --git a/dev/index.html b/dev/index.html index 1e98086..c4d725d 100644 --- a/dev/index.html +++ b/dev/index.html @@ -1,2 +1,2 @@ -Home · BioStructures.jl

BioStructures.jl

Latest Release:

Latest release License Documentation stable BioJulia maintainer: jgreener64

Development status:

Build status codecov.io Documentation dev

Description

BioStructures provides functionality to read, write and manipulate macromolecular structures, in particular proteins. Protein Data Bank (PDB), mmCIF and MMTF format files can be read in to a hierarchical data structure. Spatial calculations and functions to access the PDB are also provided. It compares favourably in terms of performance to other PDB parsers - see some benchmarks online.

Installation

Install BioStructures from the Julia package REPL, which can be accessed by pressing ] from the Julia REPL:

add BioStructures

See the documentation for information on how to use BioStructures.

Citation

If you use BioStructures, please cite the paper:

Greener JG, Selvaraj J and Ward BJ. BioStructures.jl: read, write and manipulate macromolecular structures in Julia, Bioinformatics 36(14):4206-4207 (2020) - link - PDF

Contributing and questions

We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.

If you have a question about contributing or using this package, you are encouraged to use the #biology channel of the Julia Slack or the Bio category of the Julia discourse site.

+Home · BioStructures.jl

BioStructures.jl

Latest Release:

Latest release License Documentation stable BioJulia maintainer: jgreener64

Development status:

Build status codecov.io Documentation dev

Description

BioStructures provides functionality to read, write and manipulate macromolecular structures, in particular proteins. Protein Data Bank (PDB), mmCIF and MMTF format files can be read in to a hierarchical data structure. Spatial calculations and functions to access the PDB are also provided. It compares favourably in terms of performance to other PDB parsers - see some benchmarks online.

Installation

Install BioStructures from the Julia package REPL, which can be accessed by pressing ] from the Julia REPL:

add BioStructures

See the documentation for information on how to use BioStructures.

Citation

If you use BioStructures, please cite the paper:

Greener JG, Selvaraj J and Ward BJ. BioStructures.jl: read, write and manipulate macromolecular structures in Julia, Bioinformatics 36(14):4206-4207 (2020) - link - PDF

Contributing and questions

We appreciate contributions from users including reporting bugs, fixing issues, improving performance and adding new features.

If you have a question about contributing or using this package, you are encouraged to use the #biology channel of the Julia Slack or the Bio category of the Julia discourse site.