Chain.jl #28

Open: wants to merge 44 commits into base: main
ea40dc9
made a change
ana-bblima Nov 19, 2024
b9b3c65
undid the change
ana-bblima Nov 19, 2024
6536de6
include Chain.jl and export Chain, eachchain
ana-bblima Nov 19, 2024
e59ed9d
first implementation of Chain iterator
ana-bblima Nov 19, 2024
0214299
adding comments in code (examples)
ana-bblima Nov 20, 2024
3c41290
new_examples
ana-bblima Nov 20, 2024
3647b4e
testitem used to test the code
ana-bblima Nov 20, 2024
3c61bcf
final version
ana-bblima Nov 20, 2024
63b4813
protein_test.pdb
ana-bblima Nov 20, 2024
f57d9f8
fix new pdb path
ana-bblima Nov 20, 2024
639d600
removed "" in the path of the the pdb
ana-bblima Nov 20, 2024
cbc6dfb
expport chains in Main.PDBTools
ana-bblima Nov 20, 2024
2696c71
removing $ fro comments
ana-bblima Nov 21, 2024
308c50f
final_version
ana-bblima Nov 21, 2024
7cd321a
Merge branch 'main' into chains
ana-bblima Nov 22, 2024
8ac108b
remove extra space on docs
ana-bblima Nov 22, 2024
147ffa7
testing mass command in Residue.jl
ana-bblima Nov 22, 2024
55e43ce
correct mass value in the tests and function documentation
ana-bblima Nov 22, 2024
93168e3
function documentation and more tests
ana-bblima Nov 22, 2024
f61fc4b
documentation of chain and eachchain
ana-bblima Nov 26, 2024
1223f15
function documentation for eachchain and more tests
ana-bblima Nov 26, 2024
2700615
removing inappropriate dependencys
ana-bblima Nov 26, 2024
53f2b4a
mass value in test fixed
ana-bblima Nov 26, 2024
ec79352
test item fixed
ana-bblima Nov 26, 2024
64b770d
Merge branch 'main' into chains
ana-bblima Nov 26, 2024
0a8d908
another throw argument error for test_item
ana-bblima Nov 27, 2024
e6227e0
remove LiveServer wrong dependency
ana-bblima Nov 27, 2024
0fd9424
improve error messages of getindex
ana-bblima Dec 5, 2024
0f8e812
Merge branch 'm3g:main' into chains
ana-bblima Dec 5, 2024
25253db
fix errors
ana-bblima Dec 5, 2024
38f8ae2
testing the @show messages in the code
ana-bblima Dec 6, 2024
0aae70a
created a new model for the pdb
ana-bblima Dec 6, 2024
98c6eb0
rename CHAINSPDB create a second model
ana-bblima Dec 6, 2024
49c957f
changed the path to chains.pdb
ana-bblima Dec 6, 2024
8798474
removing different segment name
ana-bblima Dec 6, 2024
41a35ec
funtion last implemented and new documentation
ana-bblima Dec 6, 2024
cabb40a
new examples and better descriptions
ana-bblima Dec 6, 2024
e561946
fixing documentation
ana-bblima Dec 6, 2024
4d21f36
documentation of function eachchain
ana-bblima Dec 7, 2024
e634112
modification in function documentations
ana-bblima Dec 7, 2024
e4e25f1
removing space in documentation
ana-bblima Dec 7, 2024
e083c5c
beter function documentation for eachchain
ana-bblima Dec 7, 2024
f21ad4a
modifying atom properties example
ana-bblima Dec 7, 2024
0032df5
better documentation for eachresidue
ana-bblima Dec 7, 2024
143 changes: 137 additions & 6 deletions docs/src/selections.md
@@ -185,7 +185,7 @@ input, and returns `true` or `false` depending on the conditions required for th

## Iterate over residues (or molecules)

The `eachresidue` iterator allows iteration over the resiudes of a structure (in PDB files distinct molecules are associated to different residues, thus this iterates similarly over the molecules of a structure). For example:
The `eachresidue` iterator enables iteration over the residues of a structure. In PDB files, distinct molecules are often treated as separate residues, so this iterator can be used to iterate over the molecules within a structure. For example:

```jldoctest
julia> using PDBTools
@@ -198,8 +198,12 @@ julia> count(atom -> resname(atom) == "ALA", protein)
julia> count(res -> resname(res) == "ALA", eachresidue(protein))
1
```
Here, the first `count` counts the number of atoms with the residue name "ALA", while the second uses `eachresidue` to count the number of residues named "ALA". This highlights the distinction between residue-level and atom-level operations.

### Collecting Residues into a Vector

Residues produced by `eachresidue` can be collected into a vector for further processing:

The result of the iterator can also be collected, with:
```jldoctest
julia> using PDBTools

@@ -220,9 +224,14 @@ julia> residues[1]
12 O ALA A 1 1 -7.083 -13.048 -7.303 1.00 0.00 1 PROT 12
```

These residue vector *do not* copy the data from the original atom vector. Therefore, changes performed on these vectors will be reflected on the original data.
### Key Note on Residue Vectors

Residue vectors *do not* create copies of the original atom data. This means that any changes made to the residue vector will directly modify the corresponding data in the original atom vector.
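
The aliasing described above can be illustrated without PDBTools. The sketch below is a toy model: `ToyAtom` is a hypothetical mutable type and a plain `@view` plays the role of a residue. It is not the PDBTools implementation, only a demonstration of why writes made through a non-copying container reach the parent vector.

```julia
# Toy illustration only: `ToyAtom` is a hypothetical stand-in, not a PDBTools type.
mutable struct ToyAtom
    name::String
    resname::String
    occup::Float64
end

atoms = [
    ToyAtom("N", "ALA", 1.0),
    ToyAtom("CA", "ALA", 1.0),
    ToyAtom("N", "GLY", 1.0),
]

# A view over the first two atoms plays the role of a residue: no data is copied.
toy_residue = @view atoms[1:2]

# Writing through the view mutates the atoms stored in the parent vector.
for atom in toy_residue
    atom.occup = 0.5
end

occupancies = [atom.occup for atom in atoms]
println(occupancies)  # [0.5, 0.5, 1.0]
```

Only the two atoms covered by the view change; the third atom in the parent vector keeps its original occupancy.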

### Iterating Over Atoms Within Residues

You can iterate over the atoms of one or more residues using nested loops. For example, to calculate the total mass of all atoms in residues named "ALA":

It is possible also to iterate over the atoms of one or more residue:
```julia-repl
julia> using PDBTools

@@ -239,14 +248,14 @@ julia> m_ALA = 0.
m_ALA
73.09488999999999
```
Which, in this simple example, results in the same as:
This method produces the same result as the more concise approaches:

```julia-repl
julia> sum(mass(at) for at in protein if resname(at) == "ALA" )
73.09488999999999
```

or
Or, alternatively:

```julia-repl
julia> sum(mass(res) for res in eachresidue(protein) if resname(res) == "ALA" )
@@ -260,6 +269,128 @@ resname
residuename
```

## Iterate over chains

The `eachchain` iterator in PDBTools allows users to iterate over the chains in a PDB structure. A PDB file may contain multiple protein chains, and in some cases, it may also include different models of the same protein. This iterator simplifies operations involving individual chains.


```jldoctest
julia> using PDBTools

julia> ats = read_pdb(PDBTools.CHAINSPDB);

julia> chain.(eachchain(ats)) # Retrieve the names of all chains in the structure
4-element Vector{InlineStrings.String3}:
"A"
"B"
"C"
"A"

julia> model.(eachchain(ats)) # Retrieve the model numbers associated with each chain
4-element Vector{Int32}:
1
1
1
2

julia> chain_A1 = first(eachchain(ats)); # Access the first chain in the iterator

julia> resname.(eachresidue(chain_A1)) # Retrieve residue names for chain A in model 1
3-element Vector{InlineStrings.String7}:
"ASP"
"GLN"
"LEU"

julia> chain_A2 = last(eachchain(ats)); # Access the last chain in the iterator

julia> resname.(eachresidue(chain_A2)) # Retrieve residue names for chain A in model 2
3-element Vector{InlineStrings.String7}:
"ASP"
"GLN"
"VAL"

```
In the example above, the broadcast call `chain.(eachchain(ats))` retrieves the names of all chains in the structure, while `model.(eachchain(ats))` lists the model number of each chain. This PDB structure contains two models of chain A: the third residue is leucine (LEU) in model 1 and valine (VAL) in model 2.
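
One way to picture what a chain iterator has to do is to split the atom list into maximal runs of consecutive atoms that share the same chain name and model number. The sketch below shows that general technique under stated assumptions. It is not the `eachchain` implementation, and `LabeledAtom` and `chain_ranges` are hypothetical names introduced only for illustration.

```julia
# Hypothetical stand-in for an atom record; not a PDBTools type.
struct LabeledAtom
    chain::String
    model::Int
end

# Return the index ranges of maximal runs of consecutive atoms
# that share both chain name and model number.
function chain_ranges(atoms::AbstractVector{LabeledAtom})
    ranges = UnitRange{Int}[]
    isempty(atoms) && return ranges
    start = firstindex(atoms)
    for i in (start + 1):lastindex(atoms)
        same = atoms[i].chain == atoms[i - 1].chain &&
               atoms[i].model == atoms[i - 1].model
        if !same
            # The run ended at the previous atom; record it and start a new one.
            push!(ranges, start:(i - 1))
            start = i
        end
    end
    push!(ranges, start:lastindex(atoms))  # close the final run
    return ranges
end

# Same layout as the example structure: chains A, B, C in model 1, then A in model 2.
labeled = [
    LabeledAtom("A", 1), LabeledAtom("A", 1),
    LabeledAtom("B", 1),
    LabeledAtom("C", 1), LabeledAtom("C", 1),
    LabeledAtom("A", 2),
]

ranges = chain_ranges(labeled)
labels = [(labeled[first(r)].chain, labeled[first(r)].model) for r in ranges]
println(ranges)
println(labels)
```

Because the model number takes part in the comparison, chain A of model 2 is reported as a separate run from chain A of model 1, which matches the four chains listed in the example.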

### Accessing Chains by Index

As seen in the previous example, the `first` and `last` functions give quick access to the first and last chains in the iterator, respectively. To access a chain by position, collect all chains into an array and use numerical indices.

```julia-repl
julia> using PDBTools

julia> ats = read_pdb(PDBTools.CHAINSPDB);

julia> chains = collect(eachchain(ats))
Array{Chain,1} with 3 chains.
Review comment (Member): Didn't we change this to `typeof(chain)` in the function, so that `Vector{Chain}` appears in the `show` output?


julia> chain_B = chains[2]
Chain of name B with 48 atoms.
index name resname chain resnum residue x y z occup beta model segname index_pdb
49 N ASP B 4 4 135.661 123.866 -22.311 1.00 0.00 1 ASYN 49
50 CA ASP B 4 4 136.539 123.410 -21.227 1.00 0.00 1 ASYN 50
51 C ASP B 4 4 137.875 122.934 -21.788 1.00 0.00 1 ASYN 51
94 HD22 LEU B 6 6 139.485 119.501 -16.418 1.00 0.00 1 ASYN 94
95 HD23 LEU B 6 6 138.780 120.216 -17.864 1.00 0.00 1 ASYN 95
96 O LEU B 6 6 141.411 117.975 -21.923 1.00 0.00 1 ASYN 96

```

### Modifying Atom Properties in a Chain

Changes made to the atoms of a chain are written directly to the corresponding atoms of the original structure.

In the example below, the `occup` and `beta` properties of all atoms of chain A in model 2 are set to 0.00. Inspecting the last chain of `ats` afterwards shows that the modification propagated to the parent atom vector.

```julia-repl
julia> using PDBTools

julia> ats = read_pdb(PDBTools.CHAINSPDB);

julia> last(eachchain(ats))
Chain of name A with 45 atoms.
index name resname chain resnum residue x y z occup beta model segname index_pdb
145 N ASP A 1 10 133.978 119.386 -23.646 1.00 0.00 2 ASYN 1
146 CA ASP A 1 10 134.755 118.916 -22.497 1.00 0.00 2 ASYN 2
147 C ASP A 1 10 135.099 117.439 -22.652 1.00 0.00 2 ASYN 3
187 HD22 VAL A 3 12 130.704 113.003 -27.586 1.00 0.00 2 ASYN 43
188 HD23 VAL A 3 12 130.568 111.868 -26.242 1.00 0.00 2 ASYN 44
189 O VAL A 3 12 132.066 112.711 -21.739 1.00 0.00 2 ASYN 45


julia> for chain in eachchain(ats)
           if name(chain) == "A" && model(chain) == 2
               for atom in chain
                   atom.occup = 0.00
                   atom.beta = 0.00
               end
           else continue
           end
       end

julia> last(eachchain(ats))
Chain of name A with 45 atoms.
index name resname chain resnum residue x y z occup beta model segname index_pdb
145 N ASP A 1 10 133.978 119.386 -23.646 0.00 0.00 2 ASYN 1
146 CA ASP A 1 10 134.755 118.916 -22.497 0.00 0.00 2 ASYN 2
147 C ASP A 1 10 135.099 117.439 -22.652 0.00 0.00 2 ASYN 3
187 HD22 VAL A 3 12 130.704 113.003 -27.586 0.00 0.00 2 ASYN 43
188 HD23 VAL A 3 12 130.568 111.868 -26.242 0.00 0.00 2 ASYN 44
189 O VAL A 3 12 132.066 112.711 -21.739 0.00 0.00 2 ASYN 45


```

Avoiding copies makes chain manipulation efficient, but every modification made through a chain changes the parent structure, so take care to avoid unintended edits.
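
When the original data must stay untouched, one option is to work on an independent copy. The sketch below uses a hypothetical mutable `Record` type rather than PDBTools atoms. It shows that Julia's `copy` of a vector is shallow (the elements are still shared), while `deepcopy` yields fully independent elements. Whether `deepcopy` is the preferred approach for PDBTools structures is an assumption, not something stated in this documentation.

```julia
# Hypothetical mutable record standing in for an atom; not a PDBTools type.
mutable struct Record
    occup::Float64
end

original = [Record(1.0), Record(1.0)]

shallow = copy(original)          # new vector, same element objects
independent = deepcopy(original)  # new vector, new element objects

shallow[1].occup = 0.0      # also changes original[1]
independent[2].occup = 0.0  # leaves original[2] untouched

println(original[1].occup)  # 0.0
println(original[2].occup)  # 1.0
```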

```@docs
Chain
eachchain
```

## Using VMD

[VMD](https://www.ks.uiuc.edu/Research/vmd/) is a very popular and