Chain.jl #28

Open
wants to merge 44 commits into
base: main
Changes from 14 commits
44 commits
ea40dc9
made a change
ana-bblima Nov 19, 2024
b9b3c65
undid the change
ana-bblima Nov 19, 2024
6536de6
include Chain.jl and export Chain, eachchain
ana-bblima Nov 19, 2024
e59ed9d
first implementation of Chain iterator
ana-bblima Nov 19, 2024
0214299
adding comments in code (examples)
ana-bblima Nov 20, 2024
3c41290
new_examples
ana-bblima Nov 20, 2024
3647b4e
testitem used to test the code
ana-bblima Nov 20, 2024
3c61bcf
final version
ana-bblima Nov 20, 2024
63b4813
protein_test.pdb
ana-bblima Nov 20, 2024
f57d9f8
fix new pdb path
ana-bblima Nov 20, 2024
639d600
removed "" in the path of the the pdb
ana-bblima Nov 20, 2024
cbc6dfb
expport chains in Main.PDBTools
ana-bblima Nov 20, 2024
2696c71
removing $ fro comments
ana-bblima Nov 21, 2024
308c50f
final_version
ana-bblima Nov 21, 2024
7cd321a
Merge branch 'main' into chains
ana-bblima Nov 22, 2024
8ac108b
remove extra space on docs
ana-bblima Nov 22, 2024
147ffa7
testing mass command in Residue.jl
ana-bblima Nov 22, 2024
55e43ce
correct mass value in the tests and function documentation
ana-bblima Nov 22, 2024
93168e3
function documentation and more tests
ana-bblima Nov 22, 2024
f61fc4b
documentation of chain and eachchain
ana-bblima Nov 26, 2024
1223f15
function documentation for eachchain and more tests
ana-bblima Nov 26, 2024
2700615
removing inappropriate dependencys
ana-bblima Nov 26, 2024
53f2b4a
mass value in test fixed
ana-bblima Nov 26, 2024
ec79352
test item fixed
ana-bblima Nov 26, 2024
64b770d
Merge branch 'main' into chains
ana-bblima Nov 26, 2024
0a8d908
another throw argument error for test_item
ana-bblima Nov 27, 2024
e6227e0
remove LiveServer wrong dependency
ana-bblima Nov 27, 2024
0fd9424
improve error messages of getindex
ana-bblima Dec 5, 2024
0f8e812
Merge branch 'm3g:main' into chains
ana-bblima Dec 5, 2024
25253db
fix errors
ana-bblima Dec 5, 2024
38f8ae2
testing the @show messages in the code
ana-bblima Dec 6, 2024
0aae70a
created a new model for the pdb
ana-bblima Dec 6, 2024
98c6eb0
rename CHAINSPDB create a second model
ana-bblima Dec 6, 2024
49c957f
changed the path to chains.pdb
ana-bblima Dec 6, 2024
8798474
removing different segment name
ana-bblima Dec 6, 2024
41a35ec
funtion last implemented and new documentation
ana-bblima Dec 6, 2024
cabb40a
new examples and better descriptions
ana-bblima Dec 6, 2024
e561946
fixing documentation
ana-bblima Dec 6, 2024
4d21f36
documentation of function eachchain
ana-bblima Dec 7, 2024
e634112
modification in function documentations
ana-bblima Dec 7, 2024
e4e25f1
removing space in documentation
ana-bblima Dec 7, 2024
e083c5c
beter function documentation for eachchain
ana-bblima Dec 7, 2024
f21ad4a
modifying atom properties example
ana-bblima Dec 7, 2024
0032df5
better documentation for eachresidue
ana-bblima Dec 7, 2024
152 changes: 152 additions & 0 deletions src/Chain.jl
@@ -0,0 +1,152 @@
"""
    Chain(atoms::AbstractVector{<:Atom}, range::UnitRange{<:Integer})

View of a single chain within a vector of atoms. The chain identifier, model
number, and segment name are taken from the first atom in `range`. Use
`eachchain` to iterate over the chains of a structure.

### Examples

```julia-repl
julia> using PDBTools

julia> pdb = read_pdb(PDBTools.CHAINSPDB);

julia> for chain in eachchain(pdb)
           println(name(chain))
           println(length(collect(eachresidue(chain))))
           println(length(chain))
       end
A
3
48
B
3
48
C
3
48

```

"""
@kwdef struct Chain{T<:Atom,Vec<:AbstractVector{T}} <: AbstractVector{T}
    atoms::Vec
    range::UnitRange{Int}
    chain::String3
    model::Int32
    segname::String7
end

name(chain::Chain) = chain.chain
chain(chain::Chain) = chain.chain
model(chain::Chain) = chain.model
segname(chain::Chain) = chain.segname
mass(chain::Chain) = mass(@view chain.atoms[chain.range])

function Chain(atoms::AbstractVector{<:Atom}, range::UnitRange{<:Integer})
    i = first(range)
    # All atoms in the range must share the chain identifier of the first atom
    if any(atoms[j].chain != atoms[i].chain for j in range)
        error("Range $range does not correspond to a single chain.")
    end
    Chain(
        atoms = atoms,
        range = range,
        chain = chain(atoms[i]),
        model = model(atoms[i]),
        segname = segname(atoms[i]),
    )
end

Chain(atoms::AbstractVector{<:Atom}) = Chain(atoms, 1:length(atoms))

function Base.getindex(chain::Chain, i::Integer)
    if i <= 0 || i > length(chain)
        throw(ArgumentError("Index must be in 1:$(length(chain)). Attempted to access index $i."))
    end
    # Calculate the actual index in the atoms array
    atom_index = first(chain.range) + i - 1
    return chain.atoms[atom_index]
end

#
# Structure and function to define the eachchain iterator
#

struct EachChain{T<:AbstractVector{<:Atom}}
    atoms::T
end
eachchain(atoms::AbstractVector{<:Atom}) = EachChain(atoms)

# Collect chains default constructor
Base.collect(c::EachChain) = collect(Chain, c)
Base.length(chains::EachChain) = sum(1 for chain in chains)
Base.firstindex(chains::EachChain) = 1
Base.lastindex(chains::EachChain) = length(chains)

function Base.getindex(::EachChain, ::Integer)
    throw(ArgumentError("""\n
        The eachchain iterator does not support indexing.
        Use collect(eachchain(atoms)) to get an indexable list of chains.

        """))
end

Base.size(chain::Chain) = (length(chain.range),)
Base.length(chain::Chain) = length(chain.range)
Base.eltype(::Chain{T}) where {T} = T

# Iterate over chains of a structure
#
function Base.iterate(chains::EachChain, current_atom=firstindex(chains.atoms))
    current_atom > length(chains.atoms) && return nothing
    next_atom = current_atom + 1
    while next_atom <= length(chains.atoms) &&
          same_chain(chains.atoms[current_atom], chains.atoms[next_atom])
        next_atom += 1
    end
    return (Chain(chains.atoms, current_atom:next_atom-1), next_atom)
end

# Iterate over atoms of one chain
#
# The state is the position within the chain (1:length(chain)), so iteration
# does not depend on the atoms' `index` fields being contiguous.
function Base.iterate(chain::Chain, current_atom=1)
    current_atom > length(chain) && return nothing
    return (chain[current_atom], current_atom + 1)
end

function same_chain(atom1::Atom, atom2::Atom)
    atom1.chain == atom2.chain &&
        atom1.model == atom2.model &&
        atom1.segname == atom2.segname
end

# io show functions
#
function Base.show(io::IO, ::MIME"text/plain", chain::Chain)
    println(io, " Chain of name $(chain.chain) with $(length(chain)) atoms.")
    print_short_atom_list(io, @view chain.atoms[chain.range])
end

function Base.show(io::IO, chains::EachChain)
    print(io, " Iterator with $(length(chains)) chains.")
end

function Base.show(io::IO, ::MIME"text/plain", chains::AbstractVector{Chain})
    print(io, " Array{Chain,1} with $(length(chains)) chains.")
end

@testitem "Chain iterator" begin
    pdb = read_pdb(PDBTools.CHAINSPDB)
    chains = eachchain(pdb)
    @test Chain(pdb, 1:48).range == 1:48
    @test length(chains) == 3
    @test firstindex(chains) == 1
    @test lastindex(chains) == 3
    @test_throws ArgumentError chains[1]
    chains = collect(eachchain(pdb))
    @test name(chains[3]) == "C"
    @test index.(filter(at -> resname(at) == "ASP" && name(at) == "CA", chains[1])) == [2]
    @test length(findall(at -> resname(at) == "GLN", chains[1])) == 17
end
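
The grouping logic in `Base.iterate(chains::EachChain, ...)` above can be tried in isolation. The following is a minimal sketch, not the PR's implementation: `ToyAtom` and `ChainRanges` are hypothetical names introduced only for illustration, and the iterator yields plain index ranges instead of `Chain` objects. It also assumes one possible design choice, declaring `Base.SizeUnknown()` rather than computing `length` by walking the iterator.

```julia
# Hypothetical stand-in for PDBTools.Atom: only the fields used for grouping.
struct ToyAtom
    chain::String
    model::Int
    segname::String
end

# Two atoms are in the same chain if chain, model, and segment name all match.
same_chain(a::ToyAtom, b::ToyAtom) =
    a.chain == b.chain && a.model == b.model && a.segname == b.segname

# Hypothetical lazy iterator: yields one index range per run of consecutive
# atoms that belong to the same chain.
struct ChainRanges{V<:AbstractVector{ToyAtom}}
    atoms::V
end

# The number of chains is only known after scanning, so do not promise a length.
Base.IteratorSize(::Type{<:ChainRanges}) = Base.SizeUnknown()
Base.eltype(::Type{<:ChainRanges}) = UnitRange{Int}

function Base.iterate(it::ChainRanges, start::Int=firstindex(it.atoms))
    start > lastindex(it.atoms) && return nothing
    stop = start
    # Extend the range while the next atom is still in the same chain
    while stop < lastindex(it.atoms) && same_chain(it.atoms[start], it.atoms[stop+1])
        stop += 1
    end
    return (start:stop, stop + 1)
end

atoms = [ToyAtom(c, 1, "") for c in ["A", "A", "B", "B", "B", "C"]]
ranges = collect(ChainRanges(atoms))  # [1:2, 3:5, 6:6]
```

Because the state is just the index of the first atom of the next chain, the iterator allocates nothing until the caller collects it, which is the same property `eachchain` relies on.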

6 changes: 5 additions & 1 deletion src/PDBTools.jl
@@ -24,7 +24,8 @@ export read_pdb, write_pdb, getseq, wget, edit!, oneletter, threeletter, residue
export read_mmcif, write_mmcif
export Atom, printatom, index, index_pdb, name, beta, occup, charge, pdb_element
export add_custom_field
-export Residue, eachresidue, resname, residue, resnum, chain, model, segname
+export Residue, eachresidue, resname, residue, resnum, model, segname
+export Chain, eachchain, chain
export residue_ticks
export coor, maxmin, distance, closest
export element, mass, element_name, element_symbol, element_symbol_string
@@ -47,6 +48,8 @@ const SMALLPDB = joinpath(@__DIR__,"../test/small.pdb")
const SIRAHPDB = joinpath(@__DIR__,"../test/sirah.pdb")
const TESTCIF = joinpath(@__DIR__,"../test/1yn8.cif")
const SMALLCIF = joinpath(@__DIR__,"../test/small.cif")
+const CHAINSPDB = joinpath(@__DIR__,"../test/protein_test.pdb")

# Basic chemistry
include("./elements.jl")
@@ -57,6 +60,7 @@ include("./protein_residues.jl")
#
include("./Atom.jl")
include("./Residue.jl")
+include("./Chain.jl")

# Selection functions
include("./select.jl")
144 changes: 144 additions & 0 deletions test/protein_test.pdb
Original file line number Diff line number Diff line change
@@ -0,0 +1,144 @@
ATOM 1396 N ASP A 98 133.978 119.386 -23.646 1.00 0.00 N
ATOM 1397 CA ASP A 98 134.755 118.916 -22.497 1.00 0.00 C
ATOM 1398 C ASP A 98 135.099 117.439 -22.652 1.00 0.00 C
ATOM 1399 O ASP A 98 135.002 116.664 -21.701 1.00 0.00 O
ATOM 1400 CB ASP A 98 133.952 119.125 -21.211 1.00 0.00 C
ATOM 1401 CG ASP A 98 134.819 118.821 -19.995 1.00 0.00 C
ATOM 1402 OD1 ASP A 98 135.887 118.263 -20.178 1.00 0.00 O
ATOM 1403 OD2 ASP A 98 134.399 119.150 -18.897 1.00 0.00 O
ATOM 1404 H ASP A 98 133.398 118.757 -24.122 1.00 0.00 H
ATOM 1405 HA ASP A 98 135.675 119.481 -22.431 1.00 0.00 H
ATOM 1406 HB2 ASP A 98 133.613 120.150 -21.164 1.00 0.00 H
ATOM 1407 HB3 ASP A 98 133.096 118.466 -21.214 1.00 0.00 H
ATOM 1408 N GLN A 99 135.506 117.058 -23.857 1.00 0.00 N
ATOM 1409 CA GLN A 99 135.867 115.673 -24.128 1.00 0.00 C
ATOM 1410 C GLN A 99 134.754 114.734 -23.677 1.00 0.00 C
ATOM 1411 O GLN A 99 135.013 113.632 -23.195 1.00 0.00 O
ATOM 1412 CB GLN A 99 137.163 115.320 -23.397 1.00 0.00 C
ATOM 1413 CG GLN A 99 138.326 116.095 -24.019 1.00 0.00 C
ATOM 1414 CD GLN A 99 139.583 115.918 -23.175 1.00 0.00 C
ATOM 1415 OE1 GLN A 99 139.495 115.711 -21.965 1.00 0.00 O
ATOM 1416 NE2 GLN A 99 140.754 115.992 -23.744 1.00 0.00 N
ATOM 1417 H GLN A 99 135.568 117.720 -24.575 1.00 0.00 H
ATOM 1418 HA GLN A 99 136.021 115.552 -25.189 1.00 0.00 H
ATOM 1419 HB2 GLN A 99 137.071 115.584 -22.354 1.00 0.00 H
ATOM 1420 HB3 GLN A 99 137.350 114.261 -23.485 1.00 0.00 H
ATOM 1421 HG2 GLN A 99 138.509 115.723 -25.017 1.00 0.00 H
ATOM 1422 HG3 GLN A 99 138.073 117.144 -24.068 1.00 0.00 H
ATOM 1423 HE21 GLN A 99 140.821 116.159 -24.707 1.00 0.00 H
ATOM 1424 HE22 GLN A 99 141.567 115.880 -23.209 1.00 0.00 H
ATOM 1425 N LEU A 100 133.513 115.179 -23.842 1.00 0.00 N
ATOM 1426 CA LEU A 100 132.363 114.371 -23.454 1.00 0.00 C
ATOM 1427 C LEU A 100 132.573 113.786 -22.054 1.00 0.00 C
ATOM 1429 CB LEU A 100 132.160 113.235 -24.477 1.00 0.00 C
ATOM 1430 CG LEU A 100 131.420 113.767 -25.709 1.00 0.00 C
ATOM 1431 CD1 LEU A 100 132.259 114.859 -26.377 1.00 0.00 C
ATOM 1432 CD2 LEU A 100 131.192 112.622 -26.700 1.00 0.00 C
ATOM 1433 H LEU A 100 133.368 116.065 -24.236 1.00 0.00 H
ATOM 1434 HA LEU A 100 131.482 114.999 -23.440 1.00 0.00 H
ATOM 1435 HB2 LEU A 100 133.123 112.849 -24.777 1.00 0.00 H
ATOM 1436 HB3 LEU A 100 131.579 112.440 -24.030 1.00 0.00 H
ATOM 1437 HG LEU A 100 130.467 114.180 -25.407 1.00 0.00 H
ATOM 1438 HD11 LEU A 100 132.207 115.763 -25.787 1.00 0.00 H
ATOM 1439 HD12 LEU A 100 131.877 115.053 -27.368 1.00 0.00 H
ATOM 1440 HD13 LEU A 100 133.287 114.532 -26.445 1.00 0.00 H
ATOM 1441 HD21 LEU A 100 132.142 112.186 -26.972 1.00 0.00 H
ATOM 1442 HD22 LEU A 100 130.704 113.003 -27.586 1.00 0.00 H
ATOM 1443 HD23 LEU A 100 130.568 111.868 -26.242 1.00 0.00 H
ATOM 1444 O LEU A 100 132.066 112.711 -21.739 1.00 0.00 N
ATOM 3413 N ASP B 98 135.661 123.866 -22.311 1.00 0.00 N
ATOM 3414 CA ASP B 98 136.539 123.410 -21.227 1.00 0.00 C
ATOM 3415 C ASP B 98 137.875 122.934 -21.788 1.00 0.00 C
ATOM 3416 O ASP B 98 138.937 123.312 -21.295 1.00 0.00 O
ATOM 3417 CB ASP B 98 135.870 122.262 -20.468 1.00 0.00 C
ATOM 3418 CG ASP B 98 136.596 122.007 -19.149 1.00 0.00 C
ATOM 3419 OD1 ASP B 98 137.243 122.919 -18.662 1.00 0.00 O
ATOM 3420 OD2 ASP B 98 136.491 120.901 -18.644 1.00 0.00 O
ATOM 3421 H ASP B 98 135.148 123.203 -22.818 1.00 0.00 H
ATOM 3422 HA ASP B 98 136.717 124.226 -20.538 1.00 0.00 H
ATOM 3423 HB2 ASP B 98 134.840 122.519 -20.266 1.00 0.00 H
ATOM 3424 HB3 ASP B 98 135.903 121.370 -21.070 1.00 0.00 H
ATOM 3425 N GLN B 99 137.811 122.095 -22.820 1.00 0.00 N
ATOM 3426 CA GLN B 99 139.023 121.570 -23.437 1.00 0.00 C
ATOM 3427 C GLN B 99 139.971 121.029 -22.376 1.00 0.00 C
ATOM 3428 O GLN B 99 141.133 121.428 -22.308 1.00 0.00 O
ATOM 3429 CB GLN B 99 139.724 122.666 -24.240 1.00 0.00 C
ATOM 3430 CG GLN B 99 138.879 123.029 -25.463 1.00 0.00 C
ATOM 3431 CD GLN B 99 139.565 124.134 -26.261 1.00 0.00 C
ATOM 3432 OE1 GLN B 99 140.793 124.229 -26.259 1.00 0.00 O
ATOM 3433 NE2 GLN B 99 138.845 124.980 -26.949 1.00 0.00 N
ATOM 3434 H GLN B 99 136.937 121.824 -23.169 1.00 0.00 H
ATOM 3435 HA GLN B 99 138.754 120.767 -24.107 1.00 0.00 H
ATOM 3436 HB2 GLN B 99 139.852 123.540 -23.619 1.00 0.00 H
ATOM 3437 HB3 GLN B 99 140.690 122.310 -24.565 1.00 0.00 H
ATOM 3438 HG2 GLN B 99 138.760 122.154 -26.089 1.00 0.00 H
ATOM 3439 HG3 GLN B 99 137.908 123.371 -25.140 1.00 0.00 H
ATOM 3440 HE21 GLN B 99 137.869 124.902 -26.953 1.00 0.00 H
ATOM 3441 HE22 GLN B 99 139.283 125.691 -27.461 1.00 0.00 H
ATOM 3442 N LEU B 100 139.468 120.125 -21.545 1.00 0.00 N
ATOM 3443 CA LEU B 100 140.285 119.547 -20.491 1.00 0.00 C
ATOM 3444 C LEU B 100 141.525 118.881 -21.099 1.00 0.00 C
ATOM 3446 CB LEU B 100 139.444 118.514 -19.693 1.00 0.00 C
ATOM 3447 CG LEU B 100 139.935 118.406 -18.231 1.00 0.00 C
ATOM 3448 CD1 LEU B 100 141.435 118.086 -18.212 1.00 0.00 C
ATOM 3449 CD2 LEU B 100 139.658 119.729 -17.458 1.00 0.00 C
ATOM 3450 H LEU B 100 138.533 119.845 -21.642 1.00 0.00 H
ATOM 3451 HA LEU B 100 140.601 120.335 -19.832 1.00 0.00 H
ATOM 3452 HB2 LEU B 100 138.409 118.819 -19.698 1.00 0.00 H
ATOM 3453 HB3 LEU B 100 139.523 117.540 -20.162 1.00 0.00 H
ATOM 3454 HG LEU B 100 139.403 117.598 -17.748 1.00 0.00 H
ATOM 3455 HD11 LEU B 100 141.708 117.703 -17.239 1.00 0.00 H
ATOM 3456 HD12 LEU B 100 141.998 118.983 -18.417 1.00 0.00 H
ATOM 3457 HD13 LEU B 100 141.656 117.342 -18.965 1.00 0.00 H
ATOM 3458 HD21 LEU B 100 140.507 120.396 -17.534 1.00 0.00 H
ATOM 3459 HD22 LEU B 100 139.485 119.501 -16.418 1.00 0.00 H
ATOM 3460 HD23 LEU B 100 138.780 120.216 -17.864 1.00 0.00 H
ATOM 3461 O LEU B 100 141.411 117.975 -21.923 1.00 0.00 N
ATOM 5430 N ASP C 98 137.110 128.213 -20.946 1.00 0.00 N
ATOM 5431 CA ASP C 98 137.938 127.668 -19.879 1.00 0.00 C
ATOM 5432 C ASP C 98 137.485 128.196 -18.521 1.00 0.00 C
ATOM 5433 O ASP C 98 136.590 129.037 -18.436 1.00 0.00 O
ATOM 5434 CB ASP C 98 139.403 128.039 -20.116 1.00 0.00 C
ATOM 5435 CG ASP C 98 139.856 127.525 -21.479 1.00 0.00 C
ATOM 5436 OD1 ASP C 98 139.610 126.365 -21.765 1.00 0.00 O
ATOM 5437 OD2 ASP C 98 140.444 128.300 -22.217 1.00 0.00 O
ATOM 5438 H ASP C 98 136.686 127.602 -21.581 1.00 0.00 H
ATOM 5439 HA ASP C 98 137.847 126.594 -19.884 1.00 0.00 H
ATOM 5440 HB2 ASP C 98 139.510 129.114 -20.083 1.00 0.00 H
ATOM 5441 HB3 ASP C 98 140.013 127.593 -19.346 1.00 0.00 H
ATOM 5442 N GLN C 99 138.113 127.692 -17.458 1.00 0.00 N
ATOM 5443 CA GLN C 99 137.776 128.110 -16.095 1.00 0.00 C
ATOM 5444 C GLN C 99 139.005 128.023 -15.195 1.00 0.00 C
ATOM 5445 O GLN C 99 139.781 127.070 -15.279 1.00 0.00 O
ATOM 5446 CB GLN C 99 136.663 127.220 -15.538 1.00 0.00 C
ATOM 5447 CG GLN C 99 136.243 127.729 -14.157 1.00 0.00 C
ATOM 5448 CD GLN C 99 135.039 126.940 -13.652 1.00 0.00 C
ATOM 5449 OE1 GLN C 99 134.044 126.799 -14.363 1.00 0.00 O
ATOM 5450 NE2 GLN C 99 135.070 126.414 -12.457 1.00 0.00 N
ATOM 5451 H GLN C 99 138.817 127.024 -17.592 1.00 0.00 H
ATOM 5452 HA GLN C 99 137.428 129.136 -16.111 1.00 0.00 H
ATOM 5453 HB2 GLN C 99 135.813 127.246 -16.206 1.00 0.00 H
ATOM 5454 HB3 GLN C 99 137.023 126.206 -15.451 1.00 0.00 H
ATOM 5455 HG2 GLN C 99 137.064 127.609 -13.467 1.00 0.00 H
ATOM 5456 HG3 GLN C 99 135.981 128.774 -14.225 1.00 0.00 H
ATOM 5457 HE21 GLN C 99 135.863 126.527 -11.893 1.00 0.00 H
ATOM 5458 HE22 GLN C 99 134.299 125.908 -12.124 1.00 0.00 H
ATOM 5459 N LEU C 100 139.181 129.025 -14.340 1.00 0.00 N
ATOM 5460 CA LEU C 100 140.324 129.056 -13.434 1.00 0.00 C
ATOM 5461 C LEU C 100 140.292 127.852 -12.498 1.00 0.00 C
ATOM 5463 CB LEU C 100 140.300 130.360 -12.621 1.00 0.00 C
ATOM 5464 CG LEU C 100 141.489 130.423 -11.641 1.00 0.00 C
ATOM 5465 CD1 LEU C 100 142.824 130.362 -12.412 1.00 0.00 C
ATOM 5466 CD2 LEU C 100 141.407 131.741 -10.858 1.00 0.00 C
ATOM 5467 H LEU C 100 138.532 129.758 -14.321 1.00 0.00 H
ATOM 5468 HA LEU C 100 141.229 129.024 -14.019 1.00 0.00 H
ATOM 5469 HB2 LEU C 100 140.351 131.201 -13.298 1.00 0.00 H
ATOM 5470 HB3 LEU C 100 139.377 130.412 -12.062 1.00 0.00 H
ATOM 5471 HG LEU C 100 141.434 129.595 -10.952 1.00 0.00 H
ATOM 5472 HD11 LEU C 100 142.739 130.920 -13.333 1.00 0.00 H
ATOM 5473 HD12 LEU C 100 143.064 129.334 -12.634 1.00 0.00 H
ATOM 5474 HD13 LEU C 100 143.614 130.786 -11.807 1.00 0.00 H
ATOM 5475 HD21 LEU C 100 140.514 131.742 -10.249 1.00 0.00 H
ATOM 5476 HD22 LEU C 100 141.370 132.570 -11.550 1.00 0.00 H
ATOM 5477 HD23 LEU C 100 142.275 131.839 -10.225 1.00 0.00 H
ATOM 5478 O LEU C 100 141.311 127.197 -12.279 1.00 0.00 N