Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add atom container align and rmsd methods (Zube 2914). #173

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

ktbolt
Copy link

@ktbolt ktbolt commented Aug 24, 2017

Here's the alignment code. I've also included a pytest and notebook example.

I'd like to add visualization to the notebook.

@CLAassistant
Copy link

CLAassistant commented Aug 24, 2017

CLA assistant check
All committers have signed the CLA.

Copy link
Contributor

@avirshup avirshup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This code looks really, really good. Here are some picky style comments.

two paired sets of points.

Args:
src_points(numpy.ndarray): An Nx3 array of the 3D source points.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The docstring parser expects a space between variable name and the type, i.e.,
src_points (numpy.ndarray): [...]

# limitations under the License.

import numpy as np
from math import sqrt
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is super-picky, but: standard library imports should go above third-party module imports. Why? Because PEP8 says so

Args:
points1(numpy.ndarray): An Nx3 array of the 3D points.
points2(numpy.ndarray): An Nx3 array of the 3D points.
centered(Boolean): If true then the points are centered around (0,0,0).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These types should be spelled bool and float

"""

if points1.shape[0] != points2.shape[0]:
raise ValueError('The number of points for calculating the RMSD between the first array %d does not equal the number of points in the second array %d' % (num_points1, num_points2))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should reflow this onto multiple lines to keep everything within 100 columns

other (AtomContainer): The list of atoms to calculate the RMSD with.

Returns:
rmsd_error(Float): The root mean square deviation measuring the alignment error.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a Scalar[length] object

other (AtomContainer): The list of atoms to align to.

Returns:
rmsd_error(Float): The root mean square deviation measuring the alignment error.
Copy link
Contributor

@avirshup avirshup Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should return a Scalar[length] object (i.e., a number with length units)


# Calculate the 4x4 matrix transforming self_coords into other_coords.
rmsd_error, xform = rmsd_align(self_coords, other_coords)
return rmsd_error, xform
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method should apply the transformation before returning. Three cases to handle:

  1. if all the atoms in this container are part of the same molecule, then transform the entire molecule:
self.atoms[0].molecule.transform(xform)
  1. if all the atoms in this container don't belong to any molecule, just transform the atoms in this container
self.transform(xform)
  1. if all the atoms in this container belong to different molecules, raise ValueError

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this method should apply the transformation because what it transforms is based on the contents of the atom lists, a really significant operation, transforming the entire molecule, need to be aware of what your atom lists contain.

I don't see any reason to restrict the container atoms to a single molecule either, could be useful to position a group of molecules.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely see your point here. In terms of API philosophy, I still like the idea of a one-line method to automatically align an entire system, but we definitely need something that just calculates a transformation without applying it.

Before going further, let's meet up (offline) and write out some API use cases to see what makes sense.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, let's discuss offline.

@exports
def rmsd_align(src_points, dst_points):
""" Calculate the 4x4 transformation matrix that aligns a set of source points
to a set of destination of points.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Docstring text should be aligned at the same indentation level as the quotes, except for the first line. So this should look like this:

    """ Calculate the 4x4 transformation matrix that aligns a set of source points 
    to a set of destination of points.
    [...]
    """

Not like this:

    """ Calculate the 4x4 transformation matrix that aligns a set of source points 
        to a set of destination of points.
        [...]
    """

d = np.linalg.det(V) * np.linalg.det(U)

# Make sure the rotation preserves orientation (det = 1).
if d < 0.0:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍 👍 👍 👍 👍 👍

Copy link
Contributor

@avirshup avirshup left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Two more comments. One big thing (reversing myself on this) - the mathutils routines should be able to handle quantities with units in addition to bare numpy arrays (some sample code to deal with this in the comments below).

This makes the xform matrix tricky, since only some of its elements have units. My preference is probably to replace it with a class, but we should discuss.

ETA - here's my sketch for what the interface for an xform object could look like:

class Transform(object):
    """ A combined translation and rotation.

    Attributes:
        rotation (np.ndarray): 3x3 rotation matrix
        translation (Vector[len=3]): translation vector (including units if applicable)

    This is equivalent to the tranform described by the 4x4 matrix:
    [  self.rotation  self.translation]
    [  0              1               ]

    """
    def __init__(self, translation, rotation):
        self.translation = translation
        self.rotation = rotation

    def transform_coords(self, other):
        """ Transform the coordinates of the passed object and return them.

        Does not modify the passed object (use ``self.apply`` for that).
        This method is overloaded to accept _either_ a set of coordinates OR any atom container.

        Args:
            other (Matrix[shape=(*,3)] OR AtomContainer): object to transform the coordinates of

        Returns:
            Matrix[shape=(*,3)]: transformed coordinates.
        """

    def apply(self, other):
        """ Apply this transformation to the coordinates of the passed object.

        This will modify the coordinates of the passed object in-place.
        This method is overloaded to accept _either_ a set of coordinates OR any atom container.

        Args:
            other (Matrix[shape=(*,3)] OR AtomContainer): object to transform
        """

dst_center = dst_points.sum(axis=0) / num_points

# Calculate correlation matrix.
C = np.dot(np.transpose(dst_points[:]-dst_center), src_points[:]-src_center)
Copy link
Contributor

@avirshup avirshup Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. Would prefer to rename C to something like correlation_mat to improve readability.
  2. The variable names should be lowercase - it makes life easier for various introspection tools


@exports
def rmsd_align(src_points, dst_points):
""" Calculate the 4x4 transformation matrix that aligns a set of source points
Copy link
Contributor

@avirshup avirshup Aug 30, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After looking at the code, I'm going to contradict myself - I think the routines in mathutils should optionally handle both MdtQuantities (ie numbers with units) as well as numpy arrays. Trying to strip units every time we want to use these routines is just too error-prone.

Easiest way to do this is probably just strip units out at the beginning and add them at the end. We already do this in several places in the code, but all in a haphazard way (feel free to come up with a cleaner way to do it). Here's the basic idea though.

To strip units at the beginning:

if isinstance(src_points, mdt.units.MdtQuantity):
    units = mdt.units.get_units(src_points)
    p1 = units.value_of(src_points)
    p2 = units.value_of(dst_points)
else:
   units = None

To reapply units at the end:

if units is not None:
    rmsd_error *= units

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems that numpy plays well with pint units, no need to convert the input points. The only modifications needed is to set units and ensure matrix/vector multiply retains the correct units.

Check for units:

if isinstance(src_points, mdt.units.MdtQuantity):
    units = mdt.units.get_units(src_points)
else:
    units = 1.0

Multiply martix/vector product by units:

    cdiff = rot_mat.dot(src_points[i]-src_center)*units - (dst_points[i]-dst_center)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That works too!

… code to support units in align functions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants