Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Modifying the returns of the neighbor finder #765

Merged
merged 2 commits into from
Jun 11, 2024

Conversation

pfebrer
Copy link
Contributor

@pfebrer pfebrer commented Apr 27, 2024

As discussed in #744, this PR modifies the return values of the neighbor finder so that the user has to ask explicitly for a certain quantity, making sure that they get what they want.

There is a general Interactions class, but the different ways in which the finder can work require slightly different manipulations, and I have created multiple classes to facilitate that the user notices what they have at hand.

For finding neighbors (atom-atom interactions), we have the following classes:

  • FullNeighborsList: Contains all the neighbor interactions in the system. Returned by find_neighbors if atoms is None.
  • PartialNeighborsList: Contains the neighbors for a list of atoms (neighbors can be outside that list). Returned by find_neighbors if atoms is not None.
  • UniqueNeighborsList: Contains all the neighbor interactions in the system, but only on one direction. Returned by find_unique_pairs. This is like the upper triangular part of the sparsity pattern/adjacency matrix.
  • AtomNeighborsList: Contains the neighbors for a single atom. This is returned when iterating over FullNeighborsList or PartialNeighborsList.

For finding atoms that are close to certain points in space we have:

  • CloseAtoms: Contains all the interactions between points and atoms. Returned by find_close.
  • PointCloseAtoms: Contains the atoms that are close to a certain point. Returned when iterating over CloseAtoms.

This is not finished, but before finishing the details I would like to get your opinion to see if you think it is on the right track or not.

One thing I noticed is that the FullNeighborsList and UniqueNeighborsList are very similar to a SparseAtoms object. I don't know if we should return that instead, or subclass SparseAtoms. Although with a neighbors list it is probably more convenient to have the matrix in COO format.

Here is an example of how one might get and use a FullNeighborsList:

import sisl
from sisl.geom import NeighborFinder

# Create a graphene sheet with a defect
graphene = sisl.geom.graphene().tile(6, 0).tile(6, 1)
graphene = graphene.remove(15).translate2uc()

# Initialize a finder
finder = NeighborFinder(graphene, R=1.5)

# Get the neighbors
# Here "neighs" is a FullNeighborsList
neighs = finder.find_neighbors()

# neighs has attributes like `ì`, `j`, `isc`, `j_sc`, `n_neighbors`
# Here we plot the graphene sheet, coloring the atoms by the number of neighbors they have
graphene.plot(
    axes="xy", nsc=[2, 2, 1], atoms_style={"color": neighs.n_neighbors}
).update_layout(
    title="Number of neighbors in the graphene sheet",
    height=700
).show()

# We can loop over it to get the neighbors of each atom
for neigh in neighs:
    # neigh here is an AtomNeighborsList containing
    # the neighbors of a given atom.
    # We can get the atom to which the neighbors belong
    neigh.atom
    # We can get the neighbors, or their isc
    neigh.j
    neigh.isc
    # Or the neighbors in supercell indices
    neigh.j_sc
    break

# Instead of looping we can also get the neighbors of a given atom directly
neighs[10]

# And we can plot the neighbors of a given atom. We color the atom in black and
# its neighbors in pink

# Atom for which we want to plot neighbors
at = 0

graphene.plot(
    axes="xy", 
    nsc=[2, 2, 1], 
    atoms_style=[
        {"atoms": at, "color": "black"},
        {"atoms": neighs[at].j, "color": "pink"}
    ]
).update_layout(
    title=f"Neighbors of atom {at}",
    height=700
).show()

newplot (63)
newplot (64)

src/sisl/geom/_neighbors/_finder.py Fixed Show resolved Hide resolved
Copy link
Owner

@zerothi zerothi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like this, however, I am a bit worried about the complexity of the different classes? Are there too many, should we coerce them into 1 with flags to denote their content?
What would a user expect, for generic functionality they would have to do lots of isinstance where this could possibly be leveraged by have_symmetric_neighbors or similar, I am not saying that is the way to go, but we should at least consider it?


@property
def j_sc(self):
lattice = Lattice(cell=[[1, 0, 0], [0, 1, 0], [0, 0, 1]], nsc=self.nsc)
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this won't work if geometry is using the wrong nsc, either this should only work when np.all(nsc<=self.geometry.nsc) or we shouldn't provide it.

I would accept a solution that raises an error of some sorts (probably RuntimeError) that is raised when the above isn't fulfilled.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm my idea with the raw finder is that all you need to know are the boundary conditions and the finder is the one that helps you determine nsc.

My idea was that users would do:

finder = NeighborFinder(geometry)

neighs = finder.find_neighbors()

geometry.set_nsc(neighs.nsc)

But perhaps the geometry stored in the Interactions class should be a copy of the provided one with the right nsc? So that they would take it like:

finder = NeighborFinder(geometry)

neighs = finder.find_neighbors()

geom_with_aux_cell = neighs.geometry

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand the niche caches, but if the finder is used for setting up matrices, then this is not clear. I don't know if the classes should simply fail if the nsc is not compatible...?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can understand the niche caches, but if the finder is used for setting up matrices, then this is not clear.

What is not clear?

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that j_sc cannot be used

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But how should it work? nsc is determined after finding the neighbors up to a certain R. So it is when you run the finder that you get nsc, not before.

So my view is that you should retreive nsc from the finder and then initialize a matrix with that.

Maybe the Neighbors class should have a method like:

def set_nsc(self, nsc=None):
    if nsc is None:
        # Compute nsc from the max of isc
    else:
        # Set nsc to whatever the user has passed (maybe have another method
        # that removes further interactions)

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know how it should work, but the ambiguity of doing something like:

geom = ...
H = Hamiltonian(geom)
neighbors = NeighborFinder(geom)
atomneighbor = neighbors[10].j_sc
H[geom.a2o(10), geom.a2o(atomneighbor.j_sc)] = ...

is not totally clear. And there is no way to catch it (except when j_sc is out of bounds!).
Either, the neighbor finder should respect the geometry's nsc and raise an error if something inconsistent is found.
Or we shouldn't allow the j_sc (even though I like it).

It also seems weird to create a new geometry based on the neighbors before creating stuff. (minor point).
It is one of those cases where it makes sense for us, but not for end-users.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What I would say in that case is that you shouldn't create the matrix until you know what the connectivity is.

I don't understand how expecting users to know what nsc should be makes it more logic/usable. In my opinion it should be possible to use the results of the neighbor finder to determine nsc. It is basically what Geometry.optimize_nsc does right now. If something it would make it more transparent for begginers to understand what nsc is.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should start by removing j_sc. Then we can always figure out how to do things. By doing this we force users to use the i, j and isc properties.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, makes sense

src/sisl/geom/_neighbors/_finder.py Outdated Show resolved Hide resolved
src/sisl/geom/_neighbors/_finder.py Outdated Show resolved Hide resolved
from sisl.typing import AtomsIndex
from sisl.utils import size_to_elements

from . import _operations


class Interactions:
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason these don't have an __iter__ method?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I don't know what __iter__ should do here 😅 This is just a base class that is not used directly (for now).

src/sisl/geom/_neighbors/_finder.py Fixed Show resolved Hide resolved
src/sisl/geom/_neighbors/_finder.py Outdated Show resolved Hide resolved
@pfebrer
Copy link
Contributor Author

pfebrer commented Apr 30, 2024

I really like this, however, I am a bit worried about the complexity of the different classes? Are there too many, should we coerce them into 1 with flags to denote their content?
What would a user expect, for generic functionality they would have to do lots of isinstance where this could possibly be leveraged by have_symmetric_neighbors or similar, I am not saying that is the way to go, but we should at least consider it?

The first implementation was actually a single class with flags, but the logic inside each method/property was kind of a mess. The code was not easy to read. Also depending on the thing stored, the functionality changes significantly and attributes with the same name mean different things. So I think it is clearer for the user if each type of return is a different class. Also there wasn't much interplay between the different flags really, so in the end you would have flags that are something like is_class_x which would be equivalent to the current isinstance(class_x).

@zerothi
Copy link
Owner

zerothi commented May 1, 2024

True... It might be fine like this!

@pfebrer
Copy link
Contributor Author

pfebrer commented May 1, 2024

I changed it so that the iterator works with __getitem__, to allow directly getting the neighbors of a given atom instead of requiring the user to loop. (code snippet in the description is updated)

Copy link
Owner

@zerothi zerothi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh and one final thing, we need to add these to the docs.
I would suggest something like this:

  • create a new folder docs/api/geom
  • then we can add the current docs/api/geom.rst there in another way
  • and create an docs/api/geom/index.rst with more details

Let me know if you need help!

@zerothi
Copy link
Owner

zerothi commented May 27, 2024

@pfebrer what should I help with here?

@pfebrer
Copy link
Contributor Author

pfebrer commented May 27, 2024

I can do it myself, but probably next week.

# of the unit cell, because it is not supported for now.
# IT SHOULD BE SUPPORTED IN THE FUTURE
with pytest.raises(ValueError):
n = NeighborFinder(geom, R=1.1)

Check notice

Code scanning / CodeQL

Unused local variable Note test

Variable n is not used.
Copy link

codecov bot commented Jun 3, 2024

Codecov Report

Attention: Patch coverage is 51.25000% with 117 lines in your changes missing coverage. Please review.

Project coverage is 87.28%. Comparing base (81dd25e) to head (48d4c02).
Report is 2 commits behind head on main.

Files Patch % Lines
src/sisl/geom/_neighbors/tests/test_finder.py 0.00% 96 Missing ⚠️
src/sisl/geom/_neighbors/_neighborlists.py 83.60% 20 Missing ⚠️
src/sisl/geom/_neighbors/_operations.py 0.00% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #765      +/-   ##
==========================================
- Coverage   87.35%   87.28%   -0.07%     
==========================================
  Files         396      397       +1     
  Lines       50752    50915     +163     
==========================================
+ Hits        44332    44439     +107     
- Misses       6420     6476      +56     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@pfebrer
Copy link
Contributor Author

pfebrer commented Jun 4, 2024

Done! Everything is there, including docs. You can reorganize the docs as you wish.

src/sisl/geom/_neighbors/_finder.py Fixed Show fixed Hide fixed
src/sisl/geom/_neighbors/_finder.py Fixed Show fixed Hide fixed
Signed-off-by: Nick Papior <[email protected]>
@zerothi zerothi merged commit 255efb1 into zerothi:main Jun 11, 2024
3 checks passed
@zerothi
Copy link
Owner

zerothi commented Jun 11, 2024

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Change neighborfinder returns to a tuple of indices and supercell indices
2 participants