Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Avoid double free in distributed tree #369

Merged
merged 3 commits into from
Sep 17, 2020

Conversation

dalg24
Copy link
Contributor

@dalg24 dalg24 commented Sep 11, 2020

Proposed resolution for #368
Stores the duplicated comm in a smart pointer with a custom deleter.

Copy link
Contributor

@aprokop aprokop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see how it resolves overlapping contexts from multiple copies of the same tree, see comment.

@masterleinad
Copy link
Collaborator

Multiple copies would now share the same communicator. What if both of them send/receive messages at the same time? I suspect that would result in potential conflicts.

@dalg24
Copy link
Contributor Author

dalg24 commented Sep 11, 2020

I don't see how it resolves overlapping contexts from multiple copies of the same tree, see comment.

I saw your comment. This is indeed not addressing the situation you describe but the solution I propose is an improvement over the status quo. Copies may occur without the user having the intention to search tree in multiple contexts.

@dalg24
Copy link
Contributor Author

dalg24 commented Sep 11, 2020

From #368 (comment)

The fix is straightforward: copy constructor and copy assignment operators must duplicate the communicator.

I was going for compiler generated copy constructor and copy assignment operators but I am fine with the alternate solution you propose.
On second thought I am not convinced we need to support this.

Copy link
Contributor

@aprokop aprokop left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I cannot think of a better way to achieve the goal within the scope of this PR.

@aprokop
Copy link
Contributor

aprokop commented Sep 11, 2020

#include <shared>, I guess

Comment on lines 122 to 131
_comm_ptr.reset(
[comm]() {
auto p = std::make_unique<MPI_Comm>();
MPI_Comm_dup(comm, p.get());
return p.release();
}(),
[](MPI_Comm *p) {
MPI_Comm_free(p);
delete p;
});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comments explaining the reasons behind this complication, including what problems it is trying to solve.

@dalg24
Copy link
Contributor Author

dalg24 commented Sep 14, 2020

Retest this please

@dalg24
Copy link
Contributor Author

dalg24 commented Sep 17, 2020

Do we really care to wait for the HIP build to be fixed?

@masterleinad masterleinad merged commit 5642410 into arborx:master Sep 17, 2020
@masterleinad masterleinad deleted the double_free_mpi_comm branch September 17, 2020 19:51
@masterleinad
Copy link
Collaborator

Blame me if there are any warnings we were missing.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants