Errors with incast, bcast lane #12

Open
csehydrogen opened this issue Nov 10, 2020 · 1 comment
@csehydrogen

I am trying to run an allreduce between two processes using UCG without MPI.
I was able to create the UCG context, the UCP worker, and the UCG group by exchanging worker addresses between the processes.
However, creating a ucg_coll_h with ucg_coll_allreduce_init produces a series of incast/bcast errors and then aborts with a double free:

[...] select.c:517  UCX  ERROR cannot add incast lane - reached limit (6)
[...] select.c:517  UCX  ERROR cannot add bcast lane - reached limit (6)
[...] ucg_plan.c:388  UCX  WARN  No transports with native broadcast support were found, falling back to P2P transports (slower)
[...] ucg_plan.c:380  UCX  WARN  No transports with native incast support were found, falling back to P2P transports (slower)
free(): double free detected in tcache 2
Aborted (core dumped)
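
For readers unfamiliar with the out-of-band step mentioned above (exchanging worker addresses without MPI), here is a minimal sketch of one way to move an opaque address blob over a stream socket. It is not taken from the attached ucx_test.zip. The helper names (write_all, read_all, send_blob, recv_blob, exchange_demo), the 4-byte length prefix, and the 1 MiB cap are assumptions made for illustration. In a real program the blob would presumably come from ucp_worker_get_address and the socket would be a TCP connection to the root; here a local socketpair and a placeholder string stand in for both.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/socket.h>

#define MAX_BLOB_LEN (1u << 20) /* hypothetical sanity cap on a received blob */

/* Write exactly len bytes, retrying on short writes. Returns 0 on success. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read exactly len bytes, retrying on short reads. Returns 0 on success. */
int read_all(int fd, void *buf, size_t len)
{
    char *p = buf;
    while (len > 0) {
        ssize_t n = read(fd, p, len);
        if (n <= 0) return -1;
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Send an opaque blob as a 4-byte big-endian length followed by the payload. */
int send_blob(int fd, const void *addr, uint32_t len)
{
    uint32_t be_len = htonl(len);
    if (write_all(fd, &be_len, sizeof(be_len)) != 0) return -1;
    return write_all(fd, addr, len);
}

/* Receive a blob written by send_blob. The caller must free(*addr_out). */
int recv_blob(int fd, void **addr_out, uint32_t *len_out)
{
    uint32_t be_len;
    if (read_all(fd, &be_len, sizeof(be_len)) != 0) return -1;
    uint32_t len = ntohl(be_len);
    if (len > MAX_BLOB_LEN) return -1;
    void *buf = malloc(len ? len : 1);
    if (buf == NULL) return -1;
    if (read_all(fd, buf, len) != 0) { free(buf); return -1; }
    *addr_out = buf;
    *len_out = len;
    return 0;
}

/* Round-trip a placeholder "address" over a local socketpair.
 * Returns 0 if the received bytes match what was sent. */
int exchange_demo(void)
{
    int sv[2];
    const char fake_addr[] = "fake-worker-address"; /* stand-in for a real worker address */
    void *got = NULL;
    uint32_t got_len = 0;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0) return -1;
    int rc = send_blob(sv[0], fake_addr, (uint32_t)sizeof(fake_addr));
    if (rc == 0) rc = recv_blob(sv[1], &got, &got_len);
    if (rc == 0) {
        assert(got != NULL);
        rc = (got_len == sizeof(fake_addr) &&
              memcmp(got, fake_addr, got_len) == 0) ? 0 : -1;
    }
    free(got); /* freed exactly once; free(NULL) is a no-op */
    close(sv[0]);
    close(sv[1]);
    return rc;
}
```

The length prefix matters because worker addresses are variable-length binary data, so the receiver cannot rely on a terminator to find the end of the blob.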

The attached file is a minimal working example for reproducing the problem.
ucx_test.zip

# host1 and host2 are connected by 1G Ethernet and 100G InfiniBand
$ unzip ucx_test.zip; cd ucx_test
$ make
# on host1
$ ./ucg_test 2 0 0 host1 12345 # 2 processes in total; this process's rank is 0; the root's rank is 0, at address host1:12345
# on host2
$ ./ucg_test 2 1 0 host1 12345 # 2 processes in total; this process's rank is 1; the root's rank is 0, at address host1:12345
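
To make the positional arguments above concrete, here is a rough sketch of how such a command line could be parsed and validated. It is not the code from the attachment. The struct test_args, parse_int, and parse_args names are hypothetical, and the range checks (rank and root must be below the process count, port between 1 and 65535) are assumptions about what a sensible test driver would enforce.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical container for: <nprocs> <rank> <root rank> <root host> <root port> */
struct test_args {
    int nprocs;
    int rank;
    int root;
    const char *root_host;
    int root_port;
};

/* Parse a decimal integer in [lo, hi]; reject empty input and trailing junk. */
int parse_int(const char *s, long lo, long hi, int *out)
{
    char *end;
    errno = 0;
    long v = strtol(s, &end, 10);
    if (errno != 0 || end == s || *end != '\0' || v < lo || v > hi) return -1;
    *out = (int)v;
    return 0;
}

/* Returns 0 on success, -1 if the argument count or any value is invalid. */
int parse_args(int argc, char **argv, struct test_args *out)
{
    assert(out != NULL);
    if (argc != 6) return -1;
    if (parse_int(argv[1], 1, 1 << 20, &out->nprocs) != 0) return -1;
    if (parse_int(argv[2], 0, out->nprocs - 1, &out->rank) != 0) return -1;
    if (parse_int(argv[3], 0, out->nprocs - 1, &out->root) != 0) return -1;
    if (strlen(argv[4]) == 0) return -1;
    out->root_host = argv[4];
    if (parse_int(argv[5], 1, 65535, &out->root_port) != 0) return -1;
    return 0;
}
```

With the second command line above, parse_args would fill in nprocs = 2, rank = 1, root = 0, root_host = "host1", and root_port = 12345; a rank equal to or larger than the process count is rejected.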
@csehydrogen
Author

It would be very helpful if someone could provide working examples of collective communications (gather, scatter, allreduce, ...).

shizhibao pushed a commit to shizhibao/xucg that referenced this issue Jan 16, 2021