updated methods #8
base: main
Conversation

```
for ii in in_list])
```

We ran a constrained spherical deconvolution, a free water diffusion tensor, and a sparse facile model through DIPY on our subject. We fit the model 5 times for each set of unique parameters. We ran the test on chunk sizes ranging from 2<sup>1</sup> to 2<sup>15</sup> and with CPU counts of 8, 16, 32, 48, and 72. Below is the argument provided to the docker image.

Suggested change: replace "sparse facile model" with "sparse fascicle model".

```
--models csdm fwdtim --min_chunks 1 --max_chunks 15 --num_runs 5 |
```

Comment: Should also have "sfm" here as an option.
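
For concreteness, here is a sketch of how such a sweep could be scripted. The image name (`afq-benchmark`) and the use of Docker's `--cpus` flag to pin the core count are assumptions for illustration; the model codes, chunk range, and run count come from the invocation above, with `sfm` added per the comment.

```python
import subprocess

# Hypothetical sweep driver; "afq-benchmark" is a placeholder image name and
# pinning the core count via `docker run --cpus` is an assumption.
MODELS = ["csdm", "fwdtim", "sfm"]  # "sfm" added per the review comment
CPU_COUNTS = [8, 16, 32, 48, 72]

for n_cpus in CPU_COUNTS:
    cmd = [
        "docker", "run", "--rm", f"--cpus={n_cpus}",
        "afq-benchmark",              # hypothetical image name
        "--models", *MODELS,
        "--min_chunks", "1",          # chunk sizes from 2**1 ...
        "--max_chunks", "15",         # ... up to 2**15
        "--num_runs", "5",            # 5 fits per parameter set
    ]
    subprocess.run(cmd, check=True)
```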

For parallelization of tractography in pyAFQ the previous approach is not viable, as many of the libraries used for tractography are written in Cython and do not support serialization. To circumvent this issue, ray supports actors, which allow you to utilize stateful workers that run in a separate process. This allows you to run any code in parallel, so long as you do not need to pass any data between workers that cannot be serialized. By leveraging the new TRX file format we were able to implement parallelization fairly easily. We do so by creating multiple tractograms that are computed in parallel, each containing a chunk of the total data. These tractograms are written straight to disk as TRX files, spare a small cache in memory. We then concatenated the resulting TRX files into a single file at the end, which proved to have minimal computational cost.

Suggested change: replace "spare a small cache in memory" with "while only using a small amount of RAM at any given time".
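
As a minimal, self-contained illustration of this actor pattern (not the actual pyAFQ implementation), the sketch below runs one ray actor per chunk of seeds and has each actor write its own output file. The tracking and TRX-writing steps are replaced by placeholder functions (writing `.npy` files) so the example is not tied to any particular DIPY or trx-python API.

```python
import numpy as np
import ray

ray.init()

def track_chunk(seed_chunk):
    # Placeholder for the real (Cython-backed) tracking step; returns fake
    # streamlines so this sketch runs anywhere.
    return [np.random.rand(10, 3) for _ in range(len(seed_chunk))]

def save_chunk(streamlines, out_path):
    # Placeholder for writing one worker's TRX file; here we just dump .npy.
    np.save(out_path, np.concatenate(streamlines))

@ray.remote
class TractographyWorker:
    """Stateful actor: non-serializable tracking state never leaves its process."""

    def __init__(self, worker_id):
        self.worker_id = worker_id

    def track_and_save(self, seed_chunk):
        streamlines = track_chunk(seed_chunk)
        out_path = f"chunk_{self.worker_id}.npy"  # stands in for a .trx file
        save_chunk(streamlines, out_path)
        return out_path  # only a short path string crosses process boundaries

# One actor per chunk of seeds; all chunks are tracked in parallel.
seeds = np.random.rand(1000, 3)
seed_chunks = np.array_split(seeds, 8)
workers = [TractographyWorker.remote(i) for i in range(len(seed_chunks))]
paths = ray.get([w.track_and_save.remote(chunk)
                 for w, chunk in zip(workers, seed_chunks)])
print(paths)
```

The design point is that only small, picklable values cross process boundaries: a seed chunk goes in, a file path comes out, and any non-serializable tracking state stays inside the actor.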

```
del self.objects[id]
```

In testing, we isolate the computation of streamlines by computing the whole pipeline up to streamline generation, and then start the time and compute streamlines. We compute tractography with 1 seed per chunk, and the dmriprep preprocessing pipeline. Due to memory constraints using chunk sizes larger than ~80 streamlines causes the tracking to crash, so we iterated the test across chunk sizes of 1 to 72 (XXX sorta a lie will fix later) chunks. Similar to diffusion modeling we ran the test with CPU counts of, 8, 16, 32, and 72.

Suggested change:

In testing, we isolate the computation of streamlines by computing the whole pipeline up to streamline generation, and then start the timer and compute streamlines. We compute tractography with 1 seed per chunk. Due to memory constraints, using a number of chunks larger than ~120 causes the tracking to crash, so we iterated the test across n_chunks of 1 to n*n_cores, except in the n_cores = 72 case, where 72 chunks was the maximal number. Similar to diffusion modeling, we ran the test with CPU counts of 8, 16, 32, and 72.
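
A rough sketch of the kind of timing harness this describes is below; `compute_streamlines` is a hypothetical stand-in for the pyAFQ streamline-generation call, and the `2 * n_cpus` upper bound is illustrative, since the exact multiple of n_cores is not specified above.

```python
import csv
import time

def compute_streamlines(n_cpus, n_chunks):
    # Hypothetical stand-in for the pyAFQ streamline-generation call; the real
    # benchmark runs tractography here with the given CPU and chunk settings.
    time.sleep(0.01)

CPU_COUNTS = [8, 16, 32, 72]

with open("tracking_times.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["n_cpus", "n_chunks", "seconds"])
    for n_cpus in CPU_COUNTS:
        # Sweep n_chunks from 1 up to a multiple of the core count, capped at
        # 72 chunks in the 72-core case (the multiple here is illustrative).
        max_chunks = 72 if n_cpus == 72 else 2 * n_cpus
        for n_chunks in range(1, max_chunks + 1):
            start = time.perf_counter()  # timing starts after the pre-tracking pipeline
            compute_streamlines(n_cpus, n_chunks)
            writer.writerow([n_cpus, n_chunks, time.perf_counter() - start])
```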

![](figures/pyAFQ_speedup.png)

```python
del self.objects[id]
```

Comment: Repeated from above. I'd delete it here and leave it in the methods.

The [TRX](https://github.com/tee-ar-ex/trx-python) file format also proved very useful. To avoid having multiple workers write to the same object we have each worker write to its own TRX file on disk, and then at the end, all files are concatenated into one. This proved to be a very robust solution that still has fairly low memory usage.

Comment: We can move some of this to methods (this is how the implementation was made easy) and some of it can be moved to discussion (what are the advantages of trx).
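
For reference, the merge step can be sketched as follows, assuming trx-python exposes `load`, `concatenate`, and `save` in `trx.trx_file_memmap`; treat these names and signatures as assumptions about the installed version rather than a confirmed API.

```python
from glob import glob

# Assumed trx-python entry points; verify against the installed version.
from trx.trx_file_memmap import load, save, concatenate

# Each worker wrote its own chunk_*.trx file; merge them into one tractogram.
chunk_files = sorted(glob("chunk_*.trx"))
chunks = [load(path) for path in chunk_files]
merged = concatenate(chunks)
save(merged, "whole_brain.trx")
```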