
updated methods #8

Open · asagilmore wants to merge 1 commit into main

Conversation

asagilmore (Contributor)

No description provided.

```python
for ii in in_list])
```

We ran constrained spherical deconvolution, free water diffusion tensor, and sparse fascicle models through DIPY on our subject. We fit each model 5 times for each set of unique parameters. We ran the test on chunk sizes ranging from 2<sup>1</sup> to 2<sup>15</sup> and with CPU counts of 8, 16, 32, 48, and 72. Below are the arguments provided to the Docker image.

Member

Suggested change
We ran both a constrained spherical deconvolution, free water diffusion tensor, and sparse facile model through DIPY on our subject. We fit the model 5 times for each set of unique parameters. We ran the test on chunk sizes ranging from 2<sup>1<sup> to 2<sup>15<sup> and with CPU counts, 8, 16, 32, 48, and 72. Below is the argument provided to the docker image.
We ran both a constrained spherical deconvolution, free water diffusion tensor, and sparse fascicle model through DIPY on our subject. We fit the model 5 times for each set of unique parameters. We ran the test on chunk sizes ranging from 2<sup>1</sup> to 2<sup>15</sup> and with CPU counts of 8, 16, 32, 48, and 72. Below is the argument provided to the docker image.

```
--models csdm fwdtim --min_chunks 1 --max_chunks 15 --num_runs 5
```

Member

Should also have "sfm" here as an option

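If "sfm" were added as suggested, the invocation might look like the following (assuming the benchmark script accepts `sfm` as the key for the sparse fascicle model; the other flag values are just the ones shown above):

```
--models csdm fwdtim sfm --min_chunks 1 --max_chunks 15 --num_runs 5
```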

For parallelization of tractography in pyAFQ the previous approach is not viable, as many of the libraries used for tractography are written in Cython and do not support serialization. To circumvent this issue, Ray supports actors, which allow you to utilize stateful workers that run in a separate process. This allows you to run any code in parallel, so long as you do not need to pass any data between workers that cannot be serialized. By leveraging the new TRX file format we were able to implement parallelization fairly easily. We do so by creating multiple tractograms that are computed in parallel, each containing a chunk of the total data. These tractograms are written straight to disk as TRX files, spare a small cache in memory. We then concatenated the resulting TRX files into a single file at the end, which proved to have minimal computational cost.

Member

Suggested change
For parallelization of tractography in pyAFQ the previous approach is not viable, as many of the libraries used for tractography are written in Cython and do not support serialization. To circumvent this issue ray support actors, which allow you to utilize stateful workers that run in a separate process. This allows you to run any code in parallel, so long as you do not need to pass any data between workers that cannot be serialzied. By leveraging the new TRX file format we were able to implement parallelization fairly easily. We do so by creating multiple tractograms that are computed in parallel, each containing a chunk of the total data. These tractograms are written straight to disk as Trx files, spare a small cache in memory. We then concatenated the resulting Trx files into a single file at the end, which proved to have minimal computational cost.
For parallelization of tractography in pyAFQ the previous approach is not viable, as many of the libraries used for tractography are written in Cython and do not support serialization. To circumvent this issue, Ray supports actors, which allow you to utilize stateful workers that run in a separate process. This allows you to run any code in parallel, so long as you do not need to pass any data between workers that cannot be serialized. By leveraging the new TRX file format we were able to implement parallelization fairly easily. We do so by creating multiple tractograms that are computed in parallel, each containing a chunk of the total data. These tractograms are written straight to disk as TRX files, while only using a small amount of RAM at any given time. We then concatenated the resulting TRX files into a single file at the end, which proved to have minimal computational cost.
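As a rough illustration of the actor pattern described above (this is not the actual pyAFQ code; `run_tracking` and `write_trx` below are placeholder stubs standing in for the real DIPY tracking and TRX writing calls), each stateful worker keeps its non-serializable state inside its own process, and only a file path ever crosses the process boundary:

```python
import ray

# Placeholder stand-ins for the real, Cython-backed machinery; in pyAFQ these
# would be the DIPY tracker and the TRX writer.
def run_tracking(params, seeds):
    return [[(s, s, s)] for s in seeds]          # dummy "streamlines"

def write_trx(streamlines, path):
    with open(path, "w") as f:                   # stand-in for a real TRX write
        f.write(f"{len(streamlines)} streamlines\n")

ray.init()

@ray.remote
class TractographyActor:
    """Stateful worker: expensive, non-picklable setup happens once, inside
    the actor's own process, so it never has to be serialized."""
    def __init__(self, params):
        self.params = params

    def track_chunk(self, seeds, out_path):
        streamlines = run_tracking(self.params, seeds)
        write_trx(streamlines, out_path)
        return out_path                          # only a path is returned

n_workers = 4
seed_chunks = [list(range(i * 10, (i + 1) * 10)) for i in range(16)]

actors = [TractographyActor.remote({"step_size": 0.5}) for _ in range(n_workers)]
futures = [actors[i % n_workers].track_chunk.remote(chunk, f"chunk_{i}.trx")
           for i, chunk in enumerate(seed_chunks)]
chunk_paths = ray.get(futures)                   # blocks until all chunks are on disk
```

Because each chunk lands in its own file, no worker ever needs to touch another worker's output, which is what keeps the final concatenation step cheap.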

```python
del self.objects[id]
```

In testing, we isolate the computation of streamlines by computing the whole pipeline up to streamline generation, and then starting the timer and computing streamlines. We compute tractography with 1 seed per chunk, using the dmriprep preprocessing pipeline. Due to memory constraints, using chunk sizes larger than ~80 streamlines causes the tracking to crash, so we iterated the test across chunk sizes of 1 to 72 (XXX sorta a lie will fix later) chunks. Similar to diffusion modeling, we ran the test with CPU counts of 8, 16, 32, and 72.

Member

Suggested change
In testing, we isolate the computation of streamlines by computing the whole pipeline up to streamline generation, and then start the time and compute streamlines. We compute tractography with 1 seed per chunk, and the dmriprep preprocessing pipeline. Due to memory constraints using chunk sizes larger than ~80 streamlines causes the tracking to crash, so we iterated the test across chunk sizes of 1 to 72 (XXX sorta a lie will fix later) chunks. Similar to diffusion modeling we ran the test with CPU counts of, 8, 16, 32, and 72.
In testing, we isolate the computation of streamlines by computing the whole pipeline up to streamline generation, and then starting the timer and computing streamlines. We compute tractography with 1 seed per chunk. Due to memory constraints, using a number of chunks larger than ~120 causes the tracking to crash, so we iterated the test across n_chunks of 1 to n*n_cores, except in the n_cores = 72 case, where 72 chunks was the maximal number. Similar to diffusion modeling, we ran the test with CPU counts of 8, 16, 32, and 72.
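A hypothetical sketch of that timing protocol (both functions below are placeholder stubs for the real pipeline stages, not pyAFQ code):

```python
import time

def prepare_pipeline():
    """Placeholder: model fitting, masks, seeding -- everything left untimed."""
    time.sleep(0.1)

def generate_streamlines():
    """Placeholder: the (parallel) tracking step, the only part we time."""
    time.sleep(0.1)

prepare_pipeline()                       # run everything up to streamline generation
start = time.perf_counter()
generate_streamlines()                   # start the clock only for tracking
elapsed = time.perf_counter() - start
print(f"streamline generation took {elapsed:.2f} s")
```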


![pyAFQ speedup](figures/pyAFQ_speedup.png)

```python
del self.objects[id]
```

Member

Repeated from above. I'd delete it here and leave it in the methods.

The [TRX](https://github.com/tee-ar-ex/trx-python) file format also proved very useful. To avoid having multiple workers write to the same object, we have each worker write to its own TRX file on disk, and then, at the end, all files are concatenated into one. This proved to be a very robust solution with fairly low memory usage.

Member

We can move some of this to methods (this is how the implementation was made easy) and some of it can be moved to discussion (what are the advantages of trx).
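To make the per-worker-file strategy concrete, here is a minimal sketch of the final merge step. It assumes the trx-python API exposes `load`, `concatenate`, and `save` in `trx.trx_file_memmap` (check the names against the installed release), and that the worker outputs are named `chunk_*.trx`:

```python
from glob import glob
import trx.trx_file_memmap as tmm

# One TRX file per worker, written independently, so no two processes
# ever touch the same object.
chunk_files = sorted(glob("chunk_*.trx"))
chunks = [tmm.load(f) for f in chunk_files]

# Merge the per-worker tractograms into a single memory-mapped TRX file.
merged = tmm.concatenate(chunks)
tmm.save(merged, "whole_brain.trx")

for c in chunks:
    c.close()   # release each chunk's memory-mapped buffers
```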
