Clean up unnecessary copies #8673

stscijgbot-jp · 2024-07-24T20:28:07Z

Issue JP-3695 was created on JIRA by Melanie Clarke:

In working on JP-3610, we noted that there are sometimes unnecessary copies in pipeline steps, e.g. a copy of the input data is made at the top of the step, then another copy is made in the core algorithm when processing begins.

We should review all pipeline steps to make sure that only necessary copies are made, for performance optimization.
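To make the pattern concrete, here is a minimal sketch of the double copy described above and one way to avoid it. All names are hypothetical stand-ins (`CountingModel`, `step_with_redundant_copy`, and so on), not the actual jwst step or datamodel API; the class only counts deep copies so the difference is visible.

```python
import copy


class CountingModel:
    """Hypothetical stand-in for a data model; counts deep copies made."""

    copies_made = 0

    def __init__(self, data):
        self.data = list(data)

    def __deepcopy__(self, memo):
        CountingModel.copies_made += 1
        return CountingModel(self.data)


def core_algorithm_with_copy(model):
    # Copies its input again before modifying it.
    result = copy.deepcopy(model)
    result.data = [value + 1 for value in result.data]
    return result


def core_algorithm_in_place(model):
    # Assumes the caller already owns a private copy, so it modifies in place.
    model.data = [value + 1 for value in model.data]
    return model


def step_with_redundant_copy(input_model):
    working = copy.deepcopy(input_model)      # copy 1, at the top of the step
    return core_algorithm_with_copy(working)  # copy 2, inside the algorithm


def step_with_single_copy(input_model):
    working = copy.deepcopy(input_model)      # the only copy
    return core_algorithm_in_place(working)
```

Both variants leave the caller's input untouched; the second simply makes one copy instead of two. Which layer should own the copy is a design choice that would need to be decided per step.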

stscijgbot-jp · 2024-07-25T17:03:07Z

Comment by Maria Pena-Guerrero on JIRA:

Working on cleanup for steps in Detector1 in #8676

stscijgbot-jp · 2024-07-25T17:13:07Z

Comment by Maria Pena-Guerrero on JIRA:

The steps in Image3 will not be changed as part of this ticket.

stscijgbot-jp · 2024-08-27T14:18:08Z

Comment by Maria Pena-Guerrero on JIRA:

I did a couple of tests with MIRI image files, since these are the most affected by the memory increase. One file is part of our regression tests, jw00001001001_01101_00001_mirimage_uncal.fits, and is about 1 GB in size. On both master and the branch, this file took about 3 min to finish and the maximum memory used was about 11 GB. However, the branch run is slightly faster both to finish and to reach the maximum memory usage. Plots:

branch
!memory_det1_branch_1GB_3min_max11GB.png!

master

!memory_det1_master_1GB_3min_max11GB.png!

The other file was jw01283001001_03101_00001_mirimage_uncal.fits, which has a size of 2.64 GB. On master, this file took 60 min to run with a maximum memory usage of about 25 GB. On the branch, the file took 25 min to run with a maximum memory usage of about 22 GB. Plots:

branch

!memory_det1_new_branch_2GB.png!

master

!memory_det1_new_master_2GB.png!

stscijgbot-jp · 2024-08-27T14:33:08Z

Comment by Maria Pena-Guerrero on JIRA:

For completeness, here is the pip freeze of my testing environment:

[^pip_freeze.txt]

stscijgbot-jp · 2024-09-03T17:43:07Z

Comment by Maria Pena-Guerrero on JIRA:

Another test, with the file I am using to write a memory regression test: jw01024001001_04101_00001_mirimage_uncal.fits

Again, the branch is slightly faster.

branch !memory_det1_small_branch.png|thumbnail!

master !memory_det1_small_master.png|thumbnail!

stscijgbot-jp · 2024-09-18T13:58:04Z

Comment by Maria Pena-Guerrero on JIRA:

I think the combined effect of all these PRs dramatically reduced the memory usage.

In the ASDF repo:

In the stpipe repo:

conditionally format log_records stpipe#171

In the stcal repo:

The PR in the jwst repo that removes the unnecessary copies still further improves the memory usage and the running time.

https://github.com/spacetelescope/jwst/pull/8676

stscijgbot-jp · 2024-09-18T14:18:04Z

Comment by David Law on JIRA:

Is there any way to ballpark what the combined savings from all of these PRs are for some typical data sets? (It might be easiest after the build is tagged and all related things are merged.) Similar to https://jira.stsci.edu/browse/JP-3716, it would be useful to be able to advertise the impact of all of this work.

stscijgbot-jp · 2024-09-18T15:28:18Z

Comment by Tyler Pauly on JIRA:

I'm trying and failing to replicate the figures shown for branch/master and the 2GB input file, from program jw01283. I used current jwst/master (which pulls the recent stdatamodels and stcal releases) and file "jw01283001001_03101_00001_mirimage_uncal.fits", and profiled with memray. I see this:

!plot_2gb_master0918.png|thumbnail!

Can you confirm those figures are correct?

stscijgbot-jp · 2024-09-18T15:38:04Z

Comment by Maria Pena-Guerrero on JIRA:

Tyler Pauly, can you please give me a pip freeze to compare with what I have?

stscijgbot-jp · 2024-09-18T15:43:06Z

Comment by Tyler Pauly on JIRA:

I've attached the file here: [^jwst_0918_env.txt]

stscijgbot-jp · 2024-09-18T17:23:05Z

Comment by Maria Pena-Guerrero on JIRA:

I have just run my test again on master for that file. Here are my results with a) memray and b) mprof:

a) memray

[^memray-flamegraph-pipe_profile.py.86815.html]

b) mprof

!memory_det1.png|thumbnail!

stscijgbot-jp · 2024-09-18T17:33:06Z

Comment by Maria Pena-Guerrero on JIRA:

It seems that memray keeps track of both the virtual and the real memory, whereas mprof only follows the real memory usage. In my case the virtual memory rose to the values we were seeing before, about 66 GB, but the real memory only reached about 23 GB.
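As a rough cross-check of the virtual versus real distinction, independent of either profiler, the two numbers can be read directly from the operating system. The sketch below is Linux-only and assumes the standard `/proc/self/status` fields (`VmSize` and `VmPeak` for virtual memory, `VmRSS` and `VmHWM` for resident memory, all in kB); the helper names are made up for illustration, and this is not how memray or mprof collect their data.

```python
from pathlib import Path


def parse_vm_fields(status_text):
    """Return the Vm* sizes (in kB) found in /proc/<pid>/status text."""
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        if name in ("VmPeak", "VmSize", "VmHWM", "VmRSS"):
            # Lines look like "VmRSS:     123456 kB"; keep the number only.
            fields[name] = int(rest.split()[0])
    return fields


def current_memory_kb():
    """Linux only: virtual (VmSize) and resident (VmRSS) size of this process."""
    status = Path("/proc/self/status")
    if not status.exists():
        return None  # not Linux, or /proc is unavailable
    return parse_vm_fields(status.read_text())
```

Calling `current_memory_kb()` before and after a pipeline run, or printing `VmPeak` and `VmHWM` at the end, would show whether a large peak is mostly virtual address space or memory that was actually resident.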

stscijgbot-jp · 2024-09-20T17:53:09Z

Comment by Tyler Pauly on JIRA:

Fixed by #8676

stscijgbot-jp added the team-chartreuse label Jul 24, 2024

stscijgbot-jp closed this as completed Sep 20, 2024

stscijgbot-jp added the performance-improvements label Oct 17, 2024
