Memory leak due to LRU cache in method of EphysNWBData #495

gouwens · 2021-01-22T20:38:14Z

Describe the bug
The LRU cache on the _get_series() method of EphysNWBData causes a memory leak: self is part of the cache key, so the cache holds a strong reference to the instance and the object is never garbage-collected. This is an issue at least for the MIESNWBData subclass because it has an instance variable notebook (usually a LabNotebookReaderIgorNwb), which holds a few large numpy arrays that eventually use a great deal of memory.

The place in the code where that happens is here:

ipfx/ipfx/dataset/ephys_nwb_data.py

Line 111 in 75a3ea7

@lru_cache(maxsize=None)

See https://stackoverflow.com/questions/33672412/python-functools-lru-cache-with-class-methods-release-object for information about the issues with using @lru_cache inside classes. There are a couple of strategies for handling this issue discussed in that post - I haven't thought about which is the best option in this case, though.
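To make the mechanism concrete, here is a minimal sketch that uses only the standard library. The Reader class, its notebook attribute, and the helper function are hypothetical stand-ins rather than ipfx code; the point is only that an instance stays alive once a functools.lru_cache-decorated method has been called on it.

```python
import gc
import weakref
from functools import lru_cache


class Reader:
    """Hypothetical stand-in for a data object with a large attribute."""

    def __init__(self):
        # Stand-in for the large arrays held by the real notebook object
        self.notebook = bytearray(10_000_000)

    @lru_cache(maxsize=None)
    def get_series(self, sweep_number):
        return sweep_number


def instance_survives_deletion(call_cached_method):
    """Return True if the instance is still alive after `del` and a GC pass."""
    reader = Reader()
    if call_cached_method:
        # The cache key is (self, sweep_number), so the class-level cache
        # now holds a strong reference to this instance.
        reader.get_series(0)
    ref = weakref.ref(reader)
    del reader
    gc.collect()
    return ref() is not None
```

Under these assumptions, instance_survives_deletion(True) reports that the object is still alive, while instance_survives_deletion(False) reports that it was released.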

At the moment, I think I am working around it by manually flushing the cache when I'm done with the data set object, e.g.

    if hasattr(data_set._data, "_get_series"):
        print("clearing LRU cache?")
        data_set._data._get_series.cache_clear()

But I think that's probably too much to expect a typical user to know about and implement.
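One detail of this workaround that may matter in practice: with functools.lru_cache the cache belongs to the decorated function on the class, not to each instance, so cache_clear() called through one object drops the entries of every object. The sketch below illustrates that with a hypothetical Reader class (not ipfx code).

```python
import gc
import weakref
from functools import lru_cache


class Reader:
    """Hypothetical class with a cached method, standing in for the real one."""

    @lru_cache(maxsize=None)
    def get_series(self, sweep_number):
        return sweep_number


def clear_via_one_instance():
    a, b = Reader(), Reader()
    a.get_series(0)
    b.get_series(0)
    # One entry per (instance, argument) pair, all in the same shared cache
    entries_before = Reader.get_series.cache_info().currsize

    ref_b = weakref.ref(b)
    del b                        # b is still referenced by the cache here
    a.get_series.cache_clear()   # clears the shared cache, including b's entry
    gc.collect()

    entries_after = Reader.get_series.cache_info().currsize
    return entries_before, entries_after, ref_b() is None
```

So the manual flush does release the objects, but it also throws away cached results for any other data set that is still in use.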


kasbaker · 2021-02-02T06:25:26Z

Thanks for reporting this @gouwens. The @lru_cache decorator speeds things up a lot, but if it is causing a memory leak then that is definitely an issue we should fix. Do you have some code that I can use to reproduce this bug? I tried modifying some of your code from #494 to see if I could find a difference in memory usage before and after commenting out the @lru_cache decorator:

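from ipfx.dataset.create import create_ephys_data_set
from ipfx.stimulus import StimulusOntology
from pympler import muppy, summary

# nwb_file is the path to an NWB file, defined as in the #494 example
data_set = create_ephys_data_set(
    nwb_file=nwb_file, ontology=StimulusOntology.DEFAULT_STIMULUS_ONTOLOGY_FILE
)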

for _ in range(5):
    for num in data_set._data.sweep_numbers:
        my_sweep = data_set.sweep(num)
    all_objects = muppy.get_objects()
    sum1 = summary.summarize(all_objects)
    # Prints out a summary of the large objects
    summary.print_(sum1)

Here is the output without the @lru_cache decorator:

                       types |   # objects |   total size
============================ | =========== | ============
                 _io.BytesIO |           1 |     18.11 MB
                         str |       78237 |     15.61 MB
                        dict |       28907 |     10.16 MB
               numpy.ndarray |          71 |      3.62 MB
                        code |       23288 |      3.22 MB
                        type |        3254 |      2.90 MB
                         set |        7289 |      1.94 MB
                       tuple |       26952 |      1.76 MB
                        list |       10501 |      1.16 MB
                     weakref |        7278 |    568.59 KB
                        cell |       10369 |    486.05 KB
                         int |        9414 |    275.89 KB
          wrapper_descriptor |        3191 |    249.30 KB
           getset_descriptor |        3384 |    237.94 KB
  builtin_function_or_method |        3082 |    216.70 KB
               types |   # objects |   total size
==================== | =========== | ============
         _io.BytesIO |           1 |     18.11 MB
                 str |       91556 |     16.54 MB
                dict |       28923 |     10.16 MB
                list |       23822 |      4.65 MB
       numpy.ndarray |          74 |      3.72 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       26982 |      1.76 MB
             weakref |        7293 |    569.77 KB
                cell |       10378 |    486.47 KB
                 int |       12630 |    363.88 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB
               types |   # objects |   total size
==================== | =========== | ============
         _io.BytesIO |           1 |     18.11 MB
                 str |      104864 |     17.47 MB
                dict |       28925 |     10.16 MB
                list |       37132 |      8.43 MB
       numpy.ndarray |          77 |      3.82 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       26982 |      1.76 MB
             weakref |        7293 |    569.77 KB
                cell |       10378 |    486.47 KB
                 int |       15824 |    451.22 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB
               types |   # objects |   total size
==================== | =========== | ============
                 str |      118172 |     18.41 MB
         _io.BytesIO |           1 |     18.11 MB
                list |       50442 |     12.52 MB
                dict |       28927 |     10.16 MB
       numpy.ndarray |          80 |      3.91 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       26982 |      1.76 MB
             weakref |        7293 |    569.77 KB
                 int |       19016 |    538.50 KB
                cell |       10378 |    486.47 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB
               types |   # objects |   total size
==================== | =========== | ============
                 str |      131480 |     19.34 MB
         _io.BytesIO |           1 |     18.11 MB
                list |       63752 |     16.62 MB
                dict |       28929 |     10.17 MB
       numpy.ndarray |          83 |      4.01 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       26982 |      1.76 MB
                 int |       22208 |    625.78 KB
             weakref |        7293 |    569.77 KB
                cell |       10378 |    486.47 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB

Here is the output with the @lru_cache decorator:

                       types |   # objects |   total size
============================ | =========== | ============
                 _io.BytesIO |           1 |     18.11 MB
                         str |       78237 |     15.61 MB
                        dict |       28910 |     10.16 MB
               numpy.ndarray |          71 |      3.62 MB
                        code |       23288 |      3.22 MB
                        type |        3254 |      2.90 MB
                         set |        7289 |      1.94 MB
                       tuple |       27056 |      1.77 MB
                        list |       10501 |      1.16 MB
                     weakref |        7278 |    568.59 KB
                        cell |       10369 |    486.05 KB
                         int |        9414 |    275.88 KB
          wrapper_descriptor |        3191 |    249.30 KB
           getset_descriptor |        3384 |    237.94 KB
  builtin_function_or_method |        3082 |    216.70 KB
               types |   # objects |   total size
==================== | =========== | ============
         _io.BytesIO |           1 |     18.11 MB
                 str |       91556 |     16.54 MB
                dict |       28926 |     10.17 MB
                list |       23822 |      4.65 MB
       numpy.ndarray |          74 |      3.72 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       27086 |      1.77 MB
             weakref |        7293 |    569.77 KB
                cell |       10378 |    486.47 KB
                 int |       12630 |    363.88 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB
               types |   # objects |   total size
==================== | =========== | ============
         _io.BytesIO |           1 |     18.11 MB
                 str |      104864 |     17.48 MB
                dict |       28928 |     10.17 MB
                list |       37132 |      8.43 MB
       numpy.ndarray |          77 |      3.82 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       27086 |      1.77 MB
             weakref |        7293 |    569.77 KB
                cell |       10378 |    486.47 KB
                 int |       15824 |    451.21 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB
               types |   # objects |   total size
==================== | =========== | ============
                 str |      118172 |     18.41 MB
         _io.BytesIO |           1 |     18.11 MB
                list |       50442 |     12.52 MB
                dict |       28930 |     10.17 MB
       numpy.ndarray |          80 |      3.91 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       27086 |      1.77 MB
             weakref |        7293 |    569.77 KB
                 int |       19016 |    538.50 KB
                cell |       10378 |    486.47 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB
               types |   # objects |   total size
==================== | =========== | ============
                 str |      131480 |     19.34 MB
         _io.BytesIO |           1 |     18.11 MB
                list |       63752 |     16.62 MB
                dict |       28932 |     10.17 MB
       numpy.ndarray |          83 |      4.01 MB
                code |       23288 |      3.22 MB
                type |        3267 |      2.90 MB
                 set |        7289 |      1.94 MB
               tuple |       27086 |      1.77 MB
                 int |       22208 |    625.78 KB
             weakref |        7293 |    569.77 KB
                cell |       10378 |    486.47 KB
  wrapper_descriptor |        3218 |    251.41 KB
   getset_descriptor |        3384 |    237.94 KB
   method_descriptor |        3089 |    217.20 KB

I don't see much of a difference between the two, but they both seem to have a memory leak. Do you think this is related to #494?

gouwens · 2021-02-02T18:18:25Z

I think there could be a couple of things going on there. It could be that the other memory leak in #494 is still causing issues even with lru_cache commented out. The increase in memory usage in the commented-out case could also be an artifact of the way the muppy code is implemented.
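If the measurement tool itself is in doubt, one possible cross-check is the standard library's tracemalloc, which reports traced allocations without building a list of every live object. This is only a sketch: retained_bytes, leaky, and clean are hypothetical names, the workloads are toy stand-ins for loading a data set, and whether a given library's buffers are visible to tracemalloc depends on how it allocates them.

```python
import gc
import tracemalloc


def retained_bytes(workload):
    """Return how many traced bytes remain allocated after running workload."""
    gc.collect()
    tracemalloc.start()
    try:
        before, _peak = tracemalloc.get_traced_memory()
        workload()
        gc.collect()
        after, _peak = tracemalloc.get_traced_memory()
    finally:
        tracemalloc.stop()
    return after - before


_retained = []  # simulates a cache that never lets go of its entries


def leaky():
    # The buffer stays reachable through the module-level list
    _retained.append(bytearray(5_000_000))


def clean():
    # The buffer becomes unreachable as soon as the function returns
    data = bytearray(5_000_000)
    return len(data)
```

Comparing retained_bytes(leaky) with retained_bytes(clean) should show roughly the size of the retained buffer in the first case and close to zero in the second.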

Here's code where I see a clear difference in memory usage with and without commenting out lru_cache (here I'm loading multiple NWB2 files, but I also see the same thing loading the same file multiple times).

# Setup
from ipfx.stimulus import StimulusOntology
import allensdk.core.json_utilities as ju
from ipfx.dataset.mies_nwb_data import MIESNWBData
from ipfx.dataset.labnotebook import LabNotebookReaderIgorNwb
from pympler import muppy, summary

ontology = StimulusOntology(ju.read(StimulusOntology.DEFAULT_STIMULUS_ONTOLOGY_FILE))

# example nwb2 files
nwb_file_list = [
    '/allen/programs/celltypes/production/mousecelltypes/prod176/Ephys_Roi_Result_628543361/nwb2_Scnn1a-Tg2-Cre;Ai14-346639.04.02.01.nwb',
    '/allen/programs/celltypes/production/mousecelltypes/prod2457/Ephys_Roi_Result_998064513/nwb2_Vip-IRES-Cre;Ai14-504181.07.02.01.nwb',
    '/allen/programs/celltypes/production/mousecelltypes/prod2480/Ephys_Roi_Result_1000110850/nwb2_Esr2-IRES2-Cre;Ai14-506384.03.02.01.nwb',
    '/allen/programs/celltypes/production/mousecelltypes/prod2481/Ephys_Roi_Result_1000125224/nwb2_Esr2-IRES2-Cre;Ai14-506384.03.02.02.nwb',
]


# function to load & return a data set object
def load_data_set(nwb_path, ontology, load_into_memory):
    labnotebook = LabNotebookReaderIgorNwb(nwb_path)
    data_set = MIESNWBData(
        nwb_file=nwb_path,
        notebook=labnotebook,
        ontology=ontology,
        load_into_memory=load_into_memory
    )
    return data_set

# Keep memory examination code isolated in its own function
def summarize_memory():
    all_objects = muppy.get_objects()
    sum1 = summary.summarize(all_objects)
    summary.print_(sum1)


for nwb_file in nwb_file_list:
    ds = load_data_set(nwb_file, ontology, load_into_memory=False) # working around the #494 leak
    for num in ds.sweep_numbers:
        my_sweep_data = ds.get_sweep_data(num)
    summarize_memory()

With this code, I see this when lru_cache is intact (note how the numpy.ndarray object count and total size grow with each file):

                       types |   # objects |   total size
============================ | =========== | ============
                         str |       66735 |     10.84 MB
                        dict |       24054 |      8.56 MB
               numpy.ndarray |          46 |      3.52 MB
                        code |       17994 |      2.48 MB
                        type |        2630 |      2.24 MB
                       tuple |       20702 |      1.52 MB
                         set |        4676 |      1.40 MB
                        list |        8321 |      1.01 MB
                        cell |        9795 |    535.66 KB
                     weakref |        5694 |    489.33 KB
  builtin_function_or_method |        3840 |    300.00 KB
          wrapper_descriptor |        2984 |    256.44 KB
           getset_descriptor |        2796 |    218.44 KB
           method_descriptor |        2722 |    212.66 KB
                         int |        6345 |    196.46 KB
                               types |   # objects |   total size
==================================== | =========== | ============
                       numpy.ndarray |          50 |     35.56 MB
                                 str |       70292 |     11.26 MB
                                dict |       30871 |     10.39 MB
                                code |       17990 |      2.48 MB
                                type |        2630 |      2.24 MB
                                 set |        6756 |      1.88 MB
                               tuple |       21774 |      1.59 MB
                                list |        9233 |      1.09 MB
                             weakref |        7894 |    678.39 KB
                                cell |        9645 |    527.46 KB
  hdmf.build.builders.DatasetBuilder |        1442 |    371.77 KB
          builtin_function_or_method |        4104 |    320.62 KB
                                 int |        9561 |    300.32 KB
           pynwb.spec.NWBDatasetSpec |         919 |    284.18 KB
                  wrapper_descriptor |        2985 |    256.52 KB
                               types |   # objects |   total size
==================================== | =========== | ============
                       numpy.ndarray |          54 |     51.17 MB
                                dict |       35799 |     11.82 MB
                                 str |       72453 |     11.53 MB
                                code |       17990 |      2.48 MB
                                 set |        8836 |      2.36 MB
                                type |        2630 |      2.24 MB
                               tuple |       22515 |      1.64 MB
                                list |       10065 |      1.18 MB
                             weakref |        9247 |    794.66 KB
                                cell |        9646 |    527.52 KB
  hdmf.build.builders.DatasetBuilder |        1996 |    514.59 KB
           pynwb.spec.NWBDatasetSpec |        1234 |    381.14 KB
                                 int |       11503 |    363.55 KB
          builtin_function_or_method |        4271 |    333.67 KB
        hdmf.spec.spec.AttributeSpec |        1158 |    298.55 KB
                               types |   # objects |   total size
==================================== | =========== | ============
                       numpy.ndarray |          58 |     88.59 MB
                                dict |       42179 |     13.45 MB
                                 str |       75736 |     11.94 MB
                                 set |       10916 |      2.84 MB
                                code |       17990 |      2.48 MB
                                type |        2630 |      2.24 MB
                               tuple |       23619 |      1.71 MB
                                list |       10963 |      1.26 MB
                             weakref |       11227 |    964.82 KB
  hdmf.build.builders.DatasetBuilder |        2847 |    733.99 KB
                                cell |        9647 |    527.57 KB
           pynwb.spec.NWBDatasetSpec |        1549 |    478.10 KB
                                 int |       14435 |    458.71 KB
        hdmf.spec.spec.AttributeSpec |        1446 |    372.80 KB
          builtin_function_or_method |        4504 |    351.88 KB

And this is what I see when lru_cache is commented out (the memory usage changes because the files are different, but the number of ndarrays doesn't keep going up):

                       types |   # objects |   total size
============================ | =========== | ============
                         str |       66733 |     10.84 MB
                        dict |       24051 |      8.55 MB
               numpy.ndarray |          46 |      3.52 MB
                        code |       17994 |      2.48 MB
                        type |        2630 |      2.24 MB
                       tuple |       20598 |      1.51 MB
                         set |        4676 |      1.40 MB
                        list |        8321 |      1.01 MB
                        cell |        9795 |    535.66 KB
                     weakref |        5694 |    489.33 KB
  builtin_function_or_method |        3840 |    300.00 KB
          wrapper_descriptor |        2984 |    256.44 KB
           getset_descriptor |        2796 |    218.44 KB
           method_descriptor |        2722 |    212.66 KB
                         int |        6345 |    196.46 KB
                               types |   # objects |   total size
==================================== | =========== | ============
                       numpy.ndarray |          46 |     32.04 MB
                                 str |       68249 |     11.02 MB
                                dict |       26114 |      9.20 MB
                                code |       17990 |      2.48 MB
                                type |        2630 |      2.24 MB
                               tuple |       20883 |      1.53 MB
                                 set |        4676 |      1.40 MB
                                list |        8407 |      1.01 MB
                             weakref |        6631 |    569.85 KB
                                cell |        9644 |    527.41 KB
          builtin_function_or_method |        3949 |    308.52 KB
                  wrapper_descriptor |        2985 |    256.52 KB
                                 int |        7739 |    241.37 KB
  hdmf.build.builders.DatasetBuilder |         924 |    238.22 KB
                   getset_descriptor |        2796 |    218.44 KB
                       types |   # objects |   total size
============================ | =========== | ============
               numpy.ndarray |          46 |     15.62 MB
                         str |       66843 |     10.86 MB
                        dict |       24218 |      8.85 MB
                        code |       17990 |      2.48 MB
                        type |        2630 |      2.24 MB
                       tuple |       20513 |      1.51 MB
                         set |        4676 |      1.40 MB
                        list |        8325 |      1.01 MB
                        cell |        9644 |    527.41 KB
                     weakref |        5786 |    497.23 KB
  builtin_function_or_method |        3852 |    300.94 KB
          wrapper_descriptor |        2985 |    256.52 KB
           getset_descriptor |        2796 |    218.44 KB
           method_descriptor |        2734 |    213.59 KB
                         int |        6467 |    200.81 KB
                               types |   # objects |   total size
==================================== | =========== | ============
                       numpy.ndarray |          46 |     37.42 MB
                                 str |       67965 |     11.01 MB
                                dict |       25670 |      9.13 MB
                                code |       17990 |      2.48 MB
                                type |        2630 |      2.24 MB
                               tuple |       20810 |      1.52 MB
                                 set |        4676 |      1.40 MB
                                list |        8391 |      1.01 MB
                             weakref |        6413 |    551.12 KB
                                cell |        9644 |    527.41 KB
          builtin_function_or_method |        3918 |    306.09 KB
                  wrapper_descriptor |        2985 |    256.52 KB
                                 int |        7457 |    232.74 KB
  hdmf.build.builders.DatasetBuilder |         851 |    219.40 KB
                   getset_descriptor |        2796 |    218.44 KB

kasbaker · 2021-02-03T02:14:32Z

Thanks for the code @gouwens. I changed the import on line 3 so that lru_cache comes from methodtools instead of functools, and that fixed the leak:

ipfx/ipfx/dataset/ephys_nwb_data.py

Line 3 in 75a3ea7

from functools import lru_cache
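The idea behind the swap is that the cache lives with each instance instead of in one class-level cache that keeps every instance alive. methodtools is a third-party package, so the sketch below uses only the standard library to illustrate that idea; per_instance_lru_cache and Data are hypothetical names, and this is not how methodtools is actually implemented.

```python
import functools
import gc
import weakref


class per_instance_lru_cache:
    """Hypothetical descriptor that gives each instance its own lru_cache."""

    def __init__(self, func):
        self.func = func
        self.name = func.__name__

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, instance, owner=None):
        if instance is None:
            return self
        # Build a cache bound to this one instance on first access
        cached = functools.lru_cache(maxsize=None)(
            functools.partial(self.func, instance)
        )
        # Store it on the instance; later lookups find it there directly
        instance.__dict__[self.name] = cached
        return cached


class Data:
    def __init__(self):
        self.calls = 0

    @per_instance_lru_cache
    def get_series(self, sweep_number):
        self.calls += 1
        return sweep_number * 2


def demo():
    d = Data()
    d.get_series(1)
    d.get_series(1)          # second call is served from the cache
    calls = d.calls
    ref = weakref.ref(d)
    del d
    gc.collect()             # instance and its cache form a collectable cycle
    return calls, ref() is None
```

In this sketch the method body runs once for repeated arguments, and the instance is released once the last outside reference is gone, which matches the behaviour reported in the verification run.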

Verification:

from warnings import filterwarnings
from time import time
# Setup
from ipfx.stimulus import StimulusOntology
import allensdk.core.json_utilities as ju
from ipfx.dataset.mies_nwb_data import MIESNWBData
from ipfx.dataset.labnotebook import LabNotebookReaderIgorNwb
from pympler import muppy, summary

filterwarnings("ignore", category=UserWarning)

ontology = StimulusOntology(ju.read(StimulusOntology.DEFAULT_STIMULUS_ONTOLOGY_FILE))

# example nwb2 files
nwb_file_list = [
    '/allen/programs/celltypes/production/mousecelltypes/prod176/Ephys_Roi_Result_628543361/nwb2_Scnn1a-Tg2-Cre;Ai14-346639.04.02.01.nwb',
    '/allen/programs/celltypes/production/mousecelltypes/prod2457/Ephys_Roi_Result_998064513/nwb2_Vip-IRES-Cre;Ai14-504181.07.02.01.nwb',
    '/allen/programs/celltypes/production/mousecelltypes/prod2480/Ephys_Roi_Result_1000110850/nwb2_Esr2-IRES2-Cre;Ai14-506384.03.02.01.nwb',
    '/allen/programs/celltypes/production/mousecelltypes/prod2481/Ephys_Roi_Result_1000125224/nwb2_Esr2-IRES2-Cre;Ai14-506384.03.02.02.nwb',
]

# function to load & return a data set object
def load_data_set(nwb_path, ontology, load_into_memory):
    labnotebook = LabNotebookReaderIgorNwb(nwb_path)
    data_set = MIESNWBData(
        nwb_file=nwb_path,
        notebook=labnotebook,
        ontology=ontology,
        load_into_memory=load_into_memory
    )
    return data_set

# Keep memory examination code isolated in its own function
def summarize_memory(data_type: str = ""):
    mem_summary = summary.summarize(muppy.get_objects())
    output = [elem for elem in mem_summary if elem[0] == data_type]
    if output:
        summary.print_(output)
    else:
        summary.print_(mem_summary)

start_time = time()

for nwb_file in nwb_file_list:
    # working around the #494 leak
    ds = load_data_set(nwb_file, ontology, load_into_memory=False)
    for _ in range(2): # repeat this twice to make sure that caching still works
        for num in ds.sweep_numbers:
            my_sweep_data = ds.get_sweep_data(num)
    summarize_memory("numpy.ndarray") # numpy arrays are the biggest objects

print(f"\nTime elapsed: {time()-start_time} s")

And console output:

          types |   # objects |   total size
=============== | =========== | ============
  numpy.ndarray |          46 |      3.52 MB
          types |   # objects |   total size
=============== | =========== | ============
  numpy.ndarray |          46 |     32.04 MB
          types |   # objects |   total size
=============== | =========== | ============
  numpy.ndarray |          46 |     15.62 MB
          types |   # objects |   total size
=============== | =========== | ============
  numpy.ndarray |          46 |     37.42 MB

Time elapsed: 20.316800832748413 s

Good catch! I'll put in a PR to patch this bug soon.

kasbaker · 2021-02-03T18:57:43Z

The PR is up here: #497! @sgratiy, could you please review it?

gouwens added the bug label Jan 22, 2021

kasbaker mentioned this issue Feb 3, 2021

495/mem leak lru cache #497

Merged

12 tasks

wbwakeman added this to the Marmot 2021-02-23 milestone Feb 23, 2021

wbwakeman closed this as completed Feb 23, 2021
