Memory Issues on Materials metric #177

Open
abachma2 opened this issue Jan 21, 2021 · 0 comments
I am using cymetric to analyze the SQLite output of a fairly large Cyclus simulation (the database is about 400 MB). When I use some of the metrics (such as 'Materials' and any that rely on it, like 'TransactionQuantity'), I encounter a MemoryError: it can't allocate the memory for an array of the specified size. The amounts it reports being unable to allocate range from 571 MiB to 3.91 GiB. I changed my settings to allow memory overcommit, but that just leads to the kernel dying rather than a MemoryError being raised.
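As a quick sanity check on the largest figure, the 3.91 GiB matches the array reported in the final line of the traceback further down: shape (74887355, 7) with data type int64. The arithmetic below is only a sketch of that check; the variable names are illustrative.

```python
# Shape and dtype are copied from the MemoryError message in the traceback.
rows, cols = 74887355, 7
bytes_per_int64 = 8  # an int64 value occupies 8 bytes

total_bytes = rows * cols * bytes_per_int64
total_gib = total_bytes / 2**30  # 1 GiB = 2**30 bytes

print(f"{total_bytes} bytes = {total_gib:.2f} GiB")
```

So the failed allocation is a single contiguous block holding one int64 code per row for each of the seven index levels.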

I am running 64-bit python3 on a 64-bit Ubuntu 18.04 system with 32 GB of memory.

The error seems to stem from the pd.merge or set_index operation in the 'Materials' metric:

---------------------------------------------------------------------------
MemoryError                               Traceback (most recent call last)
<ipython-input-4-91d85b0e404a> in <module>()
----> 1 evaler.eval('Materials')

/home/amandabachmann/.local/lib/python3.6/site-packages/cymetric/evaluator.py in eval(self, metric, conds)
     58             frame = self.eval(dep, conds=conds)
     59             frames.append(frame)
---> 60         raw = m(frames=frames, conds=conds, known_tables=self.known_tables)
     61         if raw is None:
     62             return raw

/home/amandabachmann/.local/lib/python3.6/site-packages/cymetric/metrics.py in __call__(self, frames, conds, known_tables, *args, **kwargs)
     75             if self.name in known_tables:
     76                 return self.db.query(self.name, conds=conds)
---> 77             return f(*frames)
     78 
     79     Cls.__name__ = str(name)

/home/amandabachmann/.local/lib/python3.6/site-packages/cymetric/metrics.py in materials(rsrcs, comps)
    118     x = pd.merge(rsrcs, comps, on=['SimId', 'QualId'], how='inner')
    119     x = x.set_index(['SimId', 'QualId', 'ResourceId', 'ObjId', 'TimeCreated',
--> 120                      'NucId', 'Units'])
    121     y = x['Quantity'] * x['MassFrac']
    122     y.name = 'Mass'

/home/amandabachmann/anaconda3/envs/cyclus-env/lib/python3.6/site-packages/pandas/core/frame.py in set_index(self, keys, drop, append, inplace, verify_integrity)
   4607 
   4608         # clear up memory usage
-> 4609         index._cleanup()
   4610 
   4611         frame.index = index

/home/amandabachmann/anaconda3/envs/cyclus-env/lib/python3.6/site-packages/pandas/core/indexes/base.py in _cleanup(self)
    546 
    547     def _cleanup(self):
--> 548         self._engine.clear_mapping()
    549 
    550     @cache_readonly

pandas/_libs/properties.pyx in pandas._libs.properties.CachedProperty.__get__()

/home/amandabachmann/anaconda3/envs/cyclus-env/lib/python3.6/site-packages/pandas/core/indexes/multi.py in _engine(self)
   1000         if lev_bits[0] > 64:
   1001             # The levels would overflow a 64 bit uint - use Python integers:
-> 1002             return MultiIndexPyIntEngine(self.levels, self.codes, offsets)
   1003         return MultiIndexUIntEngine(self.levels, self.codes, offsets)
   1004 

pandas/_libs/index.pyx in pandas._libs.index.BaseMultiIndexCodesEngine.__init__()

MemoryError: Unable to allocate 3.91 GiB for an array with shape (74887355, 7) and data type int64
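
Since the traceback fails while building the seven-level MultiIndex inside set_index, one possible workaround might be to compute the mass as a plain column and skip the index entirely. The following is a minimal sketch on toy data, not cymetric's implementation: `materials_flat` is a hypothetical helper, the values are made up, and the split of columns between the two input frames is an assumption based on the column names in the traceback. Whether downstream metrics depend on the MultiIndex is not checked here, and the merge itself may still be memory-heavy on the full database.

```python
import pandas as pd

# Toy stand-ins for the two input frames (hypothetical values; the
# assignment of columns to each frame is an assumption).
rsrcs = pd.DataFrame({
    "SimId": ["s1", "s1"],
    "QualId": [1, 2],
    "ResourceId": [10, 11],
    "ObjId": [100, 101],
    "TimeCreated": [0, 1],
    "Units": ["kg", "kg"],
    "Quantity": [2.0, 4.0],
})
comps = pd.DataFrame({
    "SimId": ["s1", "s1", "s1"],
    "QualId": [1, 1, 2],
    "NucId": [922350000, 922380000, 922380000],
    "MassFrac": [0.25, 0.75, 1.0],
})


def materials_flat(rsrcs, comps):
    """Same merge and product as in the traceback, but without set_index."""
    x = pd.merge(rsrcs, comps, on=["SimId", "QualId"], how="inner")
    # Mass of each nuclide = resource quantity * mass fraction.
    x["Mass"] = x["Quantity"] * x["MassFrac"]
    keep = ["SimId", "QualId", "ResourceId", "ObjId", "TimeCreated",
            "NucId", "Units", "Mass"]
    return x[keep]


result = materials_flat(rsrcs, comps)
print(result)
```

The result keeps the default integer index, so no MultiIndex engine is ever constructed.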