Memory Leak #386

Open
basejn opened this issue Aug 3, 2017 · 0 comments

basejn commented Aug 3, 2017

SFrame is stated to handle large amounts of data without using much RAM, but it leaks memory on simple tasks.

This sample code keeps increasing RAM usage indefinitely and is strangely slow.
The slowness might be explained by the disk storage the library uses to handle large datasets.

import sframe as sf
data = sf.SFrame({'a':['string']*1000,'b':[1]*1000,'c':[{'key1':1}]*1000})
for i in xrange(10000):
    a = data.to_numpy()

Another example, iterating row by row over the same SFrame:

suma = 0
for i in xrange(10000):
    # `data` is the SFrame built in the first sample
    for row in data:
        suma += row['b']

The RAM usage steadily increases.
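
To put numbers on the growth, a minimal sketch along these lines could be used. `peak_rss_samples` is a hypothetical helper, not part of SFrame; it relies only on the standard-library `resource` module, so it is assumed to run on a Unix-like system.

```python
import resource


def peak_rss_samples(task, iterations, sample_every):
    """Run task() repeatedly and record the process's peak RSS at intervals.

    Hypothetical helper for illustration only; it is not part of SFrame.
    """
    samples = []
    for i in range(iterations):
        task()
        if (i + 1) % sample_every == 0:
            # ru_maxrss is the peak resident set size of this process
            # (reported in kilobytes on Linux, bytes on macOS).
            usage = resource.getrusage(resource.RUSAGE_SELF)
            samples.append(usage.ru_maxrss)
    return samples


# Placeholder task so the sketch runs without SFrame installed.
samples = peak_rss_samples(lambda: None, iterations=10, sample_every=2)
print(samples)
```

Passing `lambda: data.to_numpy()` as `task` should show whether the peak keeps climbing across samples. Since `ru_maxrss` is a high-water mark, it can show growth but not whether memory is later released.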

These are just samples, not real usage.

What I am trying to accomplish with the library is to read the data from the SFrame one row at a time or batch by batch and aggregate it without loading it all into RAM. Specifically, I want to construct a sparse matrix that I will use for training with gradient descent.

If I iterate over it batch by batch, SFrame holds on to a lot of RAM after the iteration and doesn't release it. It uses no less memory than the actual size of the data, so using it becomes pointless.
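
For reference, the batch-by-batch aggregation described above could look roughly like the following sketch. It does not address the leak itself. The helper names `iter_batches` and `to_coo_triplets` are hypothetical, and the code assumes only that the input supports `len()`, slicing, and iteration yielding dict-like rows; a plain list of dicts stands in for the SFrame so the example runs without the library.

```python
def iter_batches(rows, batch_size):
    """Yield consecutive slices of a sized, sliceable sequence of rows."""
    for start in range(0, len(rows), batch_size):
        yield rows[start:start + batch_size]


def to_coo_triplets(rows, dict_column, batch_size=256):
    """Collect sparse-matrix triplets from a column that holds dicts.

    Hypothetical helper for illustration; returns parallel lists of
    row indices, column indices, and values, plus the key-to-column map.
    """
    col_index = {}  # feature key -> column number, assigned on first sight
    row_ids, col_ids, values = [], [], []
    row_number = 0
    for batch in iter_batches(rows, batch_size):
        for row in batch:
            for key, value in row[dict_column].items():
                col = col_index.setdefault(key, len(col_index))
                row_ids.append(row_number)
                col_ids.append(col)
                values.append(value)
            row_number += 1
    return row_ids, col_ids, values, col_index


# Stand-in rows shaped like the SFrame in the first sample.
rows = [{'a': 'string', 'b': 1, 'c': {'key1': 1}}] * 1000
row_ids, col_ids, values, col_index = to_coo_triplets(rows, 'c')
print(len(values), col_index)
```

Only the triplets are kept in memory, never a dense copy of the data. If SciPy is available, they can be handed to `scipy.sparse.coo_matrix((values, (row_ids, col_ids)))` to build the matrix.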
