MemoryError with IV2SLS #607

Open
ckarren opened this issue Jun 23, 2024 · 0 comments

I'm trying to run a 2SLS regression to estimate price elasticity with IV2SLS. This is what my data looks like:
| ln_q | ln_p | ... weather variables ... | ... instruments ... | ... user id dummies ... |

All data is np.float32. My data array is approximately (200000, 20000), which is about 16 GB.

Using linearmodels IV2SLS, I set up my model like this (the positional order is dependent, exog, endog, instruments; ln_p is the endogenous regressor, and the weather variables and user id dummies are the exogenous regressors):

from linearmodels.iv import IV2SLS

dependent = ln_q
exog = weather variables + user id dummies
endog = ln_p
instruments = instruments
results = IV2SLS(dependent, exog, endog, instruments).fit()

When running with the full dataset, I consistently get the error:
Unable to allocate 27.8GiB of memory to an array with shape (202507, 18450) and data type float64
It looks like this line, where the weights are applied, is the culprit:
self._wz = self._z * w
I'm running 64-bit Python on a machine with 128 GB of RAM. I've tried to circumvent this issue by passing my own weights:
results = IV2SLS(dependent, exog, endog, instruments, weights=np.ones(dependent.shape, dtype=np.float32)).fit()
but I still get the same MemoryError, even though the weights I pass are explicitly float32.
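For reference, here is the arithmetic behind the sizes quoted above, as a quick sketch. The helper name `array_gib` is just something I made up for illustration; the shapes are the ones from my data and from the error message.

```python
import numpy as np

def array_gib(shape, dtype):
    """Size in GiB of a dense array with the given shape and dtype."""
    n_items = 1
    for dim in shape:
        n_items *= int(dim)
    return n_items * np.dtype(dtype).itemsize / 2**30

# Shape taken from the error message.
full_f64 = array_gib((202507, 18450), np.float64)
full_f32 = array_gib((202507, 18450), np.float32)

# Rough size of my input array in decimal GB (float32 = 4 bytes per value).
input_gb = 200_000 * 20_000 * np.dtype(np.float32).itemsize / 1e9

print(f"float64 copy: {full_f64:.1f} GiB")
print(f"float32 copy: {full_f32:.1f} GiB")
print(f"input estimate: {input_gb:.1f} GB")
```

The float64 figure matches the 27.8 GiB in the error message, so each full-size float64 copy of that array costs about twice what the same array would cost in float32.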
Using 32 GB of RAM just to multiply by an array of ones when weights=None seems like an awful lot of memory for an operation that leaves the input values essentially unchanged. Further, why is the data being recast to float64 when all my other data is float32 and I explicitly pass float32 weights?
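To narrow down the float64 question, here is a small sketch of how plain NumPy picks the result dtype for a broadcast multiplication like `self._z * w`. This only shows NumPy's own behavior on toy arrays; I have not verified what linearmodels does to the inputs internally, so the "converted" case below is an assumption about what might be happening.

```python
import numpy as np

z32 = np.ones((5, 3), dtype=np.float32)   # stand-in for the data matrix
w32 = np.ones((5, 1), dtype=np.float32)   # float32 weights, one per row
w64 = np.ones((5, 1), dtype=np.float64)   # float64 weights, one per row

# float32 * float32 stays float32.
same = z32 * w32

# float32 * float64 is promoted to float64.
mixed = z32 * w64

# Assumption: if the data were converted to float64 before the
# multiplication, float32 weights would not bring the result back down.
converted = z32.astype(np.float64) * w32

print(same.dtype, mixed.dtype, converted.dtype)
```

If either operand is already float64 by the time the product is formed, the result is a new full-size float64 array, which would be consistent with float32 weights making no difference to the error.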
Why is an array of ~16GB using >100GB of RAM in this process? What can I do to get this regression to run?
