How can we run parametrized strategies with multiprocessing? #351
-
Hey @crazy25000, good find. I'm thinking this has not been handled (yet) in nautilus trader, but it's definitely on the near-term list of things to do. It looks to be an issue with pickling Cython classes. I thought we might get lucky and it would be handled by cloudpickle, but unfortunately it doesn't look like it is (see cloudpipe/cloudpickle#186). We've got a couple of options, which we will need to discuss:
I'm thinking it's going to be cheaper to do 2., both from a performance and network bandwidth perspective. I'm a big fan of the dask project, which has spent a lot of time solving these issues, specifically moving data around and efficiently running tasks, so my first thought would be working within that ecosystem. I'll wait and see what @cjdsellers thinks, but I'll say this is something I'll take a look at very soon.
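For anyone following along, here is a minimal pure-Python sketch of the failure mode, not the actual nautilus trader code path. It assumes the Cython error comes from the same mechanism: pickle stores functions by qualified name, so anything defined inside another function (a "local object") cannot be looked up again on the receiving side. The names `make_local_callback` and `can_pickle` are made up for illustration.

```python
import pickle


def make_local_callback():
    # A function defined inside another function is a "local object":
    # pickle stores functions by qualified name and cannot resolve this one.
    def callback(x):
        return x + 1

    return callback


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, AttributeError, TypeError):
        # The exact exception type differs between Python versions.
        return False


# Plain data (e.g. strategy parameters) pickles fine; the local function does not.
params_ok = can_pickle({"fast_ema": 10, "slow_ema": 20})
local_ok = can_pickle(make_local_callback())
print(params_ok, local_ok)
```

The takeaway, as far as I can tell, is that plain parameter dicts cross process boundaries without trouble, while objects carrying locally defined wrappers do not.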
-
Hey @crazy25000, I've opened #354 after having a bit of a think about this. Would you mind adding your 2c?
-
Does Cython's prange with shared memory solve this issue? I believe things are already converted into C data types, so prange may be a viable route to distribute engine runs over multiple cores. https://cython.readthedocs.io/en/latest/src/userguide/parallelism.html I'm not very familiar with C/C++ programming concepts, but would a pointer be useful to pass the next result of the data producer generator, or even to point to the entire data producer? Does the fact that a pointer resolves to a memory address nullify the shared memory idea? I don't know much about joblib either, so I will check it out to understand what benefits it offers. My sticking point was the sharedmem, but a public data type class could resolve this, as per the Stack Overflow post above. Another option I remember from some books I read on Cython is typed memoryviews, which could be helpful if going the prange route, as the project relies heavily on pandas. https://cython.readthedocs.io/en/latest/src/userguide/memoryviews.html
-
Hello,
The idea is to initially load all data into one engine instance and use the instance to run parametrized strategies with multiprocessing. Using this example:
nautilus_trader/examples/backtest/fx_ema_cross_gbpusd_bars.py
Lines 46 to 53 in f713b55
I modified it by creating a list of strategies with different params:
I've tried several methods with concurrent.futures, multiprocessing, and joblib, but kept getting the same error, Can't pickle local object '__Pyx_CFunc, which makes me wonder if this is because of Cython or something I'm not aware of. Is there a simple, reproducible example available that someone can share?
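One pattern that sidesteps the pickling problem, sketched here under assumptions, is to send only plain parameters to each worker and build the engine inside the worker process. `FakeEngine`, `load_prices`, `run_one`, and `run_all` are hypothetical stand-ins and not nautilus trader APIs. Note that this differs from the idea above: the data is loaded once per task rather than once overall.

```python
from concurrent.futures import ProcessPoolExecutor


class FakeEngine:
    """Hypothetical stand-in for a backtest engine (not a nautilus_trader class)."""

    def __init__(self, prices):
        self.prices = prices

    def run(self, fast, slow):
        # Toy "result": difference between the two trailing means.
        fast_mean = sum(self.prices[-fast:]) / fast
        slow_mean = sum(self.prices[-slow:]) / slow
        return fast_mean - slow_mean


def load_prices():
    # Placeholder for real data loading; each task calls this itself.
    return [float(i) for i in range(1, 101)]


def run_one(params):
    # The engine is built inside the worker, so it is never pickled.
    # Only `params` (a plain dict) and the float result cross processes.
    engine = FakeEngine(load_prices())
    return params, engine.run(params["fast"], params["slow"])


def run_all(param_grid, max_workers=2):
    # run_one is module-level, so the pool can pickle a reference to it.
    with ProcessPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(run_one, param_grid))
```

To try it, call `run_all([{"fast": 10, "slow": 20}, {"fast": 5, "slow": 50}])` from under an `if __name__ == "__main__":` guard in a script. The trade-off is repeated data loading in each task, which may or may not be acceptable depending on data size.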