Fix java export for huge models #306
Conversation
@Aulust I've updated all project dependencies and fixed CI recently. And now I have enough rights to review and merge PRs in this repo. P.S. Unfortunately, I'm not familiar with Java, but I'll do my best to help with this PR.
Happy to hear that this project is back to being maintained. I've actually been using it in production this whole time (with this fix applied), so the pull request is still valid and I would like it to be ultimately merged into the master branch.
This is so cool to hear! 🎉
Many thanks! I'll get familiar with the content of this PR in a few days.
020de91
to
6ec1033
Compare
Thanks a lot for this PR! Let me share some of my initial thoughts. I have only one conceptual question about the implementation: do we want to enhance the current API this way?
I believe this is totally fine given the benefits that the new API introduces.
Is it something obsolete from the initial implementation? Looks like the new tests with huge trees pass in R. I know one awesome R expert and I think I can talk with him about R support in m2cgen. From my point of view, R is the most challenging language to support here :-(.
Do you know of any variants? I've spent quite a large amount of time reading articles about decision tree inference, but unfortunately haven't found anything useful.
Awesome research, and a great idea to document this in the FAQ! Thank you very much! I fully support it and a PR is very welcome. Also, linking #152 here as the related issue.
To be honest, I don't want to overcomplicate things.
I'm not against it. Feel free to propose a PR!
@Aulust , I think this update is absolutely amazing and I really appreciate you revisiting this almost 2 years later! Left a few minor comments but looks fantastic otherwise!
@@ -112,6 +115,7 @@ class SubroutinesMixin(BaseToCodeInterpreter):
    # disabled by default
    ast_size_check_frequency = sys.maxsize
    ast_size_per_subroutine_threshold = sys.maxsize
    subroutine_per_group_threshold = sys.maxsize
I feel like `subroutine_per_module_threshold` seems more accurate.
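For context, a minimal sketch of how these two size thresholds could interact when splitting generated subroutines into groups/modules; the function name and structure here are hypothetical, not the actual SubroutinesMixin internals:

```python
import sys

# Hypothetical illustration of the grouping implied by the diff above;
# the real SubroutinesMixin logic may differ.
def split_into_modules(subroutine_codes, subroutine_per_group_threshold=sys.maxsize):
    """Chunk generated subroutine code into groups, one group per module.

    With the default of sys.maxsize the behaviour is unchanged:
    everything ends up in a single module.
    """
    step = subroutine_per_group_threshold
    return [subroutine_codes[i:i + step]
            for i in range(0, len(subroutine_codes), step)]
```

Under that reading, the attribute is effectively a per-module limit, which is why the suggested rename seems to describe the unit of grouping more precisely.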
self._reset_reused_expr_cache()
subroutine = self.subroutine_expr_queue.pop(0)
subroutine_code = self._process_subroutine(subroutine)
subroutines.append((subroutine, subroutine_code))

subroutines.sort(key=lambda subroutine: subroutine[0].idx)
Why do we need this sort given that subroutines are added in a specific order already (the `subroutine_expr_queue`)? And a follow-up: why do we need the index as part of the Subroutine data structure?
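To illustrate the question, here is a toy sketch (all names hypothetical) showing that FIFO processing of a queue whose items were enqueued with increasing `idx` already yields `idx` order, which would make both the sort and the stored index redundant:

```python
from collections import namedtuple

# Toy model of the queue behaviour: subroutines are enqueued with
# increasing idx and popped from the front, so pop order == idx order.
Subroutine = namedtuple("Subroutine", ["idx", "expr"])

queue = [Subroutine(idx=i, expr=f"expr_{i}") for i in range(5)]

processed = []
while queue:
    processed.append(queue.pop(0))  # FIFO, mirroring subroutine_expr_queue.pop(0)

assert [s.idx for s in processed] == sorted(s.idx for s in processed)
```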
If my assumption in the previous comment is correct, I don't think we actually need `subroutine` here, and having only `subroutine_code` will suffice.
@@ -28,3 +28,9 @@ def array_index_access(self, array_name, index):

def vector_init(self, values):
    return f"c({', '.join(values)})"

def module_definition(self, module_name):
    raise NotImplementedError("Modules in r is not supported")
Since "Modules" is plural I suggest to you use are not supported
.
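Applied to the hunk above, the suggested wording would read:

```python
def module_definition(self, module_name):
    raise NotImplementedError("Modules in r are not supported")
```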
    raise NotImplementedError("Modules in r is not supported")

def module_function_invocation(self, module_name, function_name, *args):
    raise NotImplementedError("Modules in r is not supported")
Ditto
Hey @Aulust! I think this change is great and will be an amazing addition to the library. Any chance you plan to finish the work on this PR?
This is a suggestion on how to tackle the problem mentioned in #298. In my case, I had about 1000 estimators and this fix worked quite well in production.
This fix has some flaws, since it breaks backward compatibility in the sense that you get several class files after compilation. This might be relevant in cases like mine, where a custom class loader is used to load new models at runtime without an application restart. A class loader written with only one model class file in mind might not work correctly after this fix.
Another problem comes from the R code generator. With huge trees, the context stack overflows due to nested function calls, and I don't know a good fix for that other than refactoring the AST generation to reduce recursion.
And lastly, I would like to discuss Java code performance. Currently, it's quite bad, since the generated methods are too big to be JIT-compiled by default. A simple JMH benchmark, as well as checking this in production, shows up to a 10x performance boost with the VM flag `-XX:-DontCompileHugeMethods`. Reducing `ast_size_per_subroutine_threshold` helps, but it creates more function calls, which is expensive (the default JVM limit for method inlining, `-XX:FreqInlineSize`, is too small to overcome that). After a decent amount of tweaking `ast_size_per_subroutine_threshold`, I was able to get about 80% of the `-XX:-DontCompileHugeMethods` performance, but it certainly depends on the particular model.
I see several ways to mitigate this, and I would like to hear your take on them.
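For anyone who wants to experiment with the trade-off described above, here is a hedged sketch of tuning the subroutine threshold before export. The attribute is the class-level one from the diff earlier in the thread; the exact import path and the value 4600 are assumptions, and `trained_model` is a placeholder for any supported estimator:

```python
import m2cgen
# Import path is an assumption; the interpreter class may live elsewhere.
from m2cgen.interpreters import JavaInterpreter

# Smaller subroutines keep each generated method under the JIT's size limit,
# at the cost of more (expensive) function calls; tune per model.
JavaInterpreter.ast_size_per_subroutine_threshold = 4600  # assumed value

java_code = m2cgen.export_to_java(trained_model)  # trained_model: placeholder
```

Alternatively, running the JVM with `-XX:-DontCompileHugeMethods` sidesteps the tuning entirely, provided you control the deployment flags.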