feat: invert power transform/scaling order #207
Conversation
On main we power transform then standard scale the results, matching sklearn. This PR tries out scaling then power transforming instead.
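For context, here is a minimal Python sketch of the two orderings using scikit-learn. This is illustrative only (the repo's own code is not shown here); the data and pipeline are assumptions made for the example.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer, StandardScaler

rng = np.random.default_rng(0)
x = rng.lognormal(mean=1.0, sigma=0.5, size=(1000, 1))  # strictly positive, skewed data

# Behaviour on main (matches sklearn): power transform, then standard scale.
# PowerTransformer(standardize=True) applies the zero-mean/unit-variance
# scaling to the transformed output internally.
transform_then_scale = PowerTransformer(standardize=True).fit_transform(x)

# This PR's experiment: standard scale first, then power transform.
# Yeo-Johnson is used here because scaling can produce negative values.
scaled = StandardScaler().fit_transform(x)
scale_then_transform = PowerTransformer(
    method="yeo-johnson", standardize=False
).fit_transform(scaled)
```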
I think we should make this flexible and use an enum indicating whether we should standardize before, after, or not at all.
Allow users to specify whether to skip scaling their data or to scale it either before or after the power transformation. This supports both the sklearn behavior of scaling after the transformation and scaling before it, which can help with data that floors at non-zero values.
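A rough sketch of what the suggested enum-driven flow could look like. The names (`StandardizeMode`, `power_transform`) are hypothetical and not taken from this repo; Python and sklearn are used only to keep the example self-contained.

```python
from enum import Enum, auto

import numpy as np
from sklearn.preprocessing import PowerTransformer, StandardScaler


class StandardizeMode(Enum):
    """Hypothetical enum: when (if at all) to standard-scale the data."""
    NONE = auto()
    BEFORE = auto()  # scale first, then power transform
    AFTER = auto()   # power transform first, then scale (sklearn's default behaviour)


def power_transform(x: np.ndarray, mode: StandardizeMode) -> np.ndarray:
    """Apply a power transform with optional standard scaling before or after."""
    if mode is StandardizeMode.BEFORE:
        x = StandardScaler().fit_transform(x)
    # Scaling after the transform is exactly what PowerTransformer's
    # `standardize` flag does, so that case is delegated to sklearn.
    return PowerTransformer(
        method="yeo-johnson",
        standardize=(mode is StandardizeMode.AFTER),
    ).fit_transform(x)
```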
If we scale before transforming, then some values may become negative and a Box-Cox transformation is no longer valid. This commit defers choosing which transformation algorithm to use until after the data has (potentially) been scaled.
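A hedged sketch of that deferred choice (illustrative names, not this repo's code): Box-Cox is picked only when the possibly-scaled data is still strictly positive, otherwise it falls back to Yeo-Johnson.

```python
import numpy as np
from sklearn.preprocessing import PowerTransformer, StandardScaler


def transform_with_optional_prescaling(x: np.ndarray, scale_first: bool) -> np.ndarray:
    # Scale first if requested; this can push values to or below zero.
    if scale_first:
        x = StandardScaler().fit_transform(x)
    # Choose the algorithm only after scaling: Box-Cox requires strictly
    # positive inputs, while Yeo-Johnson handles zero and negative values.
    method = "box-cox" if np.all(x > 0) else "yeo-johnson"
    return PowerTransformer(method=method).fit_transform(x)
```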
Superseded by #213, which makes it easier for callers to standardize before/after power transforming.