Enhance PCA Decomposition #476

Open
8 tasks
bbengfort opened this issue Jun 14, 2018 · 12 comments
Labels: level: intermediate (python coding expertise required); priority: medium (can wait until after next release); type: feature (a new visualizer or utility for yb)

Comments

@bbengfort
Member

bbengfort commented Jun 14, 2018

We should enhance the PCADecomposition visualizer to provide many of the features the Manifold visualizer provides, including things like:

  • Color points by class with a legend (see Add legend option to PCA visualizer #458)
  • Color points by heatmap for continuous y and add a colorbar
  • Add alpha parameter (see Add alpha support for scatter plots #475)
  • Add random state to pass to PCA
  • Allow user to pass in a PCA transformer/pipeline
  • Update tests with better random data sets (more points; see manifold tests)
  • Include explained variance/noise variance (or explained variance ratio) in chart
  • Enhance biplots documentation

See also #455 as another enhancement that might not be related to this enhancement.
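As a rough illustration of the explained variance ratio item in the list above, here is a minimal NumPy sketch. It does not use Yellowbrick or its API; the function name `explained_variance_ratio` and the random data are assumptions made only for this example. It computes the ratio from the singular values of the centered data, which is the quantity a chart annotation would display.

```python
import numpy as np


def explained_variance_ratio(X):
    """Return the fraction of total variance captured by each principal component.

    Hypothetical helper for illustration; not part of Yellowbrick.
    """
    X = np.asarray(X, dtype=float)
    Xc = X - X.mean(axis=0)  # center each feature
    # Singular values of the centered data give the component variances
    s = np.linalg.svd(Xc, full_matrices=False, compute_uv=False)
    variances = s ** 2 / (X.shape[0] - 1)
    return variances / variances.sum()


rng = np.random.default_rng(42)
# Made-up data: three features with decreasing spread
X = rng.normal(size=(200, 3)) * np.array([5.0, 2.0, 0.5])
ratios = explained_variance_ratio(X)
print(ratios)
```

The ratios come back in descending order and sum to one, so they could be placed directly into axis labels or a legend.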

@bbengfort bbengfort added type: feature a new visualizer or utility for yb priority: medium can wait until after next release level: intermediate python coding expertise required labels Jun 14, 2018
@rohit-ganapathy
Contributor

Hey! I'm interested in tackling this.

@bbengfort
Member Author

@rohit-ganapathy - that would be great, feel free to open a PR when you're ready for us to take a look.

@dnabanita7
Contributor

Can I start working on this, even if @rohit-ganapathy is assigned?

@rebeccabilbro
Member

Hello @naba7 — as we explained last week in response to your questions on #738 and #677, we do not "assign" issues or reserve issues for contributors. Anyone is welcome to submit a PR for a feature or bugfix they work on.

However, given that you already have one PR open that still needs to be completed (#755), have started working on #615, and are new to working on Yellowbrick and still getting to know our API, we would really appreciate if you would focus on getting those first PRs across the finish line before starting anything new.

We appreciate your enthusiasm about contributing to Yellowbrick. One of the most important lessons to learn is that open source is a marathon, not a sprint, so we hope you can be patient and enjoy the journey — we promise Yellowbrick isn't going away!

@dnabanita7
Contributor

dnabanita7 commented Feb 19, 2019 via email

@bbengfort
Member Author

@naresh-bachwani has this issue been fixed by your work this summer?

@bbengfort bbengfort reopened this Aug 29, 2019
@naresh-bachwani
Contributor

@bbengfort I think that explained variance charts are left! But that will be covered in decomposition, right?

@bbengfort
Member Author

@naresh-bachwani ExplainedVariance is separate from this issue. Would you mind ticking the checkboxes above based on your work?

@BradKML

BradKML commented Mar 12, 2022

Can these functions be applied to FastICA in Scikit-Learn (or maybe any ICA)?
Also observing #615 and #316

@bbengfort
Member Author

@BrandonKMLee very possibly, it wouldn't hurt to try. I think what you'd have to do is replace the pca_transformer attribute on the PCA visualizer with a pipeline similar to the code here: https://github.com/DistrictDataLabs/yellowbrick/blob/develop/yellowbrick/features/pca.py#L184-L189. This would have to be done after initialization and before any call to fit or transform. I don't see any place it wouldn't work, unless FastICA or ICA doesn't have required attributes like n_components_.

You could also try passing an initialized FastICA or ICA transformer as the manifold attribute to the Manifold visualizer - this might not give you the same features as ICA, but should give you the projected visualization.
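A minimal sketch of the kind of pipeline described above, using only scikit-learn. It does not touch Yellowbrick, so whether the PCA visualizer accepts such a pipeline in place of `pca_transformer` remains untested here. The step names, the random data, and the FastICA settings are assumptions chosen for illustration. The loop at the end reports which of the attributes mentioned in this thread the fitted estimator actually exposes in the installed scikit-learn version.

```python
import numpy as np
from sklearn.decomposition import FastICA
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Made-up data: three non-Gaussian sources mixed into four features
sources = rng.laplace(size=(300, 3))
X = sources @ rng.normal(size=(3, 4))

# Scale-then-decompose pipeline; step names are arbitrary choices
ica_pipeline = Pipeline(
    [
        ("scale", StandardScaler()),
        (
            "ica",
            FastICA(
                n_components=2,
                whiten="unit-variance",
                random_state=0,
                max_iter=1000,
            ),
        ),
    ]
)
projected = ica_pipeline.fit_transform(X)

# Check which attributes the fitted estimator provides
ica = ica_pipeline.named_steps["ica"]
for attr in ("components_", "n_components_", "explained_variance_ratio_"):
    print(attr, hasattr(ica, attr))
```

If an attribute the visualizer relies on is reported as missing, the swap would need either a thin wrapper around the estimator or changes in the visualizer itself.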

@BradKML

BradKML commented Mar 12, 2022

@bbengfort n_components_in_ for FastICA, but at the same time explained variance could be a problem, since the components are expected to have well-distributed significance rather than being ordered, and no such function currently exists for FastICA.

@bbengfort
Member Author

@BrandonKMLee ok, that makes sense, so potentially FastICA may not work unless we create a specialized manifold for it.


7 participants