Skip to content
This repository has been archived by the owner on Dec 18, 2023. It is now read-only.

Marginal diagnostic tool #1691

Closed
wants to merge 7 commits into from
Closed

Marginal diagnostic tool #1691

wants to merge 7 commits into from

Conversation

ndmlny-qs
Copy link
Contributor

@ndmlny-qs ndmlny-qs commented Sep 21, 2022

This commit includes the marginal 1D diagnostic tool with JavaScript callbacks.

Motivation

This PR completes one tool that uses Bokeh and JavaScript callbacks in order to create an interactive tool that can be viewed in Jupyter. This refactors the code in PR #1631 heavily, since pure Python callbacks were found to not function properly with internal tools.

Changes proposed

A new tool folder in the diagnostics folder contains the proposed changes. In this folder there is a js folder that contains all the JavaScript callbacks needed for the Bokeh tool. The tool creates plots of marginal distributions for each random variable of the model. The output is a self-contained HTML object that can be rendered in Jupyter without any external CDN calls for JS resources.

Test Plan

Unit tests for the Python and JavaScript will be done at a later commit. Right now the testing was to run the tool in the Coin Flipping tutorial, and to inspect the output and ensure only static resources were used.

Types of changes

  • Docs change / refactoring / dependency upgrade
  • Bug fix (non-breaking change which fixes an issue)
  • New feature (non-breaking change which adds functionality)
  • Breaking change (fix or feature that would cause existing functionality to change)

Checklist

  • My code follows the code style of this project.
  • My change requires a change to the documentation.
  • I have updated the documentation accordingly.
  • I have read the CONTRIBUTING document.
  • I have added tests to cover my changes.
  • All new and existing tests passed.
  • The title of my pull request is a short description of the requested changes.

TODO

  • Python unit tests
  • JavaScript unit tests
  • Figure out if the build should run npm run build for the tools, or if we should just have the minified code for the JS callbacks in the code base.

This commit includes the marginal 1D diagnostic tool with JavaScript
callbacks.
@ndmlny-qs ndmlny-qs added the enhancement New feature or request label Sep 21, 2022
@ndmlny-qs ndmlny-qs self-assigned this Sep 21, 2022
@review-notebook-app
Copy link

Check out this pull request on  ReviewNB

See visual diffs & provide feedback on Jupyter Notebooks.


Powered by ReviewNB

@facebook-github-bot facebook-github-bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Sep 21, 2022
@ndmlny-qs
Copy link
Contributor Author

@horizon-blue you will need to run npm install and npm run build in the new js folder. Ping me when you are going to try it out, and I can help ensure the install works.

@facebook-github-bot
Copy link
Contributor

@horizon-blue has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

- Removes ArviZ as a dependency from the diagnostic tool.
- Adds jsdoc Documentation for the TypeScript methods.
- Fixes flake8 linting error.
- Adds clarifying comments to the tool.
- Adds missing copyright notice.
@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@horizon-blue has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@horizon-blue
Copy link
Contributor

The task of getting Node to work with our internal Buck ends up taking a bit longer than I initially expected. I think I just found a way to install node modules with our internal tools, but haven't got the time to try getting webpack to work or get compiled JS into the Python library (will probably try those out tomorrow) :).

In the meantime, could you help updating three items here?

  1. Could you update the license header for the TypeScript files so they use the multi-line comments (/* */) instead of single line comments (//)? i.e. it should be the same as the license header in the React component you previously wrote
  2. It looks like internally we have better support for Yarn rather than NPM, and for Yarn, I believe the typical convention is to have the yarn.lock file check in to the repository. Could you try generating an yarn.lock and add it to this PR?
  3. Let's update our .gitignore to exclude node_modules from being tracked so we don't have to worry about accidentally committing a ton of files :)

@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@horizon-blue has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ndmlny-qs
Copy link
Contributor Author

  • Webpack is a dependency for docusaurus, so I wonder if there is some other issue you are facing with regard to the building of the tool. Technically we could use bokeh to build the distribution files, I just didn't know how to configure it for the build process, but I knew how to do it for webpack.
  • Yarn built the files, and I tested locally on a notebook to ensure it worked.
  • We could prevent my forgetfulness about style issues, linting problems, and not using ufmt for future commits using pre-commit. Let me know if you've used this tool before, and if it is available. It does a lot of amazing things.

- Aligns type names between Python and TypeScript. There is now a
  one-to-one correspondence between Python TypedDict objects and the
  TypeScript interface objects.
- Redundancy in the Python TypedDict objects has been removed.
- Fixes tooltips not rendering correctly.
- Updates the link in the Help Tab of the tool such that when clicked it
  opens a new tab in the browser.
- Moves a few peerDependencies to the devDependencies section in the
  package.json file. This was done because yarn issued a warning
  indicating there were missing peer dependencies.
- Refactors the tool such that no calculations for the data displayed is
  done in Python. All updates and all calculations for the tool data is
  done exclusively in the browser. The rational for this change was to
  prevent users from having initial data calculated by Python, and
  subsequent updates calculated by JavaScript in the browser when
  interacting with the tool. There are notes in the code indicating why
  this change was made.
@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@horizon-blue has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@horizon-blue
Copy link
Contributor

Okay, looks like I managed to get it to work in Bento... to some extent :). For some reason no plot is rendered when the widget is first loaded and the console says "[bokeh] could not set initial ranges"

initially no plot

Though after selecting any of the query from the drop down menu, the plot shows up just fine.

plot shows up after selecting random variable

I tried pulling your latest change to my own machine and observe the same issue with my classic Jupyter notebook, which leads me to wonder if it's a but in the code rather than a Bento issue. @ndmlny-qs do you mind giving this a look next week to see if there's a bug in your newest version of the widget?

@horizon-blue horizon-blue mentioned this pull request Sep 30, 2022
11 tasks
@ndmlny-qs ndmlny-qs changed the title [WIP] Marginal diagnostic tool Marginal diagnostic tool Sep 30, 2022
@ndmlny-qs
Copy link
Contributor Author

@horizon-blue let me know what I need to modify for the internal tests to green light. Thanks again, you are super spectacularly awesome!

@horizon-blue
Copy link
Contributor

@ndmlny-qs As I posted last Friday, I think there might be a bug in your latest version of the widget that makes it fail to fully render when it first loads up. I tried running the widget on my local Jupyter notebook session and it has the similar issue, so I believe this is not caused by our internal config. Could you check and see if you can reproduce the same issue on your side?

@ndmlny-qs
Copy link
Contributor Author

Ah yes, for some reason I did not read your post last Friday. This unfortunately is a technical hurdle that we cannot overcome due to the version of Bokeh we are using. I did investigate finding an onload method in Bokeh, but it does not exist and an initialization method is being implemented in Bokeh 3.0.

Relevant issues and PRs from Bokeh

The reason why I actively chose to do what you see (a blank tool) on first render, is that I could not figure out a way to get Python to issue a callback on the tool without user interaction. I wanted Python to make the tool, but not to calculate any data to bind to the glyphs in the tool. This is why the console says "[bokeh] could not set initial ranges", it is because BokehJS has no data in the initial render, and as such, it cannot bound the x- or y-axis so the figures are blank.

I know this may sound like an absurd thing to do, but if I didn't do it this way then Python would calculate the initial values for the glyphs and render them to the tool with an initial view just fine. This is great, but when the browser took over running calculations due to user interactions, the hand-off between Python and JavaScript would cause a significant flicker in the rendered tool that is quite jarring. The only way around this flicker was to prevent Python from doing any of the calculations, and to make the browser do all of them. The only way to do that was to make a user interact with the tool such that all data is calculated in the Browser.

I made this decision to prevent a user from having to deal with the render flickering as data is handed over from Python to the Browser. In doing so, I solved one issue but have caused a different UX issue: the user has to interact with the tool before they see any of the model statistics.

I could be completely wrong about this decision, and welcome any opinions you may have.

Copy link
Contributor

@horizon-blue horizon-blue left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see, thanks for the explanation! I do believe would be more intuitive if we can render something by default, but the current way isn't too bad either, as long as we make it clear to the user that they actually need to interact with the widget so they wouldn't mistaken this as a bug (just like I did). One thing we can also consider as a follow up is to add a text or something in the widget that hints the users to click the drop down menu, till this issue gets resolved in Bokeh 3.0.

Since this pull request has been pending for a while, I'm just going to try to get it merged internally as-is to make it easier to work on follow up changes. I'm also going to revert the changes in the coin flipping tutorial until we are able to render the widget in docusaurus.

@ndmlny-qs
Copy link
Contributor Author

@horizon-blue That's a great idea, and I will propose an update with a more intuitive (click-me) interaction for the user in this PR.

I am currently refactoring and updating the notebook converter in a different branch that will embed Bokeh Applications in Docusaurus. My initial experiments showed it is possible, and I am working to automate it.

@ndmlny-qs
Copy link
Contributor Author

@horizon-blue let me know what you think of this design. For some reason the mouse isn't showing, but you can get the idea of the gif.

tool-with-better-css

- Moved all Bokeh embed from the tool class to the base class.
- Updated tool class docs.
- New method on the base class that contains an HTML template for more
  intuitive required UI for better UX.
- Reordered how the tool view is created in order to attach the new CSS
  class for the UI/UX updates.
@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@horizon-blue has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

# This source code is licensed under the MIT license found in the
# LICENSE file in the root directory of this source tree.

"""Accessor definition for extending Bean Machine `MonteCarloSamples` objects."""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you help me understand what is the purpose of this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've updated the docstrings to include an explicit example for what this is doing.

from beanmachine.ppl.inference.monte_carlo_samples import MonteCarloSamples


def serialize_bm(samples: MonteCarloSamples) -> Dict[str, List[List[float]]]:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

again, possible to use InferenceData instead of MCS?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is, however, I think we should leave this as a future feature request since these tools are pretty feature loaded as is right now.

- Refactors dosctrings from NumPy to Google style.
- Added a prettierrc for dev and updated the package.json, yarn.lock
  files
- Refactored the marginal1d tool for responsive size changes and updated
  to the format that allows Docusaurus integration.
- Added more documentation around the accessor objects with an explicit
  example in the docstring.
@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@ndmlny-qs has updated the pull request. You must reimport the pull request before landing.

@facebook-github-bot
Copy link
Contributor

@horizon-blue has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator.

@ndmlny-qs ndmlny-qs deleted the marginal-tool branch October 20, 2022 16:47
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants