Add Data and Model Provenance to Reference Pages #1037

Closed
ellennickles opened this issue Aug 10, 2020 · 6 comments

@ellennickles
Member

Dear ml5 Community,

I compiled information about the origins of ml5’s pre-trained models and their datasets and am submitting a feature request to include this information on the ml5 website. Please see the details below.

→ Step 1: Describe the issue 📝

Did you find a bug? Want to suggest an idea for a feature?

  • Documentation suggestion

When ml5’s beta version was announced in 2018, some of the library’s earliest models were described along with the “priority to be clear and transparent with where [each] model is sourced from, what data was used to train the model, and what data might be missing.”

My research, sponsored in part by the Clinic for Open Source Arts (COSA) at the University of Denver to support open source development in creative communities, continues ml5’s project to uncover these origin stories so the community might better understand how each model works and evaluate its functionality for their projects.

I documented my research approach, framework, and reflections with the intent to invite the ml5 Community to continue this work together. My hope is that ml5 visitors will contribute their own updates and corrections. I’ll note that this feature request relates to a mention by @joeyklee in this 2019 issue about how the community might “evaluate issues or biases in the pre-trained models especially for creative practice.”

→ Step 2: Screenshots or Relevant Documentation 🖼

Here are some helpful screenshots and/or documentation of the new feature

@joeyklee and I are discussing possible ways to integrate model and training dataset provenance. One idea is to append this information to each model’s reference page, perhaps in the Acknowledgements section (see below).

Also, any thoughts on the best place to share documentation of my process in the ml5 ecosystem?

→ Step 3: Share an example of the issue 🦄

Here's some example code or a demonstration of my feature in this issue, separate GitHub repo, or in the https://editor.p5js.org or codepen/jsfiddle/Glitch/etc...

Here’s a mock-up of how this information might look on the SketchRNN model reference page:
[Image: sketchRNN_mockup]

Other relevant information, if applicable

A very special thank you to the Clinic for Open Source Arts (COSA) at the University of Denver for supporting this project!
[Image: cosaSticker1_alt]

@joeyklee
Contributor

@ellennickles - thanks so much for this wonderful PR and bravo on such an extensive and intensive dive into these important details. This is a feature that we've long needed to add into our documentation but never managed to do so... until now!

I can propose the following:

  1. We can work on converting your Google Doc tables to markdown tables. Since some of the tables contain URLs, we should definitely make sure to keep those in.
  2. We can then add in your contributions, as you've proposed, to the Acknowledgements section of each model page.

There are some handy tools for making "pretty" markdown tables in VS Code, such as https://marketplace.visualstudio.com/items?itemName=darkriszty.markdown-table-prettify, in case this starts to look visually unwieldy.

I'm also open to suggestions!

@bomanimc
Member

Hi @joeyklee and @ellennickles! I really love the Data and Model Provenance project! Such an amazing addition to our documentation.

Two questions:

  1. How should we address adding these details to new models we add, such as the new Facemesh and Handpose models? @ellennickles is this something you're continuing to research?
  2. Are there any other remaining loose ends related to this issue, or are we ready to close?

@ellennickles
Member Author

Hi @bomanimc, thanks for checking in!

Super excited about the new Handpose and Facemesh models; I just shared them with some very interested students!

The data and model provenance research was funded by the Clinic for Open Source Arts (COSA) at the University of Denver this summer (thanks to help from @joeyklee 🙏). Maybe I can find more funding to continue the research?

As for this open issue, this is a minor note, but my personal website is linked on the ml5 reference pages, and I’m wondering if we can update that to my GitHub account instead (https://github.com/ellennickles/)?

I tried to suggest the change in #1047 before the merge to the latest release, but maybe I didn’t do it right? Do you think there might be an easy fix for this?

@bomanimc
Member

Thanks @ellennickles for sharing the new models! I can't speak to the funding topics, but I do hope that an organization supports you in continuing this amazing research in the future!

Also, I'm happy to swap out the links to your website with links to your GitHub. I'll post a PR shortly.

@bomanimc
Member

Merged in the GitHub URL update!

@ellennickles
Member Author

Thank you so much @bomanimc!
