Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate storing notebook images as files, rather than inline #1672

Closed
Eric-Arellano opened this issue Jul 9, 2024 · 3 comments
Closed
Assignees

Comments

@Eric-Arellano
Copy link
Collaborator

Eric-Arellano commented Jul 9, 2024

Potential benefits to explore:

  • If Sharp can optimize the images better. https://nextjs.org/docs/messages/install-sharp
  • If it makes yarn dev and yarn build (with static site builder for preview app) faster
  • If it reduces the HTML file size, such that you can access the HTML quicker while images load. This means you can access our content sooner on low bandwidth
  • Right now, users can't expand inline images by clicking on them.
  • We don't have alt text - is that possible?

Look at our largest notebook files when investigating, such as by running dust guides/.

@Eric-Arellano
Copy link
Collaborator Author

Update: Frank thinks this will be non-trivial to do via Jupyter itself. One complication with storing the image as a file path is we still need to render the file, so the author would need to write code to save it and then load it to display.

Frank had an idea that we could explore adding a script during the closed source sync that extracts out the images from the notebook, saves them in files, and updates the notebook to point to them. Given that complexity and that the pain point isn't that bad, we think this is low priority.

@Eric-Arellano
Copy link
Collaborator Author

Eric-Arellano commented Dec 23, 2024

I agree with Frank's assessment: there is no easy way to have it so that we always save an image to a file without requiring writing code for it. https://discourse.jupyter.org/t/cell-magic-to-save-image-output-as-a-png-file/11906/4 and https://discourse.jupyter.org/t/any-way-to-automatically-save-some-outputs-as-separate-files/17651. Jupytext's idea of paired notebooks is interesting to decouple inputs from outputs, but it still requires having an .ipynb file for outputting the outputs.

We could add a post-process step in qiskit/documentation to extract the outputs of a Jupyter notebook and save them in other files. However, I don't think we should pursue it. Downsides:

  • Adds another step for content authors
  • Makes output in VSCode and GitHub less useful during development and code review

--

Instead, I agree with Frank's suggestion that this type of post-process step should happen in closed source when we sync the open source content. There, no humans are manually editing the notebook file so we could split out the image output from the cell into the dedicated files. Then, we could use Next.js's <Image> component to load the file.

This post-processing would complicate the download notebooks button because our local copy of the Jupyter notebook in the closed source repo would have had its images stripped out. However, I think it is perfectly acceptable for the downloaded notebook to be "clean" and not have any outputs at all inside. The user can refer back to the website or qiskit/documentation GitHub to see the outputs we have. So, our code for downloading notebooks should strip all outputs, both images (already removed during the open source sync process) and text. Update: Arnau suggested another option is to make the download be a zip file and include the images.

--

We need to add alt text to the images. I think we have two options:

  • Require setting alt text as cell metadata. lint for this in qiskit/documentation. iqp-channel-docs applies the alt text
  • Auto-generate description like "output from previous code". I'm skeptical this is helpful enough; it's useful to say things between things like "circuit diagram" and "sampler output"

--

Update Dec 26, 2024: Consider if #2520 changes anything given that we now expect users for MDX files to have a semi-manual step to use AVIF images.

@Eric-Arellano
Copy link
Collaborator Author

This has been merged in closed source with @frankharkins's suggested approach. It's working well 🚀

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
Status: Done
Development

No branches or pull requests

1 participant