Huggingface integration #181

ivelin · 2023-04-04T11:57:36Z

This may be a very big stretch goal given current model size and ops limitations, but it is worth considering as an aspiration.
It could unlock many applications on ezkl if it were simple enough for app devs to plug ezkl into the Hugging Face transformers APIs and demo Spaces.

jasonmorton · 2023-04-05T13:31:33Z

This is a great idea! What kind of integration are you looking for? Maybe we could do some automatic downloading and conversion of models based on their huggingface identifiers?

ivelin · 2023-04-05T15:06:05Z

For example, being able to inject an ezkl step into an inference pipeline that outputs a proof that the Hugging Face model XYZ, located at repo "mymodels/xyz" with git revision "abcd" and corresponding hash "efgh", ran on some private input data and produced the public output "ijk".
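The public part of such a statement could be pinned down as a small record that both prover and verifier hash the same way. The following is only a sketch under stated assumptions: `InferenceClaim`, `hash_model_bytes`, and the field names are hypothetical and not part of ezkl or the Hugging Face APIs, and the placeholder bytes stand in for a real model file. It produces no proof; it only fixes what a proof would be about.

```python
import hashlib
import json
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class InferenceClaim:
    """Public statement a proof would attest to (hypothetical structure)."""
    repo_id: str        # model repo, e.g. "mymodels/xyz"
    revision: str       # git revision of the model repo
    model_sha256: str   # hash of the exact model file that was run
    public_output: str  # output revealed to the verifier

    def digest(self) -> str:
        # Canonical JSON (sorted keys, fixed separators) so that the same
        # claim always serializes, and therefore hashes, identically.
        payload = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def hash_model_bytes(model_bytes: bytes) -> str:
    """SHA-256 of the serialized model (assumed to be the 'corresponding hash')."""
    return hashlib.sha256(model_bytes).hexdigest()


# Placeholder values taken from the example in the comment above.
claim = InferenceClaim(
    repo_id="mymodels/xyz",
    revision="abcd",
    model_sha256=hash_model_bytes(b"placeholder model bytes"),
    public_output="ijk",
)
claim_digest = claim.digest()
```

The private input never appears in the record; only the model identity and the public output do, which matches the split between private and public data in the request.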

A specific use case that comes to mind is running a fine-tuned DocVQA model on utility bills for KYC verification: confirm that the secret input PDF is a valid (non-fake), recent utility bill from a US-based zip code and that it belongs to Bob Hackman.

jasonmorton · 2023-04-05T15:29:32Z

That makes sense. @JSeam2 is working on some of the prerequisites for this.

JSeam2 · 2023-04-05T19:23:16Z

One possibility right now is to manually extract the ONNX files and obtain the outputs from the Hugging Face models. The challenge is that the compute requirements for Hugging Face models might call for a distributed setup.

If the model is small enough, a workaround is to use Python's subprocess module to run the built ezkl program from within Python; however, it's still fairly clunky. I used that approach for a hackathon and created a Python server for a similar purpose. I have written some code for the subprocess approach that could be useful for you: https://github.com/JSeam2/zkml-server/blob/main/app.py
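A minimal sketch of the subprocess approach, assuming nothing about the ezkl command line: `run_cli` is a hypothetical helper, and the demo command simply invokes the Python interpreter so the snippet runs anywhere. The actual ezkl binary name and arguments would have to come from the ezkl documentation or the linked app.py.

```python
import subprocess
import sys
from typing import Sequence


def run_cli(command: Sequence[str], timeout_s: float = 600.0) -> str:
    """Run an external command and return its stdout, raising on failure."""
    result = subprocess.run(
        list(command),
        capture_output=True,  # collect stdout/stderr instead of inheriting them
        text=True,            # decode output as str rather than bytes
        timeout=timeout_s,    # proving can be slow; fail rather than hang forever
        check=False,          # inspect the return code ourselves below
    )
    if result.returncode != 0:
        raise RuntimeError(
            f"command failed with exit code {result.returncode}: "
            f"{result.stderr.strip()}"
        )
    return result.stdout


# Stand-in command (NOT an ezkl invocation): replace the list with the real
# binary and arguments once they are known.
demo_output = run_cli([sys.executable, "-c", "print('ok')"])
```

Passing the command as a list rather than a shell string avoids quoting problems with file paths, and surfacing stderr in the exception makes failures of the wrapped program easier to diagnose from a server process.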

At the moment I'm still working on bindings with pyo3 on a fork of ezkl: https://github.com/jseam2/ezkl/tree/python. When this is ready for production, the goal is to merge the changes into the main repo and expose the bindings to the pyezkl repo, which would offer more expressivity within Python. The Rust bindings are limited to PEP 387 compatibility, so the feature set used for the Rust bindings would be minimal.

rmlearney-digicatapult · 2024-08-01T13:52:59Z

Hi @JSeam2, could I extend the comment by @ivelin and ask whether it's possible to embed metadata into the compiled ezkl artifacts, such as the points he suggested, or arbitrary-length cryptographic commitments to, e.g., the training dataset?

jasonmorton · 2024-08-01T14:58:34Z

@rmlearney-digicatapult that's a very interesting idea. We could add a metadata field like this and leave it to the user to decide how to use it initially.
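To make the idea concrete, here is a rough sketch of what a user-defined metadata blob carrying a commitment might look like. Everything in it is an assumption for illustration: ezkl had no such metadata field at the time of this thread, the key names are made up, and the salted SHA-256 commitment is just one simple scheme a user could choose.

```python
import hashlib
import json
import secrets
from typing import Optional, Tuple


def commit(data: bytes, salt: Optional[bytes] = None) -> Tuple[str, bytes]:
    """Return (commitment, salt) where commitment = SHA-256(salt || data)."""
    salt = secrets.token_bytes(32) if salt is None else salt
    digest = hashlib.sha256(salt + data).hexdigest()
    return digest, salt


def verify_commitment(data: bytes, salt: bytes, commitment: str) -> bool:
    """Check that (data, salt) opens the given commitment."""
    return hashlib.sha256(salt + data).hexdigest() == commitment


# Placeholder bytes stand in for a real training dataset.
dataset_commitment, dataset_salt = commit(b"placeholder training data")

# Hypothetical metadata layout; the user decides which keys to include.
metadata = {
    "model_repo": "mymodels/xyz",
    "model_revision": "abcd",
    "commitments": {"training_dataset": dataset_commitment},
}
metadata_json = json.dumps(metadata, sort_keys=True)
```

Only the commitment goes into the metadata; the salt and the dataset stay with the committer and are revealed later if the commitment needs to be opened.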

rmlearney-digicatapult · 2024-08-01T15:30:27Z

🙏🙂

JSeam2 added the enhancement (New feature or request) label · May 10, 2023

mmagician mentioned this issue in "Model table fails to print / failure parsing the model" #670 (closed) · Dec 19, 2023
