Add MLDeviceType npu #696
Conversation
FYI @mingmingtasd.
The definition of compute() in the tip-of-tree version of the spec says:
... either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue.
That sentence implies either CPU or GPU, and should be rewritten to accommodate NPU support.
Per the WG discussion (https://www.w3.org/2024/06/13-webmachinelearning-minutes.html#t06), we agreed to merge this PR (#696) after adding the note to the MLContextOptions section. The wording of that note can be adjusted to ensure all perspectives discussed are considered.
@fdwr thank you for updating the PR. It now reflects the WG's latest discussion. @huningxin @inexorabletash can you review the latest? The fallback topic warrants its own (non-blocking) issue. I'd propose to use @fdwr's overview #696 (comment) as a starting point for that discussion. @fdwr please feel free to open a separate issue copying your fallback overview there and cross-link to #623.
nits but overall LGTM
It would be good to add a note that MLDeviceType is subject to change, similar to the note on line 729, the line right above this comment: https://github.com/webmachinelearning/webnn/pull/696/files#r1643272185
It is outside the scope of this change, but it would be preferable to remove the "If this type cannot be satisfied, an "{{OperationError}}" {{DOMException}} is thrown" wording, as I'm not sure how it is compatible with the sentence preceding it, "allow the implementation to better select the most appropriate underlying execution device for the workload."
LGTM!
SHA: 5c64074 Reason: push, by fdwr Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This change only adds the MLDeviceType enum (currently implemented in Chromium-based browsers for early testing) and not yet the quantize/dequantize operators. Those are valuable but technically orthogonal, given they are also useful for GPU, and the enum alone is still useful for non-quantized models using float16 weights.
Of the 4 proposals from #623, we're starting with the simplest, #1 (a simple enum with system-decided fallback); if additional experience warrants more complexity, we'll re-evaluate the other options later.
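For reference, the enum change itself is small. A sketch of the resulting IDL (not the exact spec text), assuming the pre-existing "cpu" and "gpu" values:

```webidl
enum MLDeviceType {
  "cpu",
  "gpu",
  "npu"
};
```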
Wording recommendations for device fallback are welcome. Neural processing units are novel compared with more generic compute devices like CPUs and GPUs, given their limited operator support and possible inability to execute the entire graph at once.
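To make the fallback question concrete, here is a hypothetical app-level sketch in JavaScript. The `FALLBACK_ORDER` table, `pickDeviceType` helper, and `createContextWithFallback` are illustrative assumptions, not part of the WebNN API; the sketch also assumes `createContext()` rejects with an "OperationError" DOMException when the requested deviceType cannot be satisfied, per the wording discussed above.

```javascript
// Hypothetical fallback orders per preferred device type; an assumption
// for illustration, not defined by the WebNN spec.
const FALLBACK_ORDER = {
  npu: ["npu", "gpu", "cpu"],
  gpu: ["gpu", "cpu"],
  cpu: ["cpu"],
};

// Pure helper: pick the first candidate device that the system supports.
function pickDeviceType(preferred, supportedDevices) {
  const order = FALLBACK_ORDER[preferred] ?? ["cpu"];
  return order.find((d) => supportedDevices.has(d)) ?? "cpu";
}

// Sketch of app-side fallback around navigator.ml.createContext(),
// assuming it rejects with an "OperationError" DOMException when the
// requested deviceType cannot be satisfied.
async function createContextWithFallback(preferred) {
  for (const deviceType of FALLBACK_ORDER[preferred] ?? ["cpu"]) {
    try {
      return await navigator.ml.createContext({ deviceType });
    } catch (e) {
      if (e.name !== "OperationError") throw e; // unrelated failure
      // Requested device type unavailable; try the next candidate.
    }
  }
  throw new Error("no usable ML device");
}
```

With proposal #1 the fallback would instead be system-decided inside the implementation, so an app would not need this loop; the sketch only illustrates what the alternative (app-decided fallback) would look like.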
Preview | Diff