Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add MLDeviceType npu #696

Merged
merged 9 commits into from
Jun 18, 2024
Merged

Add MLDeviceType npu #696

merged 9 commits into from
Jun 18, 2024

Conversation

fdwr
Copy link
Collaborator

@fdwr fdwr commented May 30, 2024

This change only adds the MLDeviceType enum (currently implemented in Chromium-based browsers for early testing) and not yet quantize/dequantize operators, as those are valuable but technically orthogonal given they are also useful for GPU and given the enum alone is still useful for non-quantized models using float16 weights.

Of the 4 proposals from #623, we're starting with the simplest #‌1 (simple enum with system-decided fallback), and if additional experience warrants more complexity, we'll re-evaluate the other options later.

Wording recommendations for device fallback are welcome. Neural processing units are novel compared with more generic compute devices like CPU's/GPU's, given their limited operator support and inability to execute the entire graph at once.


Preview | Diff

@fdwr
Copy link
Collaborator Author

fdwr commented May 30, 2024

FYI @mingmingtasd.

index.bs Outdated Show resolved Hide resolved
Copy link
Member

@inexorabletash inexorabletash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The definition of compute() in the tip-of-tree version of the spec says:

... either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue.

That sentence implies either CPU or GPU, and should be rewritten to accommodate NPU support.

index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
Copy link
Member

@anssiko anssiko left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Per WG discussion https://www.w3.org/2024/06/13-webmachinelearning-minutes.html#t06 we agreed to merge this PR #696 after adding the note to MLContextOptions section. Wording in that note can be adjusted to ensure all perspectives discussed are considered.

index.bs Outdated Show resolved Hide resolved
@anssiko
Copy link
Member

anssiko commented Jun 17, 2024

@fdwr thank you for updating the PR. It now reflects the WG's latest discussion.

@huningxin @inexorabletash can you review the latest.

The Fallback topic warrants it own (non-blocking) issue. I'd propose to use @fdwr's overview #696 (comment) as a starting point for that discussion. @fdwr please feel free to open a separate issue copying your fallback overview there and cross-link to #623.

Copy link
Member

@inexorabletash inexorabletash left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits but overall LGTM

index.bs Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
index.bs Outdated Show resolved Hide resolved
Copy link

@mwyrzykowski mwyrzykowski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be good to make a note that MLDeviceType is subject to change similar to the note on line 729, the line right above the comment https://github.com/webmachinelearning/webnn/pull/696/files#r1643272185

It is outside the scope of this change, but it would be preferable to remove the If this type cannot be satisfied, an "{{OperationError}}" {{DOMException}} is thrown, wording as I'm not sure how that is compatible with the sentence preceding it and allow the implementation to better select the most appropriate underlying execution device for the workload.

index.bs Show resolved Hide resolved
Copy link
Contributor

@huningxin huningxin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@fdwr fdwr merged commit 5c64074 into webmachinelearning:main Jun 18, 2024
1 check passed
github-actions bot added a commit that referenced this pull request Jun 18, 2024
SHA: 5c64074
Reason: push, by fdwr

Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants