Add MLDeviceType npu #696
Conversation
FYI @mingmingtasd.
The definition of compute() in the tip-of-tree version of the spec says:
... either on a worker thread for the CPU execution, or on a GPU timeline for the submission of GPU workload on the command queue.
That sentence implies either CPU or GPU, and should be rewritten to accommodate NPU support.
Per the WG discussion (https://www.w3.org/2024/06/13-webmachinelearning-minutes.html#t06), we agreed to merge this PR (#696) after adding the note to the MLContextOptions section. The wording of that note can be adjusted to ensure all perspectives discussed are considered.
@fdwr thank you for updating the PR. It now reflects the WG's latest discussion. @huningxin @inexorabletash can you review the latest? The fallback topic warrants its own (non-blocking) issue. I'd propose to use @fdwr's overview #696 (comment) as a starting point for that discussion. @fdwr please feel free to open a separate issue copying your fallback overview there and cross-link to #623.
nits but overall LGTM
It would be good to add a note that MLDeviceType is subject to change, similar to the note on line 729, the line right above this comment: https://github.com/webmachinelearning/webnn/pull/696/files#r1643272185
It is outside the scope of this change, but it would be preferable to remove the "If this type cannot be satisfied, an "{{OperationError}}" {{DOMException}} is thrown" wording, as I'm not sure how it is compatible with the sentence preceding it, "allow the implementation to better select the most appropriate underlying execution device for the workload."
LGTM!
SHA: 5c64074 Reason: push, by fdwr Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This change only adds the MLDeviceType enum (currently implemented in Chromium-based browsers for early testing) and not yet the quantize/dequantize operators. Those are valuable but technically orthogonal, given they are also useful for GPU, and the enum alone is still useful for non-quantized models using float16 weights.
Of the 4 proposals from #623, we're starting with the simplest, #1 (a simple enum with system-decided fallback); if additional experience warrants more complexity, we'll re-evaluate the other options later.
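For reference, the enum change itself is small. A sketch of the resulting IDL (not the exact spec text), assuming the pre-existing "cpu" and "gpu" values:

```webidl
enum MLDeviceType {
  "cpu",
  "gpu",
  "npu"
};
```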
Wording recommendations for device fallback are welcome. Neural processing units are novel compared with more generic compute devices like CPUs and GPUs, given their limited operator support and possible inability to execute the entire graph at once.
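To make the fallback question concrete, here is a hypothetical app-level sketch in JavaScript. The `FALLBACK_ORDER` table, `pickDeviceType` helper, and `createContextWithFallback` are illustrative assumptions, not part of the WebNN API; the sketch also assumes `createContext()` rejects with an "OperationError" DOMException when the requested deviceType cannot be satisfied, per the wording discussed above.

```javascript
// Hypothetical fallback orders per preferred device type; an assumption
// for illustration, not defined by the WebNN spec.
const FALLBACK_ORDER = {
  npu: ["npu", "gpu", "cpu"],
  gpu: ["gpu", "cpu"],
  cpu: ["cpu"],
};

// Pure helper: pick the first candidate device that the system supports.
function pickDeviceType(preferred, supportedDevices) {
  const order = FALLBACK_ORDER[preferred] ?? ["cpu"];
  return order.find((d) => supportedDevices.has(d)) ?? "cpu";
}

// Sketch of app-side fallback around navigator.ml.createContext(),
// assuming it rejects with an "OperationError" DOMException when the
// requested deviceType cannot be satisfied.
async function createContextWithFallback(preferred) {
  for (const deviceType of FALLBACK_ORDER[preferred] ?? ["cpu"]) {
    try {
      return await navigator.ml.createContext({ deviceType });
    } catch (e) {
      if (e.name !== "OperationError") throw e; // unrelated failure
      // Requested device type unavailable; try the next candidate.
    }
  }
  throw new Error("no usable ML device");
}
```

With proposal #1 the fallback would instead be system-decided inside the implementation, so an app would not need this loop; the sketch only illustrates what the alternative (app-decided fallback) would look like.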
Preview | Diff