doc: propose design for indicating which tensors to compress #2700
Conversation
This is the start of some documentation on the compression process and tools, which are in prototype and are being slowly merged (#2636). Currently, some of the design it discusses is proposed; comments are welcome. Eventually it will be rewritten as descriptive documentation once the design settles.
> 1. Allow tensors to be excluded from consideration by a command-line option.
>
> 1. Disable automatic discovery and take an explicit list of tensors to compress
>    by command-line option.
This approach generally seems to be: we'll try to do it all for you, but give you an escape hatch if you need to override. While that might just work in most cases, I think I'd generally prefer the route of "compress the tensors we tell you to."

Every compressed tensor trades performance for size. We need the user to opt into that trade-off every time it is made, because only the user knows what the best choice is. While one could argue that they've already opted in during the binning stage, there's also the possibility that the heuristic picks up additional tensors that happen to be compressible based solely on their number of unique values. Then we'd compress tensors that weren't intended to be compressed, and cost the user some performance.

I think a list of tensors to compress would generally be sufficient. I'd be okay with defaulting the bit_width to the lowest value possible for the given number of unique values, with an override as a nice-to-have.
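For what it's worth, that default follows directly from the size of the value table. A minimal sketch in Python (the function name is hypothetical, not part of the tool):

```python
def min_bit_width(num_unique_values: int) -> int:
    # Indices 0 .. n-1 must all be representable, so the smallest
    # usable width is ceil(log2(n)), and never less than 1 bit.
    return max(1, (num_unique_values - 1).bit_length())

assert min_bit_width(4) == 2    # 4 values fit in 2-bit indices
assert min_bit_width(16) == 4
assert min_bit_width(17) == 5   # one value too many for 4-bit indices
```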
Okay, I'll head in that direction: require an explicit list of tensors via a configuration file (per the comment below).

Since there's already a configuration file, and no need to keep things simple enough for a command-line argument, perhaps requiring explicit bit-width specifications likewise helps the user verify the result, and absolves the compression tool of knowing operator capabilities or imposing arbitrary limits in order to provide sanity checks. With the input model and the configuration in separate files, there's the possibility of a mismatch. The compressor could give feedback like:
- error: tensor 1,33 has too many values to be compressed with 4-bit indices
- warning: tensor 0,55 was compressed with 8-bit indices, but could have been compressed with 2-bit indices
A use case for specifying a bit width larger than necessary is to measure the latency and size of larger widths without re-binning the model.
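A sketch of the check behind that feedback, assuming hypothetical names and that tensors are identified by "subgraph,tensor" strings as in the messages above:

```python
def check_bit_width(tensor: str, num_unique: int, requested_bits: int) -> None:
    # `needed` is the smallest index width that can address every entry
    # in a value table with `num_unique` entries.
    needed = max(1, (num_unique - 1).bit_length())
    if requested_bits < needed:
        raise ValueError(f"error: tensor {tensor} has too many values to be "
                         f"compressed with {requested_bits}-bit indices")
    if requested_bits > needed:
        # Not fatal: over-wide indices are a legitimate measurement tool.
        print(f"warning: tensor {tensor} was compressed with "
              f"{requested_bits}-bit indices, but could have been "
              f"compressed with {needed}-bit indices")
```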
That's a good point. It's possible that the tensor numbers will change as models are updated. Perhaps we should consider using tensor names as the identifiers?
> ## Alternative Designs
>
> 1. The list of tensors to compress could be communicated via metadata added to
>    the model by the binning stage, rather than via command-line options.
I'd generally recommend the tool be capable of being imported as a Python module or called independently on the command line. In that sense, any configuration should be representable as Python objects. A simple JSON or YAML config would likely be sufficient to populate those.
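For example, a sketch of what those objects might look like; the field names and file schema here are illustrative assumptions, not the tool's actual format:

```python
import json
from dataclasses import dataclass

@dataclass
class TensorSpec:
    subgraph: int     # hypothetical fields, per the "subgraph,tensor" ids above
    tensor: int
    bit_width: int

@dataclass
class CompressionConfig:
    tensors: list[TensorSpec]

def load_config(path: str) -> CompressionConfig:
    # A JSON file like {"tensors": [{"subgraph": 0, "tensor": 55,
    # "bit_width": 4}]} populates the same objects the module API accepts.
    with open(path) as f:
        raw = json.load(f)
    return CompressionConfig(tensors=[TensorSpec(**t) for t in raw["tensors"]])
```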
> ## Alternative Designs
>
> 1. The list of tensors to compress could be communicated via metadata added to
>    the model by the binning stage, rather than via command-line options.
One thing to watch for: multiple tensors can point at the same buffer. In that case, all of those tensors need to be added to the list for compression. I'd suggest throwing an error if we detect that not all tensors sharing a buffer are being compressed.

Alternatively, we could compress by buffer index, but that information is harder to get.
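A sketch of that check, assuming we can read each tensor's buffer index out of the model and that tensors are identified by name (names here are illustrative):

```python
from collections import defaultdict

def check_shared_buffers(tensor_to_buffer: dict[str, int],
                         to_compress: set[str]) -> None:
    # Group tensors by the buffer they reference, then reject any
    # selection that compresses some, but not all, tensors of a buffer.
    by_buffer: dict[int, set[str]] = defaultdict(set)
    for name, buffer in tensor_to_buffer.items():
        by_buffer[buffer].add(name)
    for buffer, names in by_buffer.items():
        selected = names & to_compress
        if selected and selected != names:
            raise ValueError(
                f"buffer {buffer} is shared by {sorted(names)}; either all "
                f"or none must be compressed, but only {sorted(selected)} "
                f"were listed")
```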
@rkuester A general comment: alternate-axis tensors also need to be handled. This could be done automatically by scanning the operator inputs for the specified tensor, to see whether the operation (e.g., DEPTHWISE_CONV) requires special handling in constructing the value table. Or it could be done automatically without consulting the operators: just look at the tensor's scale->size and quantized_dimension, and if the quantized_dimension is other than zero, do alt-axis construction of the value table. (That quantized_dimension == 0 is the norm for per-channel quantized tensors is an assumption, though.) Or it could simply be done manually with an additional command-line option, as is currently done.
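A sketch of the operator-independent heuristic, assuming the inputs mirror the flatbuffer's QuantizationParameters (a scale vector and a quantized_dimension field), and baking in the quantized_dimension == 0 assumption noted above:

```python
def needs_alt_axis(scale: list[float], quantized_dimension: int) -> bool:
    # More than one scale means the tensor is per-channel quantized.
    # Per-channel tensors normally quantize along dimension 0; anything
    # else signals alt-axis construction of the value table.
    return len(scale) > 1 and quantized_dimension != 0
```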