Update CompressedImage #237

ahcorde · 2024-02-07T14:08:03Z

The CompressedImage message hasn't been updated since 2009. It was probably designed mainly for JPEG and PNG, according to the history and the comments in the message definition.

During this time some other compressedImage types have appeared, for example:

The idea of this PR is to try to refresh this message type allowing new compression algorithms to set all the required information.

As a reference, I was working on an SVTAV1 image transport plugin and had to encode the metadata inside the payload, which is ugly.

This new message type will allow all the image_transport plugins (at least those released in Rolling) to use the same message type:

- ffmpeg
- JPEG and PNG
- Theora
- SVTAV1
- AVIF
- others?

FYI @calderpg-tri @IanTheEngineer

Signed-off-by: Alejandro Hernández Cordero <[email protected]>

clalancette · 2024-02-09T14:15:26Z

sensor_msgs/msg/CompressedImage.msg

+uint32 height                # image height, that is, number of rows
+uint32 width                 # image width, that is, number of columns
+
+string pixel_format          # Specifies the format of the data


This seems like a needless difference. I understand that this is slightly more descriptive, but I also don't think it is worth the headache for downstream users to change this. So I'll suggest putting this back to format.

calderpg-tri · 2024-02-15

Note that encoding is the current spelling for this in sensor_msgs/Image, which is just wrong in almost every case. Everything else (OpenCV, commercial machine vision libraries, etc.) refers to this as "pixel format". Unfortunately, "encoding" is even more confusing when dealing with compressed data, so if there is a chance to fix this properly we should.

clalancette · 2024-02-09T14:16:23Z

sensor_msgs/msg/CompressedImage.msg

+uint32 height                # image height, that is, number of rows
+uint32 width                 # image width, that is, number of columns


Prior to this PR, how are users of this message determining this?

clalancette · 2024-02-09T14:16:43Z

sensor_msgs/msg/CompressedImage.msg

+
+uint64 sequence_number       # sequence number
+uint64 flags                 # flags (for example: KEYFRAME)
+uint8 is_bigendian           # is this data bigendian?


This seems like it would be a bool.

clalancette · 2024-02-09T14:17:40Z

sensor_msgs/msg/CompressedImage.msg

@@ -38,4 +41,10 @@ string format                # Specifies the format of the data
                             #   need for successful decoding of the image. Refer to
                             #   documentation of the other transports for details.

+string compression_type      # Compression type used (jpeg, png, theora, etc)
+
+uint64 sequence_number       # sequence number


I don't think I agree that this should have a sequence number. A compressed image doesn't logically have one of these, and we don't have this in Image.msg, for instance.

clalancette · 2024-02-09T14:19:17Z

sensor_msgs/msg/CompressedImage.msg

+string compression_type      # Compression type used (jpeg, png, theora, etc)
+
+uint64 sequence_number       # sequence number
+uint64 flags                 # flags (for example: KEYFRAME)


Again, I don't think I agree that this should be here. A compressed image doesn't represent anything other than data, and I don't think we should be adding "external" metadata like this.

calderpg-tri · 2024-02-15

In general, this argument is about whether there should be any metadata fields in the message, or whether the compression plugin should just shove its metadata into the binary blob in some bespoke fashion. To the extent that a common set of metadata exists, I think it is preferable to put that metadata in the message so that message introspection tooling can see it, rather than putting it all in binary blob form where it is practically inaccessible.

clalancette · 2024-02-15

> In general, this argument is about whether there should be any metadata fields in the message

I disagree. I absolutely think we should have metadata in this message that is intrinsic to the CompressedImage. Things like width, height, format, compression_type, and is_bigendian all belong in here.

But metadata about the larger context in which this data is delivered doesn't seem to belong, in my opinion. The exception to this is of course std_msgs/Header, which is already here and is heavily used with the rest of our tooling.

I definitely understand the need for things like a sequence_number, a keyframe flag, and the like. But those seem like things that should be in a custom message, composed of a CompressedImage plus the metadata, like:

    # CompressedKeyFrame
    sensor_msgs/CompressedImage compressed_image
    int sequence_number
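As a rough illustration of the wrapper idea above, here is a minimal Python sketch. It is not part of the PR: the dataclasses are stand-ins for generated ROS message classes, `CompressedKeyFrame` mirrors the hypothetical message named in the comment, and `unwrap` is an illustrative helper.

```python
from dataclasses import dataclass, field


@dataclass
class CompressedImage:
    """Stand-in for sensor_msgs/CompressedImage (intrinsic fields only)."""
    format: str = ""
    data: bytes = b""


@dataclass
class CompressedKeyFrame:
    """Hypothetical wrapper: stream-level metadata lives outside the image."""
    compressed_image: CompressedImage = field(default_factory=CompressedImage)
    sequence_number: int = 0


def unwrap(frame: CompressedKeyFrame) -> CompressedImage:
    """Consumers that only want the image can ignore the stream metadata."""
    return frame.compressed_image
```

With this layout the inner image stays self-contained, so it can still be logged or decoded without knowing anything about the stream it came from.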

calderpg-tri · 2024-02-15

> But those seem like things that should be in a custom message, composed of a CompressedImage plus the metadata...

Frankly, I really think that if the outcome isn't an improved-for-everyone message, the right answer is for image_transport to just define its own internal message type that it can change on its own. I don't think that's better for the community, but if we have to introduce a new type for image_transport uses, having it inherit all the inertia that accompanies being a part of core message types isn't an improvement either.

tfoote · 2024-02-20T18:48:22Z

sensor_msgs/msg/CompressedImage.msg

+uint32 height                # image height, that is, number of rows
+uint32 width                 # image width, that is, number of columns
+
+string pixel_format          # Specifies the format of the data


For compatibility we shouldn't rename this field but just update its documentation to state that it's expecting the pixel_format string (as per OpenCV). It's still clear what the content is and this will avoid breaking any existing code.

tfoote · 2024-02-20T18:49:50Z

sensor_msgs/msg/CompressedImage.msg

@@ -38,4 +41,10 @@ string format                # Specifies the format of the data
                             #   need for successful decoding of the image. Refer to
                             #   documentation of the other transports for details.

+string compression_type      # Compression type used (jpeg, png, theora, etc)


Is there a backwards compatible default expectation that should be documented and implemented as a fallback if unset?
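One possible shape for such a fallback, sketched in Python. This is only an assumption about how a subscriber might behave, not something the PR defines: `resolve_compression_type` and `KNOWN_CODECS` are hypothetical names, and the layout of the legacy `format` string (a codec token somewhere in the string) is assumed for illustration.

```python
# Assumption: codec tokens that may appear in a legacy `format` string.
KNOWN_CODECS = ("jpeg", "png", "tiff")


def resolve_compression_type(compression_type: str, legacy_format: str) -> str:
    """Prefer the new field; otherwise guess the codec from the old one."""
    if compression_type:
        return compression_type.lower()
    # Split the legacy string on whitespace and ';' and look for a codec token.
    tokens = legacy_format.lower().replace(";", " ").split()
    for codec in KNOWN_CODECS:
        if codec in tokens:
            return codec
    return "unknown"
```

Messages recorded before the change would arrive with an empty `compression_type`, so they would take the second path and still be decodable.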

tfoote · 2024-02-20T18:50:55Z

sensor_msgs/msg/CompressedImage.msg

@@ -38,4 +41,10 @@ string format                # Specifies the format of the data
                             #   need for successful decoding of the image. Refer to
                             #   documentation of the other transports for details.

+string compression_type      # Compression type used (jpeg, png, theora, etc)
+
+uint64 sequence_number       # sequence number


This does seem to be breaking the encapsulation of this message standing on its own.

tfoote · 2024-02-22T03:39:11Z

sensor_msgs/msg/CompressedImage.msg

+string compression_type      # Compression type used (jpeg, png, theora, etc)
+
+uint64 sequence_number       # sequence number
+uint64 flags                 # flags (for example: KEYFRAME)


I think that Chris is right that higher-level information such as keyframes and sequence numbers deserves to be in a higher-level concept than just the compressed image. Things like keyframes and sequences are only valid if they're associated with a specific stream, whereas a standalone compressed image may or may not be used in those applications.

In particular, if someone muxes two compressed image topics together, the sequence numbers become meaningless unless you're associating them with a named stream or the like.

There's the approach above of using a hierarchy of messages/wrapper messages. The other alternative would be to send a parallel stream of messages in the same way that we do with camera_info. There's a parallel stream of camera info which can be associated with the image by timestamp + frame_id, and with that this extra metadata can be applied in the image_transport process but can be ignored by those who just want to access the raw images. And those images should stand alone, in log files etc. Thus there would be CompressedImages, and then there would be a CompressedImageStream protocol/standard for how to send along a series of compressed images with keyframes and potentially other metadata in a parallel CompressedImageStreamInfo or the like.
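The parallel-stream alternative could look roughly like the Python sketch below. Everything here is illustrative: the dataclasses stand in for ROS messages, `StreamInfo` plays the role of the hypothetical CompressedImageStreamInfo, and `pair_by_header` is an assumed helper that matches the two streams on timestamp + frame_id.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Header:
    stamp_ns: int   # timestamp in nanoseconds
    frame_id: str


@dataclass
class CompressedImage:
    header: Header
    data: bytes


@dataclass
class StreamInfo:
    """Hypothetical per-frame stream metadata sent on a parallel topic."""
    header: Header
    sequence_number: int
    is_keyframe: bool


def pair_by_header(
    images: List[CompressedImage], infos: List[StreamInfo]
) -> List[Tuple[CompressedImage, Optional[StreamInfo]]]:
    """Attach stream info to each image with the same (stamp, frame_id)."""
    index = {(i.header.stamp_ns, i.header.frame_id): i for i in infos}
    return [
        (img, index.get((img.header.stamp_ns, img.header.frame_id)))
        for img in images
    ]
```

An image with no matching info is paired with `None`, which reflects the point that the images remain usable on their own.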

tfoote · 2024-02-22T03:40:17Z

sensor_msgs/msg/CompressedImage.msg

+uint32 height                # image height, that is, number of rows
+uint32 width                 # image width, that is, number of columns


As this was encoded into the payload before, we need some more semantics about when it should be used and when not. It's not great to have data that can potentially disagree.
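To make the disagreement risk concrete, here is a small Python sketch that compares the message-level dimensions with those stored in a PNG payload. It relies only on the PNG file layout (8-byte signature followed by the IHDR chunk); the function names and the rule that zero means "unset" are assumptions for illustration, not semantics defined by the PR.

```python
import struct
from typing import Optional, Tuple

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def png_dimensions(payload: bytes) -> Optional[Tuple[int, int]]:
    """Return (width, height) from the PNG IHDR chunk, or None if not PNG."""
    if len(payload) < 24 or not payload.startswith(PNG_SIGNATURE):
        return None
    if payload[12:16] != b"IHDR":
        return None
    # Width and height are big-endian uint32 values right after the chunk type.
    width, height = struct.unpack(">II", payload[16:24])
    return width, height


def dimensions_consistent(msg_width: int, msg_height: int, payload: bytes) -> bool:
    """True unless the message fields contradict the payload."""
    dims = png_dimensions(payload)
    if dims is None:
        return True  # nothing to compare against
    if msg_width == 0 and msg_height == 0:
        return True  # assumption: zero means the fields were left unset
    return dims == (msg_width, msg_height)
```

A subscriber could run a check like this and decide which source wins, but that policy is exactly the semantics that would need to be documented.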

christianrauch · 2024-09-04T21:06:16Z

Can I suggest that the CompressedImage definition is not updated? This will change the hash and make recorded data incompatible with new subscribers. I would rather create a new CompressedImage2 if the definition has to be updated.

I also found the use of "formats" and "encodings" in the images confusing. Ultimately, all image data, raw or compressed, is just an unstructured binary blob that has to be interpreted according to the encoding or format. Can't we merge these and simply use the binary blob together with a FourCC to describe raw (e.g. RGB3 for a "standard" colour image with 8 bits per pixel and channel) and compressed (e.g. JPEG for jpeg and AV1F for AV1) data? There is much more documentation on how to interpret memory via the FourCC than on a custom encoding / format.
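For reference, a FourCC is four ASCII characters that can also be carried as a 32-bit integer. The Python sketch below shows one common packing (first character in the lowest byte, as in V4L2's `v4l2_fourcc` macro); the helper names are illustrative and the byte order is an assumption that a message definition would have to fix explicitly.

```python
def fourcc_to_int(code: str) -> int:
    """Pack a four-character code into a uint32 (first char in the low byte)."""
    raw = code.encode("ascii")
    if len(raw) != 4:
        raise ValueError("a FourCC must be exactly four ASCII characters")
    return int.from_bytes(raw, "little")


def int_to_fourcc(value: int) -> str:
    """Inverse of fourcc_to_int."""
    return value.to_bytes(4, "little").decode("ascii")
```

For example, `fourcc_to_int("RGB3")` gives `0x33424752`, and `int_to_fourcc` turns that value back into `"RGB3"`.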

Update CompressedImage

bccb313

Signed-off-by: Alejandro Hernández Cordero <[email protected]>

ahcorde self-assigned this Feb 7, 2024

ahcorde requested a review from tfoote as a code owner February 7, 2024 14:08

clalancette reviewed Feb 9, 2024

tfoote reviewed Feb 22, 2024


