Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FEAT(client,images): Add animated gif support #6638

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

GeneralUser01
Copy link
Contributor

The way images are handled for the gif file format when pasting and receiving one in the chat was here changed to support playing the animation. By default they do not play but whenever an animation is not running this is indicated by a play-icon, thereby differentiating them from still images on a glance. Whether the animation is paused is toggled by left-clicking it and reset by middle-clicking it. Saving a gif image file from log is also supported just as it is for other image file formats.

Additionally, one can also toggle video controls for animations via the log context menu on them. This enables the following features:

  • Jumping to any point of the animation via the video bar
  • Viewing the current time and total time of the animation
  • Switching caching of all frames on or off
  • Switching loop mode between "Unchanged", "Loop" and "No loop"
  • Traversing frame-by-frame backwards or forward
  • Changing and resetting the playback speed

To turn on caching most notably boosts performance when jumping to frames that are sequentially far apart due to QMovie only being able to switch frame in order from start to end and around when caching is off, though it usually plays fine regardless except for when playing in reverse in an animation with many frames. Reverse playback is also implemented here, so when decreasing the speed to less than zero it will play at that speed in reverse as expected, taking a speed-step that's twice as big if the speed would otherwise become zero so that the animation only pauses when it's not playing. As for loop mode, "Unchanged" is to use the in-built setting in the gif image file for how many times it is to repeat, whereas "Loop" and "No loop" override this behavior accordingly.

The main limitation is the character limit on text messages for images. Currently this is usually set to 128 KB, which is very small for a gif image file and on top of this requires the data to be 33% smaller before being sent in base64 encoding. This limit should be at least ten times larger in order to fit many gif image files, or this should apply to another limit specifically for animations. Other than that, the functionality for pasting images from the clipboard itself, instead of by a file path from it, could not be implemented for gif image files due to not being able to get compatible formatted data from the MIME data received.

Here are workarounds that can be applied to these limitations:

  • Configure the server to increase or disable the limit on image message length
  • Save copied gif images to file and then copy and paste those files

Lastly, if settings were to be added to this feature, then here are some suggestions for those settings:

  • Play animations by default
  • Show video controls by default
  • Cache all frames by default
  • Specify the default loop mode
  • Specify the default playback speed

Checks

Sorry, something went wrong.

@Hartmnt
Copy link
Member

Hartmnt commented Nov 26, 2024

Thank you very much for this contribution! Very cool!

Please note that this feature - with utmost certainty - will not land in the 1.5.x branch of Mumble, but in 1.6.x. With that being said, we are currently trying to get another 1.5.x release out and lack time to review this properly right now. There will be plenty of time for getting this merged, though!

I have not looked into the details of this PR yet, but since it is touching the user interface I would like you to take a look at the Mumble accessibility checklist and double check the application can still be used with accessibility in mind with your change applied.

Also take a look at the CI pipelines which are currently failing (you can disregard the Windows pipeline for now as we need to fix it in master right now).
The other thing is, we are currently requiring all PRs to be auto-formatted using clang-format in version 10. This is currently also failing for your PR. It will be enough to format everything once all the changes are reviewed and ready.

Again, please be patient for a proper review of this :)

@Hartmnt Hartmnt added client ui feature-request This issue or PR deals with a new feature labels Nov 26, 2024
@GeneralUser01
Copy link
Contributor Author

GeneralUser01 commented Nov 27, 2024

Thank you for being open to merge this PR and for the feedback!

Since the interactive aspects of this feature are part of custom text objects in the log, keyboard-related controls such as tabbing to navigate apply to user messages but it is unclear how this should be able to interact with items in messages if at all. As for contrasts which is another accessibility concern, is there any standard used in this project to test with to check if a contrast is high enough?

The only part of the CI pipelines that I'm unsure about how to resolve are the translations. There are more texts than what this feature brings that have no translation, so how is this to be handled?

@Hartmnt
Copy link
Member

Hartmnt commented Nov 27, 2024

Since the interactive aspects of this feature are part of custom text objects in the log, keyboard-related controls such as tabbing to navigate apply to user messages but it is unclear how this should be able to interact with items in messages if at all. As for contrasts which is another accessibility concern, is there any standard used in this project to test with to check if a contrast is high enough?

Currently, the individual text messages (objects) are not accessible. This is a TODO on our part. However, I am concerned that your new text objects containing a video player may create focus traps when tab-navigating. Another aspect that needs to be looked at is using a screen reader or text-to-speech for chat messages. It would be bad for example, if the TTS would read the entire base64 content of the video. Severe things like that. Don't worry too much about the contrast. When we look at this in detail, we can still talk about that.

The only part of the CI pipelines that I'm unsure about how to resolve are the translations. There are more texts than what this feature brings that have no translation, so how is this to be handled?

There is a script in the repository under the scripts folder called updatetranslations.py. If you run this from the repo root, it automatically creates a translation commit. But I would advice to do this at the very end, as your changed translation strings may need to be updated.

@GeneralUser01
Copy link
Contributor Author

Regarding focus traps, these are absent in my custom text objects since they are drawn using QPainter as usual and interaction with them is handled through a mouse press event on the document, hence there are no other widgets involved. Otherwise the controls would not be integrated with the document and that would among other things require moving all of these widgets depending on scrolling. While it seems this may work since my text objects already occupy their required space in the document, I'm unsure if making widgets act as part of the document overlaying my text objects is the more efficient and/or comprehensive approach.

I suspect there are other ways to add keybindings to this, as well as support for screen readers via connected UI elements somewhere, but with interactive items in the document this would require further time to investigate and implement. How much this is of interest and required for any given feature is unclear. When it comes to text-to-speech at least, it has a limit on how long messages it reads aloud, so no image would likely pass that check.

Does running the script for creating a translation commit at the very end mean running it when all of my changes are ready for now or after a proper review?

@Hartmnt
Copy link
Member

Hartmnt commented Nov 29, 2024

(you can disregard the Windows pipeline for now as we need to fix it in master right now).

The master branch Windows CI should now be working again, so please rebase your branch against current master.

Regarding focus traps, these are absent in my custom text objects since they are drawn using QPainter as usual and interaction with them is handled through a mouse press event on the document, hence there are no other widgets involved. Otherwise the controls would not be integrated with the document and that would among other things require moving all of these widgets depending on scrolling. While it seems this may work since my text objects already occupy their required space in the document, I'm unsure if making widgets act as part of the document overlaying my text objects is the more efficient and/or comprehensive approach.

I don't really get what you are trying to communicate here. Just try to navigate through the Mumble main window with TAB and SHIFT+TAB to see, if you Gifs somehow break navigation. I might consider requiring interacting with the Gifs with the keyboard (at least play/pause). But I am not sure, yet. Maybe a setting to always/never autoplay GIFs would be enough. For now, just make sure that you can reach every existing UI element by pressing TAB and SPACE+TAB.

When it comes to text-to-speech at least, it has a limit on how long messages it reads aloud, so no image would likely pass that check.

Nah, this is not an option. The TTS should skip GIFs (and images), or say something like "(animated) image" but not attempt to read it. I will test this as soon as I have time for this PR, but I just wanted to let you know that this kind of stuff should be considered.

Does running the script for creating a translation commit at the very end mean running it when all of my changes are ready for now or after a proper review?

I would say both, because the proper review might require you to change some stuff around. This is just a suggestion though. If you feel confident in your git abilities, you might as well create the translation commit now and keep updating the previous commit. You would probably have to drop and recreate the translation commit a few times, so adding it last is always easier in my opinion.

@GeneralUser01
Copy link
Contributor Author

The master branch Windows CI should now be working again, so please rebase your branch against current master.

Okay, I will rebase my branch soon, as well as apply adjustments which among other things would resolve the CI pipeline checks.

I don't really get what you are trying to communicate here. Just try to navigate through the Mumble main window with TAB and SHIFT+TAB to see, if you Gifs somehow break navigation. I might consider requiring interacting with the Gifs with the keyboard (at least play/pause). But I am not sure, yet. Maybe a setting to always/never autoplay GIFs would be enough.

I meant to convey that GIFs are not automatically a part of the tab-navigation, so they cause no focus traps, but it is so far unspecified if they should be able to receive focus and how focus can then be supported, whether it's through more widgets or something else.
I have added keybindings to the chat now, so by having it in focus and pressing RETURN (goes forward) or BACKSPACE (goes backwards) one can select an animation (highlighted accordingly) and then do any of the following actions:

  • SPACE/K - Play/pause the animation
  • Q - Reset the animation
  • V - Show/hide video controls
  • C - Turn caching of all frames on/off
  • O - Switch to the previous loop mode
  • L - Switch to the next loop mode
  • , - Jump to the preceding frame
  • . - Jump to the next frame
  • N - Jump 5 frames backwards
  • M - Jump 5 frames forward
  • -/S - Decrease playback speed by 5%
  • +/D - Increase playback speed by 5%
  • W - Decrease playback speed by 25%
  • E - Increase playback speed by 25%
  • R - Reset playback speed

If I get into implementing any settings then turning on/off that GIFs autoplay would be the first one, but GIFs can now (once this change is pushed) be interacted with via keyboard anyway.

For now, just make sure that you can reach every existing UI element by pressing TAB and SPACE+TAB.

Yes, I have done this and I know the regular tab-navigation, where TAB goes forward, SHIFT+TAB goes backwards and both wrap around, but what is SPACE+TAB supposed to do? I tried this too but it didn't appear to do anything that had to do with the tab-navigation.

Nah, this is not an option. The TTS should skip GIFs (and images), or say something like "(animated) image" but not attempt to read it. I will test this as soon as I have time for this PR, but I just wanted to let you know that this kind of stuff should be considered.

I agree that relying on TTS settings and message length is not enough, though it makes any issues involving TTS less commonly encountered. However, the GIFs are sent the same way as the regular images and because of it both would be skipped by TTS, as well as making them go by the image message limit instead of the text message limit, therefore TTS would not be an issue with this feature. There is also the screen reader and more perhaps, so indeed the compatibility with all of the affected accessibility features should be considered.
Altough it's unrelated to this feature, the TTS tries to read purely visual content if it's ASCII art in a code-block (pre element) though it somewhat depends on the arrangement of characters. If there was a way to set conditions for what TTS attempts to read, then that would help with excluding such content from TTS completely.

I would say both, because the proper review might require you to change some stuff around. This is just a suggestion though. If you feel confident in your git abilities, you might as well create the translation commit now and keep updating the previous commit. You would probably have to drop and recreate the translation commit a few times, so adding it last is always easier in my opinion.

Sure, I agree that I might as well commit the translation now, finishing most of it if not all of it sooner rather than later. I have updated/replaced commits before so that would not be too troublesome. I will commit the translation and update it if needed.

Verified

This commit was created on GitHub.com and signed with GitHub’s verified signature. The key has expired.
The way images are handled for the `gif` file format when pasting and receiving one in the chat was here changed to support playing the animation. By default they do not play but whenever an animation is not running this is indicated by a play-icon, thereby differentiating them from still images on a glance. Whether the animation is paused is toggled by left-clicking it and reset by middle-clicking it. Saving a `gif` image file from log is also supported just as it is for other image file formats.

Additionally, one can also toggle video controls for animations via the log context menu on them. This enables the following features:

- Jumping to any point of the animation via the video bar
- Viewing the current time and total time of the animation
- Switching caching of all frames on or off
- Switching loop mode between "Unchanged", "Loop" and "No loop"
- Traversing frame-by-frame backwards or forward
- Changing and resetting the playback speed

Alternatively, animations can be controlled via keybindings as follows:

- ENTER - Select the next animation
- BACKSPACE - Select the previous animation
- CTRL+0-9 - Select animation by the index entered
- ESCAPE - Deselect animations
- SPACE/K - Play/pause
- Q - Reset playback
- V - Show/hide video controls
- C - Turn caching of all frames on/off
- O - Switch to the previous loop mode
- L - Switch to the next loop mode
- , - Jump to the preceding frame
- . - Jump to the next frame
- N - Jump 5 frames backwards
- M - Jump 5 frames forward
- H - Jump 1 second backwards
- J - Jump 1 second forward
- F - Jump 5 seconds backwards
- G - Jump 5 seconds forward
- -/S - Decrease playback speed by 5%
- +/D - Increase playback speed by 5%
- W - Decrease playback speed by 25%
- E - Increase playback speed by 25%
- R - Reset playback speed

To turn on caching most notably boosts performance when jumping to frames that are sequentially far apart due to `QMovie` only being able to switch frame in order from start to end and around when caching is off, though it usually plays fine regardless except for when playing in reverse in an animation with many frames. Reverse playback is also implemented here, so when decreasing the speed to less than zero it will play at that speed in reverse as expected, taking a speed-step that's twice as big if the speed would otherwise become zero so that the animation only pauses when it's not playing. As for loop mode, "Unchanged" is to use the in-built setting in the `gif` image file for how many times it is to repeat, whereas "Loop" and "No loop" override this behavior accordingly.

The main limitation is the character limit on text messages for images. Currently this is usually set to 128 KB, which is very small for a `gif` image file and on top of this requires the data to be 33% smaller before being sent in base64 encoding. This limit should be at least ten times larger in order to fit many `gif` image files, or this should apply to another limit specifically for animations. Other than that, the functionality for pasting images from the clipboard itself, instead of by a file path from it, could not be implemented for `gif` image files due to not being able to get compatible formatted data from the MIME data received.

Here are workarounds that can be applied to these limitations:

- Configure the server to increase or disable the limit on image message length
- Save copied `gif` images to file and then copy and paste those files

Lastly, if settings were to be added to this feature, then here are some suggestions for those settings:

- Play animations by default
- Show video controls by default
- Cache all frames by default
- Specify the default loop mode
- Specify the default playback speed
@GeneralUser01 GeneralUser01 reopened this Dec 7, 2024
Copy link
Member

@Krzmbrzl Krzmbrzl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I dislike that we are hand-wavy parsing HTML documents. This more or less guarantees that it will break at some point in the future simply because this is a drastic simplification of how HTML is structured. We already have a dependency to Poco::XML, so I would suggest using that for parsing and querying purposes.

Furthermore, I believe the code associated with the animation logic should be outsourced into a dedicated class or something like that. It is way too large to just embed into the regular logic of handling log events.

Note: This was more of a high-level review (i.e. I didn't a full read through all changes)

Comment on lines +375 to +378
/// Removes the content of the client's log and deletes the objects used by text objects.
void clearDocument();
/// Alternates between showing and hiding video controls for animated images.
void toggleVideoControls();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't these be member functions of the Log class?

if (isAnimation) {
bool areVideoControlsOn = AnimationTextObject::areVideoControlsOn;
QString actionFirstWord = areVideoControlsOn ? "Hide" : "Show";
menu->addAction(tr("%1 Video Controls").arg(actionFirstWord), this, SLOT(toggleVideoControls(void)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

New code shouldn't use the SLOT macro. Use the overload that uses a function pointer instead.

@@ -118,6 +118,8 @@ class Log : public QObject {
// versions.
static const MsgType msgOrder[];

enum TextObjectType { Animation = QTextFormat::UserObject };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
enum TextObjectType { Animation = QTextFormat::UserObject };
enum class TextObjectType { Animation = QTextFormat::UserObject };

Comment on lines +644 to +662
auto searchForAnimationHeader =
[html, htmlEndIndex](qsizetype previousImgEndIndex) -> std::tuple< bool, qsizetype, qsizetype > {
bool isCompatibleHeader;
qsizetype imgStartIndex = -1;
qsizetype base64StartIndex;
do {
imgStartIndex = html.indexOf("<img", (imgStartIndex == -1 ? previousImgEndIndex : imgStartIndex) + 1);
base64StartIndex = html.indexOf(',', imgStartIndex) + 1;
// Get the relevant part of the header through decoding the first 4
// characters where each character is 6 bits in base64 encoding
// and each decoded character is 1 byte, with 3 bytes consisting of 24 bits:
QString imageFirstThreeBytes =
base64StartIndex > 0 && htmlEndIndex > base64StartIndex + 2 ? qvariant_cast< QString >(
QByteArray::fromBase64(qvariant_cast< QByteArray >(html.sliced(base64StartIndex, 4))))
: "";
isCompatibleHeader = imageFirstThreeBytes == "GIF";
} while (!isCompatibleHeader && imgStartIndex != -1);
return std::make_tuple(isCompatibleHeader, imgStartIndex, base64StartIndex);
};
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe there are some bound checks missing (e.g. the while loop doesn't check whether the html might be empty)

Comment on lines +101 to +106
static const int videoBarHeight = 4;
static const int underVideoBarHeight = 20;
static const int cacheOffsetX = -170;
static const int loopModeOffsetX = -130;
static const int frameTraversalOffsetX = -90;
static const int speedOffsetX = -30;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hardcoding these values will likely make the UI break with certain display configurations (i.e. different resolution, different scaling factors, whatnot)

Comment on lines +416 to +417
default:
return "Undefined";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of using a default clause, do the return after the switch. This case, compilers will warn if at some point in the future there is a new mode that is not yet handled in the switch.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client feature-request This issue or PR deals with a new feature ui
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants