Add event buffering for cloaking user input patterns #149

ArrayBolt3 · 2024-10-01T05:52:33Z

Goal

Implement the functionality of kloak (a tool designed to hide biometric behavior patterns in keystrokes and mouse movements) in qubes-gui-daemon. This PR will implement the functionality requested in QubesOS/qubes-issues#1850 and fleshed out further in QubesOS/qubes-issues#8541. It will also close QubesOS/qubes-issues#8534 as it will no longer be necessary.

TODOs

Test rigorously on Qubes R4.3 (an earlier iteration of the code was smoke-tested on Qubes R4.2; the current code hasn't been tested at all on R4.3 yet)

Fixed TODOs:

Figure out why the domU window occasionally freezes until another input event is sent - we aren't buffering info coming from domU to dom0 so why this is happening is a mystery to me, and something for later investigation. (Solved, Add event buffering for cloaking user input patterns #149 (comment))
Potentially change how events are treated (do some events have to operate in pairs for best results?). (Lots of X events are now not buffered in the latest implementation. Only ones that look valuable to buffer are buffered.)
Make the delay duration user-configurable (right now it's hardcoded to 150 milliseconds). (Implemented.)
Allow configuring event delay duration for individual VMs buffering (right now it is applied equally to all VMs). (Implemented.)
Get the configuration code working and test it. (Solved, this ended up requiring a change to core-admin-client which I will be submitting as a separate PR.)
Ensure all new code adheres to Qubes OS standards (didn't have time to finish that up) (should be done now)

Rationale

Kloak, the inspiration for this PR, is a user input buffering and obfuscation tool. It intercepts keyboard and mouse events at the evdev layer, holds them in a queue for release at a later scheduled time, then releases them to the applications they were intended for periodically. By adding random noise into the user's input patterns, kloak aims to make otherwise recognizable patterns in user behavior (such as keystroke rhythm and mouse movement patterns) too erratic to be used as a method of identifying the user. This is potentially very useful especially for Whonix Workstation domUs, as it denies an adversary access to a remarkably effective biometric fingerprinting mechanism they could otherwise access without specialized tools.

Kloak is currently able to operate directly in Qubes domUs if (and only if!) gui-agent-virtual-input-device is enabled for the domU in question. Even in these instances, only keyboard events are anonymized, and additionally the domU must have an evdev X driver installed. This is less than ideal from a functionality standpoint, and as @DemiMarie has explained in QubesOS/qubes-issues#8541 it will eventually stop working entirely. There's also the possibility of malware compromise in the domU resulting in the deanonymization of the user. For these reasons, enabling the use of evdev in domUs and running kloak in the domU is not a good solution.

The other obvious option is to run kloak directly in dom0. This has several disadvantages:

kloak can now potentially wreak havoc on the user's ability to use their computer. If a bug in kloak locks up the keyboard, or the user does something inadvisable like setting a 20-second event delay, regaining control of the system could be difficult or impossible without doing a hard reset (or worse, booting an external USB in order to chroot into dom0 and disable kloak).
Application of kloak's functionality becomes all-or-nothing - you either anonymize all keyboard and mouse input everywhere, or you anonymize none of it. This could make management of dom0 annoying with larger delay times, and it could prevent the user from making use of applications or websites that require input pattern telemetry to function (such as some bank websites).
kloak's configuration options similarly apply globally. One might want a comfortable delay of only 25ms in a domU they expect to be safe, but wish to use an extremely long one like 1000ms in a domU they believe is compromised and actively exfiltrating data. With kloak running directly in dom0, this is impossible.

This PR implements a third option - inserting the functionality of kloak directly into qubes-gui-daemon. kloak upstream never needs to be involved; only its functionality is needed. I have termed this functionality "event buffering", or "ebuf" for short (which is the term used for it in the code). Previously I had called this "X event buffering" and used "xbuf" for short, but as @3hhh pointed out, that name would become inaccurate when this is ported to Wayland, so I changed it to "ebuf" to keep the name display-server agnostic.

By working inside the GUI daemon, the following advantages are gained:

No evdev support needed at all, we can work with X events instead.
The amount of additional code needed is smaller.
Per-VM application and configuration of event buffering is now possible - some VMs can use a small, comfortable delay, others can use a very long one, and others can skip delays entirely.
Even if something goes very wrong and buffering prevents the user from inputting anything into any Qube, the user retains control of dom0 and can recover their system from there.
A compromised domU with event buffering enabled will be less likely to leak valuable biometric info to the malware within the VM.

How it works

Most of the code should be fairly self-explanatory. In a nutshell, we use a tail queue to store a list of delayed, scheduled X events. As events come from dom0 to a domU, they are captured, scheduled for release at a later time, and thrown into the queue. Events in the queue are regularly checked to see if their scheduled release time has arrived, and those events are released when appropriate. The scheduler inserts some random noise into the delays, making it difficult to uniquely identify the user's typing and mouse movement/usage patterns.
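The queueing described above can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: the entry layout, the function names ebuf_push and ebuf_release_due, and the use of a plain int in place of an X event are all assumptions made for the example. It also assumes release times are non-decreasing in queue order, so checking only the head is enough.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical entry type: an int stands in for the buffered X event. */
struct ebuf_entry {
    int64_t release_time_ms;          /* earliest time the event may be delivered */
    int event_id;                     /* placeholder for the event payload */
    TAILQ_ENTRY(ebuf_entry) entries;  /* tail queue linkage */
};

TAILQ_HEAD(ebuf_head, ebuf_entry);

/* Append an event whose release time has already been chosen.
 * Returns 0 on success, -1 if allocation fails. */
static int ebuf_push(struct ebuf_head *q, int event_id, int64_t release_time_ms)
{
    struct ebuf_entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->event_id = event_id;
    e->release_time_ms = release_time_ms;
    TAILQ_INSERT_TAIL(q, e, entries);
    return 0;
}

/* Release events from the head whose time has arrived, stopping at the
 * first one still pending so that delivery order is preserved.
 * Writes released ids to out and returns how many were released. */
static size_t ebuf_release_due(struct ebuf_head *q, int64_t now_ms,
                               int *out, size_t out_len)
{
    size_t n = 0;
    struct ebuf_entry *e;

    assert(out != NULL);
    while (n < out_len && (e = TAILQ_FIRST(q)) != NULL
           && e->release_time_ms <= now_ms) {
        out[n++] = e->event_id;
        TAILQ_REMOVE(q, e, entries);
        free(e);
    }
    return n;
}
```

A caller would run ebuf_release_due from the main loop each time it wakes up, passing the current monotonic time in milliseconds.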

By default, event buffering is disabled and all events are passed through without buffering. To enable it, one must use qvm-features to set gui-ebuf-max-delay to a value greater than 0. It is worth noting that 0 is interpreted not as "don't add any delay when buffering events", but as "don't buffer events at all". This configuration feature does not work without the ebuf_max_delay setting being added to the list of GUI daemon configuration settings in qubes-core-admin-client. The pull request for that is at QubesOS/qubes-core-admin-client#309.

This PR needs more testing (especially on Qubes R4.3), but it is solid enough that I feel comfortable asking for a review on it. Thanks for your help!

DemiMarie · 2024-10-01T19:06:01Z

Please use getrandom() instead of ISAAC.

ArrayBolt3 · 2024-10-01T19:12:53Z

@DemiMarie I can do that, but that will drain the system's entropy sources at a fairly constant rate (every mouse movement, keystroke, etc. will result in entropy drain). Theoretically that could weaken any automatically generated encryption keys used for things like HTTPS and the like. Using ISAAC requires only a single initial use of system entropy.

If entropy drain isn't a concern, then I'm happy to swap it out.

ArrayBolt3 · 2024-10-01T21:53:40Z

ISAAC removed, getrandom()-based delay mechanism implemented.

I also tracked down the source of the GUI freeze bug - turned out to be because of gui-common/txrx-vchan.c:wait_for_vchan_or_argfd.

int wait_for_vchan_or_argfd(libvchan_t *vchan, int fd) 
{
    int ret;
    while ((ret=wait_for_vchan_or_argfd_once(vchan, fd)) == 0);
    return ret;
}

This was apparently busy-waiting for something to happen and thus keeping queued events from ever being released until the user did something like press a key or move the mouse. I used a hack to make wait_for_vchan_or_argfd non-blocking, but I'm not totally sure that's going to be acceptable in the long run since this will probably significantly increase the CPU usage of qubes-guid. If this is an acceptable solution, wait_for_vchan_or_argfd should just be removed and the underlying function wait_for_vchan_or_argfd_once should be made public so that xside.c can use it directly.
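One alternative to a fully non-blocking loop, sketched here only as an idea and not as what this PR does, is to keep blocking but bound the wait by the time remaining until the next queued event is due. The helper names ebuf_poll_timeout and wait_fd_or_deadline are hypothetical, and a plain file descriptor stands in for the vchan and X descriptors.

```c
#include <assert.h>
#include <limits.h>
#include <poll.h>
#include <stdint.h>

/* Timeout for poll(): milliseconds until the next queued event is due.
 * A negative next_release_ms is used here to mean "queue is empty",
 * in which case -1 tells poll() to block until an fd is ready. */
static int ebuf_poll_timeout(int64_t now_ms, int64_t next_release_ms)
{
    int64_t diff;

    assert(now_ms >= 0);
    if (next_release_ms < 0)
        return -1;
    if (next_release_ms <= now_ms)
        return 0;                     /* already due: do not sleep */
    diff = next_release_ms - now_ms;
    return diff > INT_MAX ? INT_MAX : (int)diff;
}

/* Wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
static int wait_fd_or_deadline(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    return ret > 0 ? 1 : 0;
}
```

With this shape the daemon sleeps when nothing is queued and wakes exactly when the head of the queue becomes due, which avoids the extra CPU usage of polling in a tight loop.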

My commit also attempts to implement configuration support - I don't see any reason why the code shouldn't work, but I haven't yet gotten it to work on my machine as I seem to be having trouble setting a custom VM feature properly.

DemiMarie · 2024-10-02T23:42:31Z

> @DemiMarie I can do that, but that will drain the system's entropy sources at a fairly constant rate (every mouse movement, keystroke, etc. will result in entropy drain). Theoretically that could weaken any automatically generated encryption keys used for things like HTTPS and the like. Using ISAAC requires only a single initial use of system entropy.
>
> If entropy drain isn't a concern, then I'm happy to swap it out.

Entropy drain doesn’t actually exist. The entropy obtained from the getrandom() syscall cannot be used to derive the internal state of the Linux CSPRNG.

ArrayBolt3 · 2024-10-04T22:53:36Z

I believe this is now ready for review.

ArrayBolt3 · 2024-10-04T23:08:05Z

Pull request for configuring X event buffering: QubesOS/qubes-core-admin-client#309

3hhh · 2024-10-12T15:43:57Z

It's nice to see you work on this @ArrayBolt3, much appreciated!

I also believe that this may be relatively easy to port to Wayland.
@DemiMarie: What do you think?

DemiMarie · 2024-10-15T23:55:21Z

For Wayland the tricky part is that the protocol is written assuming that events are dispatched in-order. Buffering events therefore creates a risk of head-of-line blocking of events that really should not be delayed, like buffer release events indicating that memory can safely be reused. Avoiding this requires reordering events, but that requires effort to ensure no bugs are introduced.

marmarek · 2024-10-16T00:23:23Z

> For Wayland the tricky part is that the protocol is written assuming that events are dispatched in-order. Buffering events therefore creates a risk of head-of-line blocking of events that really should not be delayed, like buffer release events indicating that memory can safely be reused. Avoiding this requires reordering events, but that requires effort to ensure no bugs are introduced.

Reordering is problematic for the X11 version too (see earlier comments). Which events are problematic if delayed by a few hundred milliseconds? Will such a delay of buffer release events cause practical issues beyond possibly requiring a bit more memory on the VM side?

ArrayBolt3 · 2024-10-16T00:34:34Z

I wouldn't expect delay to be a problem for X11 since, as I mentioned in one of the earlier comments, the X protocol is designed to operate over a network (for instance SSH with X tunneling). Networks incur a decent amount of latency. That latency causes delays similar to the ones this PR artificially introduces.

Wayland is a concern, however in practice waypipe exists and is used to provide network transparency to Wayland in a fashion similar to X11, and it appears to work well from what I've heard. That would cause the same latency and delay there, so if it's not a problem there it probably won't be a problem here.

ArrayBolt3 · 2024-10-16T16:16:06Z

Next iteration ready for review. Smoke-tested on Qubes OS R4.3, all requested changes implemented.

ArrayBolt3 · 2024-10-16T16:21:30Z

Had to force-push again because I forgot to update the commit message.

marmarek · 2024-10-18T00:53:10Z

ebuf in code is fine, but as a setting name it is rather cryptic. What about "events_delay_max" (or "events_max_delay"?) for the setting name? Or maybe something hinting at the purpose of the delay, like "input_cloak_max_delay"?

ArrayBolt3 · 2024-10-18T01:17:52Z

events_max_delay would work for me, I'll integrate that.

The user may wish to prevent biometric information about their mouse and keyboard patterns from leaking into certain Qubes. To make this possible, this feature inserts random noise into the delivery timing of all X events, making it more difficult to distinguish the user from other X event buffering users. The maximum delay in event delivery is user-configurable through the "events_max_delay" configuration option.

ArrayBolt3 · 2024-10-20T05:44:37Z

Switched to using events_max_delay terminology, also rebased onto the tip of main.

HW42

> By working inside the GUI daemon, the following advantages are gained:
>
> No evdev support needed at all, we can work with X events instead.

Getting rid of having multiple input paths that need to be tested would be nice.

> A compromised domU with event buffering enabled will be less likely to leak valuable biometric info to the malware within the VM.

If someone compromised a VM enough to get to the raw event stream, they already have a lot of fingerprinting options. So I'm not sure how much of an advantage that really is.

At the same time this approach has a significant cost. It makes the security critical side of the gui handling more complex. To be fair the implementation is not that complicated.

HW42 · 2024-10-23T21:28:41Z

gui-daemon/xside.c


    if (!ghandles.nofork) {
        // daemonize...
        if (pipe(pipe_notify) < 0) {
-            perror("canot create pipe:");
+            perror("cannot create pipe:");


Nitpick: Including such an unrelated typo fix in a PR is OK, but it belongs in a separate commit.

HW42 · 2024-10-23T21:29:28Z

gui-daemon/xside.c

@@ -4237,11 +4333,23 @@ static void parse_vm_config(Ghandles * g, config_setting_t * group)
            g->disable_override_redirect = 0;
        else {
            fprintf(stderr,
-                    "unsupported value ‘%s’ for override_redirect (must be ‘disabled’ or ‘allow’\n",
+                    "unsupported value '%s' for override_redirect (must be 'disabled' or 'allow')\n",


Nitpick: Including such an unrelated small change in a PR is OK, but it belongs in a separate commit.

HW42 · 2024-10-23T21:31:00Z

gui-daemon/xside.c

+}
+
+/* get random delay value */
+static uint32_t ebuf_random_delay(Ghandles * g, uint32_t lower_bound)


Given that the function is otherwise self-contained, I think it would be better to pass the upper bound instead of the full Ghandles.

HW42 · 2024-10-23T22:38:33Z

gui-daemon/xside.c

+    }
+
+    maxval = g->ebuf_max_delay - lower_bound + 1;
+


Nitpick: trailing whitespace (for some reason GitHub's web UI doesn't show it, but it's in commit 541c10d?)

HW42 · 2024-10-24T00:16:38Z

gui-daemon/xside.c

+        perror("Could not allocate ebuf_entry:");
+        exit(1);
+    }
+    if (current_time > 0 && random_delay > (LONG_MAX - current_time)) {


Suggested change

-    if (current_time > 0 && random_delay > (LONG_MAX - current_time)) {
+    if (current_time > 0 && random_delay > (INT64_MAX - current_time)) {
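The point of the suggestion, as far as I can tell, is that the limit in the overflow guard should match the width of the timestamp: long can be narrower than 64 bits on some ABIs, while INT64_MAX is correct wherever the time is stored as int64_t. A minimal sketch, assuming current_time is an int64_t in milliseconds and using a hypothetical helper name, with saturation as one possible way to handle the overflow case:

```c
#include <assert.h>
#include <stdint.h>

/* Add a delay to a timestamp, saturating at INT64_MAX instead of
 * overflowing. The guard compares against INT64_MAX because the sum is
 * computed in int64_t, regardless of how wide long is on this platform. */
static int64_t ebuf_release_time(int64_t current_time_ms, uint32_t random_delay_ms)
{
    if (current_time_ms > 0 &&
        (int64_t)random_delay_ms > INT64_MAX - current_time_ms)
        return INT64_MAX;
    return current_time_ms + (int64_t)random_delay_ms;
}
```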

HW42 · 2024-10-24T00:47:22Z

gui-daemon/xside.c

+
+    if (lower_bound >= g->ebuf_max_delay) {
+        fprintf(stderr,
+                "Bug detected - lower_bound >= g->ebuf_max_delay, events may get briefly stuck");


This is reachable if two events arrive within the same millisecond and ebuf_random_delay happens to generate ebuf_max_delay for the first event.

HW42 · 2024-10-24T00:48:31Z

gui-daemon/xside.c

+    uint32_t lower_bound;
+
+    current_time = ebuf_current_time_ms();
+    lower_bound = min(max(g->ebuf_prev_release_time - current_time, 0), g->ebuf_max_delay);


This deserves some explanation. Why do you make the random delay dependent on the previous random delay value?
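One plausible reading, which is not confirmed anywhere in this thread, is that the lower bound exists to preserve ordering: if the new delay is drawn from [lower_bound, max_delay], the new event cannot be scheduled before the previous one is released. The sketch below shows that clamp with signed 64-bit arithmetic so that a release time already in the past yields 0 rather than wrapping around. The function name and parameter types are assumptions.

```c
#include <assert.h>
#include <stdint.h>

/* Smallest delay that keeps the new event from overtaking the previous
 * one: the time left until the previous release, clamped to
 * [0, max_delay_ms]. The subtraction is done in int64_t so a previous
 * release time that has already passed gives a negative value, not a
 * huge unsigned one. */
static uint32_t ebuf_lower_bound(int64_t prev_release_ms, int64_t now_ms,
                                 uint32_t max_delay_ms)
{
    int64_t remaining;

    assert(prev_release_ms >= 0 && now_ms >= 0);
    remaining = prev_release_ms - now_ms;
    if (remaining < 0)
        remaining = 0;
    if (remaining > (int64_t)max_delay_ms)
        remaining = (int64_t)max_delay_ms;
    return (uint32_t)remaining;
}
```

For example, with a previous release scheduled 100 ms from now and a 150 ms maximum, the new delay would be drawn from [100, 150].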

HW42 · 2024-10-24T00:52:23Z

gui-daemon/xside.c

+        randsize = getrandom(ebuf_rand_data.raw, sizeof(uint32_t), 0);
+        if (randsize != sizeof(uint32_t))
+            continue;
+    } while (ebuf_rand_data.val >= UINT32_MAX - (UINT32_MAX % maxval));


I think this check is correct. But it unnecessarily throws away values when 2**32 is a multiple of maxval. That was a bit confusing during review.
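For comparison, here is a sketch of a rejection-sampling variant that discards nothing when 2**32 is a multiple of maxval. It rejects the lowest (2**32 mod maxval) values instead of a range at the top. The function names are hypothetical and this is not the code under review; it also uses a for loop so that a failed getrandom() call never reaches the acceptance test with a stale value.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <sys/random.h>

/* Number of low values to reject: 2**32 mod maxval, computed in 32-bit
 * unsigned arithmetic as (2**32 - maxval) mod maxval. It is 0 when
 * maxval divides 2**32, so nothing is thrown away in that case. */
static uint32_t reject_below(uint32_t maxval)
{
    assert(maxval > 0);
    return (uint32_t)(-maxval) % maxval;
}

/* Store a uniformly distributed value in [0, maxval) in *out.
 * Returns 0 on success, -1 if getrandom() fails. */
static int uniform_u32(uint32_t maxval, uint32_t *out)
{
    uint32_t val;
    uint32_t threshold = reject_below(maxval);

    for (;;) {
        ssize_t n = getrandom(&val, sizeof(val), 0);
        if (n < 0 && errno == EINTR)
            continue;                 /* interrupted: retry the read */
        if (n != (ssize_t)sizeof(val))
            return -1;
        if (val >= threshold) {       /* accepted range is a multiple of maxval */
            *out = val % maxval;
            return 0;
        }
    }
}
```

A delay in [lower_bound, max_delay] would then be lower_bound plus a value drawn with maxval = max_delay - lower_bound + 1.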

ArrayBolt3 marked this pull request as draft October 1, 2024 05:53

adrelanos mentioned this pull request Oct 1, 2024

Kloak in dom0 QubesOS/qubes-issues#8541

Open

ArrayBolt3 force-pushed the main branch from 2ea0579 to 41789ec Compare October 1, 2024 14:16

ArrayBolt3 force-pushed the main branch from 455a406 to 1510ac9 Compare October 4, 2024 22:47

ArrayBolt3 marked this pull request as ready for review October 4, 2024 22:53

ArrayBolt3 mentioned this pull request Oct 4, 2024

Add events_max_delay setting for qubes-gui-daemon QubesOS/qubes-core-admin-client#309

Open

ArrayBolt3 mentioned this pull request Oct 8, 2024

Security review of tirdad source code 0xsirus/tirdad#23

Closed

marmarek requested changes Oct 10, 2024

ArrayBolt3 force-pushed the main branch from 1510ac9 to f9cf8dd Compare October 11, 2024 21:36

marmarek added openqa-pending and removed openqa-pending labels Oct 13, 2024

ArrayBolt3 force-pushed the main branch from f9cf8dd to daa9623 Compare October 15, 2024 23:18

ArrayBolt3 requested a review from marmarek October 15, 2024 23:31

DemiMarie reviewed Oct 16, 2024

3hhh reviewed Oct 16, 2024

ArrayBolt3 force-pushed the main branch from daa9623 to edd9f2b Compare October 16, 2024 16:15

ArrayBolt3 changed the title ~~Add X event buffering for cloaking user input patterns~~ Add event buffering for cloaking user input patterns Oct 16, 2024

ArrayBolt3 force-pushed the main branch from edd9f2b to 34f9787 Compare October 16, 2024 16:21

ArrayBolt3 force-pushed the main branch from 34f9787 to cd9e482 Compare October 20, 2024 05:40

ArrayBolt3 force-pushed the main branch from cd9e482 to 541c10d Compare October 20, 2024 05:44

HW42 reviewed Oct 24, 2024
