Add event buffering for cloaking user input patterns #149

Open
wants to merge 1 commit into base: main

@ArrayBolt3 ArrayBolt3 commented Oct 1, 2024

Goal

Implement the functionality of kloak (a tool designed to hide biometric behavior patterns in keystrokes and mouse movements) in qubes-gui-daemon. This PR will implement the functionality requested in QubesOS/qubes-issues#1850 and fleshed out further in QubesOS/qubes-issues#8541. It will also close QubesOS/qubes-issues#8534 as it will no longer be necessary.

TODOs

  • Test rigorously on Qubes R4.3 (an earlier iteration of the code has been smoke-tested on Qubes R4.2, this hasn't been tested at all on R4.3 yet)

Fixed TODOs:

  • Figure out why the domU window occasionally freezes until another input event is sent - we aren't buffering info coming from domU to dom0 so why this is happening is a mystery to me, and something for later investigation. (Solved, see #149 (comment))
  • Potentially change how events are treated (do some events have to operate in pairs for best results?). (Lots of X events are now not buffered in the latest implementation. Only ones that look valuable to buffer are buffered.)
  • Make the delay duration user-configurable (right now it's hardcoded to 150 milliseconds). (Implemented.)
  • Allow configuring event delay duration for individual VMs (right now it is applied equally to all VMs). (Implemented.)
  • Get the configuration code working and test it. (Solved, this ended up requiring a change to core-admin-client which I will be submitting as a separate PR.)
  • Ensure all new code adheres to Qubes OS standards (didn't have time to finish that up) (should be done now)

Rationale

Kloak, the inspiration for this PR, is a user input buffering and obfuscation tool. It intercepts keyboard and mouse events at the evdev layer, holds them in a queue, and releases each one to the application it was intended for at a later scheduled time. By adding random noise to the timing of the user's input, kloak aims to make otherwise recognizable patterns in user behavior (such as keystroke rhythm and mouse movement patterns) too erratic to be used as a method of identifying the user. This is potentially very useful, especially for Whonix Workstation domUs, as it denies an adversary access to a remarkably effective biometric fingerprinting mechanism they could otherwise use without specialized tools.

Kloak is currently able to operate directly in Qubes domUs if (and only if!) gui-agent-virtual-input-device is enabled for the domU in question. Even in these instances, only keyboard events are anonymized, and additionally the domU must have an evdev X driver installed. This is less than ideal from a functionality standpoint, and as @DemiMarie has explained in QubesOS/qubes-issues#8541 it will eventually stop working entirely. There's also the possibility of malware compromise in the domU resulting in the deanonymization of the user. For these reasons, enabling the use of evdev in domUs and running kloak in the domU is not a good solution.

The other obvious option is to run kloak directly in dom0. This has several disadvantages:

  • kloak can now potentially wreak havoc on the user's ability to use their computer. If a bug in kloak locks up the keyboard, or the user does something inadvisable like setting a 20-second event delay, regaining control of the system could be difficult or impossible without doing a hard reset (or worse, booting an external USB in order to chroot into dom0 and disable kloak).
  • Application of kloak's functionality becomes all-or-nothing - you either anonymize all keyboard and mouse input everywhere, or you anonymize none of it. This could make management of dom0 annoying with larger delay times, and it could prevent the user from making use of applications or websites that require input pattern telemetry to function (such as some bank websites).
  • kloak's configuration options similarly apply globally. One might want a comfortable delay of only 25ms in a domU they expect to be safe, but wish to use an extremely long one like 1000ms in a domU they believe is compromised and actively exfiltrating data. With kloak running directly in dom0, this is impossible.

This PR implements a third option - inserting the functionality of kloak directly into qubes-gui-daemon. kloak upstream never needs to be involved; only its functionality is needed. I have termed this functionality "event buffering", or "ebuf" for short (which is the term used for it in the code). Previously I had called this "X event buffering" and used "xbuf" for short, since this implementation works with X server events, but as @3hhh pointed out, that name would become inaccurate when this is ported to Wayland, so I changed it to "ebuf" to make the name display-server agnostic.

By working inside the GUI daemon, the following advantages are gained:

  • No evdev support needed at all, we can work with X events instead.
  • The amount of additional code needed is smaller.
  • Per-VM application and configuration of event buffering is now possible - some VMs can use a small, comfortable delay, others can use a very long one, and others can skip delays entirely.
  • Even if something goes very wrong and buffering prevents the user from inputting anything into any Qube, the user retains control of dom0 and can recover their system from there.
  • A compromised domU with event buffering enabled will be less likely to leak valuable biometric info to the malware within the VM.

How it works

Most of the code should be fairly self-explanatory. In a nutshell, we use a tail queue to store a list of delayed, scheduled X events. As events come from dom0 to a domU, they are captured, scheduled for release at a later time, and thrown into the queue. Events in the queue are regularly checked to see if their scheduled release time has arrived, and those events are released when appropriate. The scheduler inserts some random noise into the delays, making it difficult to uniquely identify the user's typing and mouse movement/usage patterns.
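
The queueing described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the layout of struct ebuf_entry, the event_id field standing in for a buffered X event, and the function names ebuf_enqueue and ebuf_release_due are all hypothetical. It also assumes that release times are non-decreasing in queue order, so checking only the head of the queue is enough.

```c
#include <stdint.h>
#include <stdlib.h>
#include <sys/queue.h>

/* Hypothetical entry type: the real code would store an X event here. */
struct ebuf_entry {
    int64_t release_time_ms;          /* earliest time the event may be delivered */
    int event_id;                     /* stand-in for the buffered X event */
    TAILQ_ENTRY(ebuf_entry) entries;  /* tail queue linkage */
};

TAILQ_HEAD(ebuf_head, ebuf_entry);

/* Append an event whose release time has already been chosen.
 * Returns 0 on success, -1 if allocation fails. */
int ebuf_enqueue(struct ebuf_head *q, int event_id, int64_t release_time_ms)
{
    struct ebuf_entry *e = malloc(sizeof(*e));
    if (e == NULL)
        return -1;
    e->event_id = event_id;
    e->release_time_ms = release_time_ms;
    TAILQ_INSERT_TAIL(q, e, entries);
    return 0;
}

/* Deliver every event at the head of the queue whose release time has
 * arrived. Assumes release times never decrease from head to tail.
 * deliver may be NULL, in which case events are simply dropped.
 * Returns the number of events released. */
int ebuf_release_due(struct ebuf_head *q, int64_t now_ms,
                     void (*deliver)(int event_id))
{
    int released = 0;
    struct ebuf_entry *e;

    while ((e = TAILQ_FIRST(q)) != NULL && e->release_time_ms <= now_ms) {
        TAILQ_REMOVE(q, e, entries);
        if (deliver != NULL)
            deliver(e->event_id);
        free(e);
        released++;
    }
    return released;
}
```

A caller would initialise the queue with TAILQ_INIT, call ebuf_enqueue for each incoming event, and call ebuf_release_due from the main loop with the current time in milliseconds.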

By default, event buffering is disabled and all events are passed through without buffering. To enable it, one must use qvm-features to set gui-ebuf-max-delay to a value greater than 0. It is worth noting that 0 is interpreted not as "don't add any delay when buffering events", but rather as "don't buffer events at all". This configuration feature does not work without the ebuf_max_delay setting being added to the list of GUI daemon configuration settings in qubes-core-admin-client. The pull request for that is at QubesOS/qubes-core-admin-client#309.

This PR needs more testing (especially on Qubes R4.3), but it is solid enough that I feel comfortable asking for a review on it. Thanks for your help!

@DemiMarie
Contributor

Please use getrandom() instead of ISAAC.

@ArrayBolt3
Author

@DemiMarie I can do that, but that will drain the system's entropy sources at a fairly constant rate (every mouse movement, keystroke, etc. will result in entropy drain). Theoretically that could weaken any automatically generated encryption keys used for things like HTTPS and the like. Using ISAAC requires only a single initial use of system entropy.

If entropy drain isn't a concern, then I'm happy to swap it out.

@ArrayBolt3
Author

ISAAC removed, getrandom()-based delay mechanism implemented.

I also tracked down the source of the GUI freeze bug - turned out to be because of gui-common/txrx-vchan.c:wait_for_vchan_or_argfd.

int wait_for_vchan_or_argfd(libvchan_t *vchan, int fd) 
{
    int ret;
    while ((ret=wait_for_vchan_or_argfd_once(vchan, fd)) == 0);
    return ret;
}

This was apparently busy-waiting for something to happen and thus keeping queued events from ever being released until the user did something like press a key or move the mouse. I used a hack to make wait_for_vchan_or_argfd non-blocking, but I'm not totally sure that's going to be acceptable in the long run since this will probably significantly increase the CPU usage of qubes-guid. If this is an acceptable solution, wait_for_vchan_or_argfd should just be removed and the underlying function wait_for_vchan_or_argfd_once should be made public so that xside.c can use it directly.
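
One possible middle ground between blocking indefinitely and polling in a tight loop, offered here only as a sketch and not as what this PR does, is to bound the wait by the time remaining until the next queued event is due. The helper names ebuf_poll_timeout_ms and wait_fd_or_deadline are hypothetical, the vchan side is not modeled at all, and timestamps are assumed to be non-negative millisecond counts.

```c
#include <limits.h>
#include <poll.h>
#include <stdint.h>

/* Compute a poll() timeout in milliseconds.
 * -1 means "block until the fd is ready" (nothing is queued),
 *  0 means "do not block" (an event is already due). */
int ebuf_poll_timeout_ms(int have_queued, int64_t next_release_ms,
                         int64_t now_ms)
{
    int64_t remaining;

    if (!have_queued)
        return -1;
    if (next_release_ms <= now_ms)
        return 0;
    remaining = next_release_ms - now_ms;
    return remaining > INT_MAX ? INT_MAX : (int)remaining;
}

/* Wait until fd is readable or the timeout expires.
 * Returns 1 if readable, 0 on timeout, -1 on error. */
int wait_fd_or_deadline(int fd, int timeout_ms)
{
    struct pollfd pfd = { .fd = fd, .events = POLLIN, .revents = 0 };
    int ret = poll(&pfd, 1, timeout_ms);

    if (ret < 0)
        return -1;
    return ret > 0 ? 1 : 0;
}
```

With this shape the process sleeps while the queue is empty and wakes no later than the earliest scheduled release, so queued events are not held until the next user input.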

My commit also attempts to implement configuration support - I don't see any reason why the code shouldn't work, but I haven't yet gotten it to work on my machine as I seem to be having trouble setting a custom VM feature properly.

@DemiMarie
Contributor

@DemiMarie I can do that, but that will drain the system's entropy sources at a fairly constant rate (every mouse movement, keystroke, etc. will result in entropy drain). Theoretically that could weaken any automatically generated encryption keys used for things like HTTPS and the like. Using ISAAC requires only a single initial use of system entropy.

If entropy drain isn't a concern, then I'm happy to swap it out.

Entropy drain doesn’t actually exist. The entropy obtained from the getrandom() syscall cannot be used to derive the internal state of the Linux CSPRNG.

@ArrayBolt3
Author

I believe this is now ready for review.

@ArrayBolt3
Author

Pull request for configuring X event buffering: QubesOS/qubes-core-admin-client#309

@3hhh

3hhh commented Oct 12, 2024

It's nice to see you work on this @ArrayBolt3, much appreciated!

I also believe that this may be relatively easy to port to Wayland.
@DemiMarie: What do you think?

@DemiMarie
Contributor

For Wayland the tricky part is that the protocol is written assuming that events are dispatched in-order. Buffering events therefore creates a risk of head-of-line blocking of events that really should not be delayed, like buffer release events indicating that memory can safely be reused. Avoiding this requires reordering events, but that requires effort to ensure no bugs are introduced.

@marmarek
Member

For Wayland the tricky part is that the protocol is written assuming that events are dispatched in-order. Buffering events therefore creates a risk of head-of-line blocking of events that really should not be delayed, like buffer release events indicating that memory can safely be reused. Avoiding this requires reordering events, but that requires effort to ensure no bugs are introduced.

Reordering is problematic for the X11 version too (see earlier comments). Which events are problematic if delayed by a few hundred milliseconds? Will such a delay of buffer release events cause practical issues beyond possibly requiring a bit more memory on the VM side?

@ArrayBolt3
Author

I wouldn't expect delay to be a problem for X11 since, as I mentioned in one of the earlier comments, the X protocol is designed to operate over a network (for instance SSH with X tunneling). Networks incur a decent amount of latency. That latency causes delays similar to the ones this PR artificially introduces.

Wayland is a concern; however, in practice waypipe exists and is used to provide network transparency to Wayland in a fashion similar to X11, and it appears to work well from what I've heard. That would cause the same latency and delay there, so if it's not a problem there it probably won't be a problem here.

@ArrayBolt3
Author

Next iteration ready for review. Smoke-tested on Qubes OS R4.3, all requested changes implemented.

@ArrayBolt3 ArrayBolt3 changed the title from "Add X event buffering for cloaking user input patterns" to "Add event buffering for cloaking user input patterns" Oct 16, 2024
@ArrayBolt3
Author

Had to force-push again because I forgot to update the commit message.

@marmarek
Member

ebuf in code is fine, but for the setting name it is rather cryptic. What about "events_delay_max" (or "events_max_delay"?) for the setting name? Or maybe something hinting at the purpose of the delay, like "input_cloak_max_delay"?

@ArrayBolt3
Author

events_max_delay would work for me, I'll integrate that.

The user may wish to prevent biometric information about their mouse
and keyboard patterns from leaking into certain Qubes. To make this
possible, this feature inserts random noise into the delivery timing of
all X events, making it more difficult to distinguish the user from
other X event buffering users. The maximum delay in event delivery is
user-configurable through the "events_max_delay" configuration option.
@ArrayBolt3
Author

Switched to using events_max_delay terminology, also rebased onto the tip of main.

Contributor

@HW42 HW42 left a comment

By working inside the GUI daemon, the following advantages are gained:

  • No evdev support needed at all, we can work with X events instead.

Getting rid of having multiple input paths that need to be tested would be nice.

  • A compromised domU with event buffering enabled will be less likely to leak valuable biometric info to the malware within the VM.

If someone compromised a VM enough to get to the raw event stream, they already have a lot of fingerprinting options. So I'm not sure how much of an advantage that really is.

At the same time, this approach has a significant cost. It makes the security-critical side of the GUI handling more complex. To be fair, the implementation is not that complicated.


if (!ghandles.nofork) {
// daemonize...
if (pipe(pipe_notify) < 0) {
-perror("canot create pipe:");
+perror("cannot create pipe:");
Contributor

Nitpick: Including such an unrelated typo fix in a PR is ok, but it belongs in a separate commit.

@@ -4237,11 +4333,23 @@ static void parse_vm_config(Ghandles * g, config_setting_t * group)
g->disable_override_redirect = 0;
else {
fprintf(stderr,
-"unsupported value ‘%s’ for override_redirect (must be disabled or allow\n",
+"unsupported value '%s' for override_redirect (must be 'disabled' or 'allow')\n",
Contributor

Nitpick: Including such an unrelated small change in a PR is ok, but it belongs in a separate commit.

}

/* get random delay value */
static uint32_t ebuf_random_delay(Ghandles * g, uint32_t lower_bound)
Contributor

Given that the function is otherwise self-contained, I think it would be better to pass the upper bound instead of the full Ghandles.

}

maxval = g->ebuf_max_delay - lower_bound + 1;

Contributor

Nitpick: trailing whitespace (for some reason GitHub's web UI doesn't show it, but it's in commit 541c10d?)

perror("Could not allocate ebuf_entry:");
exit(1);
}
if (current_time > 0 && random_delay > (LONG_MAX - current_time)) {
Contributor

Suggested change
-if (current_time > 0 && random_delay > (LONG_MAX - current_time)) {
+if (current_time > 0 && random_delay > (INT64_MAX - current_time)) {
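
The guard in this suggestion can be read as a pre-addition overflow check. A minimal sketch, assuming the timestamp is an int64_t in milliseconds and the delay is a uint32_t (the helper name ebuf_release_time is hypothetical, and these types are an assumption not confirmed by the snippet): comparing against INT64_MAX matches the width of the value being added to, whereas LONG_MAX is narrower on platforms where long is 32 bits.

```c
#include <stdbool.h>
#include <stdint.h>

/* Compute current_time + random_delay without signed overflow.
 * Returns false (leaving *out untouched) if the sum would exceed INT64_MAX. */
bool ebuf_release_time(int64_t current_time, uint32_t random_delay,
                       int64_t *out)
{
    /* For current_time <= 0 the sum cannot overflow, because the delay
     * is at most UINT32_MAX. */
    if (current_time > 0 && (int64_t)random_delay > INT64_MAX - current_time)
        return false;
    *out = current_time + (int64_t)random_delay;
    return true;
}
```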


if (lower_bound >= g->ebuf_max_delay) {
fprintf(stderr,
"Bug detected - lower_bound >= g->ebuf_max_delay, events may get briefly stuck");
Contributor

This is reachable if two events arrive within the same millisecond and ebuf_random_delay happens to return ebuf_max_delay for the first event.

uint32_t lower_bound;

current_time = ebuf_current_time_ms();
lower_bound = min(max(g->ebuf_prev_release_time - current_time, 0), g->ebuf_max_delay);
Contributor

This deserves some explanation. Why do you make the random delay dependent on the previous random delay value?
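
One plausible reading of the quoted expression, stated here as an assumption rather than as the author's answer, is that the lower bound is the time still remaining until the previous event is released, so that a new event is never scheduled ahead of the one before it. The sketch below restates the min/max clamp with explicit types; the function name ebuf_lower_bound and the parameter types are hypothetical.

```c
#include <stdint.h>

/* Time remaining until the previous event is released, clamped to
 * the range [0, max_delay_ms]. Any delay drawn from
 * [lower_bound, max_delay_ms] then gives a release time that is not
 * earlier than prev_release_ms, provided the remaining time was not
 * clamped down to max_delay_ms. */
uint32_t ebuf_lower_bound(int64_t prev_release_ms, int64_t now_ms,
                          uint32_t max_delay_ms)
{
    int64_t remaining = prev_release_ms - now_ms;

    if (remaining < 0)
        remaining = 0;
    if (remaining > (int64_t)max_delay_ms)
        remaining = (int64_t)max_delay_ms;
    return (uint32_t)remaining;
}
```

For example, if the previous event is due at 1100 ms, the current time is 1000 ms, and the maximum delay is 150 ms, the bound is 100 ms, so the new event is released between 1100 ms and 1150 ms. If the previous event received the full maximum delay in the same millisecond, the bound equals the maximum delay.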

randsize = getrandom(ebuf_rand_data.raw, sizeof(uint32_t), 0);
if (randsize != sizeof(uint32_t))
continue;
} while (ebuf_rand_data.val >= UINT32_MAX - (UINT32_MAX % maxval));
Contributor

I think this check is correct. But it unnecessarily throws away values if 2**32 is a multiple of maxval. That was a bit confusing during review.
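
For comparison, a common way to write unbiased rejection sampling that rejects nothing when 2**32 is a multiple of maxval is to discard only the lowest 2**32 mod maxval raw values. This is a sketch of that general technique, not the code in this PR; the function names ebuf_reject_below and ebuf_uniform are hypothetical.

```c
#include <errno.h>
#include <stdint.h>
#include <sys/random.h>
#include <sys/types.h>

/* Number of low raw values to reject: 2^32 mod maxval, computed in
 * 32-bit arithmetic as (2^32 - maxval) mod maxval. maxval must be > 0. */
uint32_t ebuf_reject_below(uint32_t maxval)
{
    return (UINT32_MAX - maxval + 1u) % maxval;
}

/* Store a uniformly distributed value in [0, maxval) in *out.
 * Returns 0 on success, -1 if maxval is 0 or getrandom() fails. */
int ebuf_uniform(uint32_t maxval, uint32_t *out)
{
    uint32_t threshold;
    uint32_t val;

    if (maxval == 0)
        return -1;
    threshold = ebuf_reject_below(maxval);

    for (;;) {
        ssize_t n = getrandom(&val, sizeof(val), 0);

        if (n < 0) {
            if (errno == EINTR)
                continue;           /* interrupted, try again */
            return -1;
        }
        if (n != (ssize_t)sizeof(val))
            continue;               /* short read, try again */
        /* The accepted range [threshold, 2^32) contains a multiple of
         * maxval values, so val % maxval is unbiased. */
        if (val >= threshold) {
            *out = val % maxval;
            return 0;
        }
    }
}
```

When maxval is a power of two the threshold is 0 and every raw value is accepted.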

Development

Successfully merging this pull request may close these issues.

enable qvm-service gui-agent-virtual-input-device for Whonix-Workstation App Qubes by default
5 participants