
soundness improvements around hypervisor-shared memory #451

Open: wants to merge 9 commits into main from more-zerocopy
Conversation

@Freax13 (Contributor) commented Aug 30, 2024

This PR improves the soundness of code around hypervisor-shared memory.

The first patch, 174274d, is blocked on google/zerocopy#1601. Let me know if you want me to drop that patch if we don't want to wait on a new zerocopy release. I used the following patch to override zerocopy for testing:

```diff
diff --git a/Cargo.toml b/Cargo.toml
index b7bdb46..d87444d 100644
--- a/Cargo.toml
+++ b/Cargo.toml
@@ -49,6 +49,9 @@ zerocopy = { version = "0.7.32", features = ["alloc", "derive"] }
 # other repos
 packit = { git = "https://github.com/coconut-svsm/packit", version = "0.1.1" }
 
+[patch.crates-io]
+zerocopy = { git = "https://github.com/Freax13/zerocopy.git", rev = "68e1cc8" }
+
 [workspace.lints.rust]
 future_incompatible = { level = "deny", priority = 127 }
 nonstandard_style = { level = "deny", priority = 126 }
```
@Freax13 force-pushed the more-zerocopy branch 3 times, most recently from bfe8671 to a084a9a on Aug 30, 2024 at 07:24
@joergroedel (Member) commented:

> This PR improves the soundness of code around hypervisor-shared memory.
>
> The first patch, 174274d, is blocked on google/zerocopy#1601. Let me know if you want me to drop that patch if we don't want to wait on a new zerocopy release.

Please move that patch to a separate draft PR, which you can then "undraft" once all blockers are solved.

In general I like these changes, especially the SharedBox implementation. That will simplify a lot of things.

Once updated, this needs testing by @msft-jlange and possibly also a review by @cclaudio.

@Freax13 (Contributor Author) commented Sep 12, 2024

> Please move that patch to a separate draft PR, which you can then "undraft" once all blockers are solved.

Done.

@joergroedel added the wait-for-review (PR needs approval by reviewers) label Sep 16, 2024
This makes it possible to implement get_aad_slice without any unsafe
code.

Signed-off-by: Tom Dohrmann <[email protected]>
Given that the hypervisor has write access to that memory, we need to
treat the memory as having interior mutability.

Signed-off-by: Tom Dohrmann <[email protected]>
@Freax13 (Contributor Author) commented Sep 18, 2024
Just rebased onto main. I resolved the TODOs by switching to the functions in crate::cpu::mem.

```rust
this_cpu()
    .shutdown()
    .expect("Failed to shut down percpu data (including GHCB)");
```

```rust
unsafe fn shutdown_percpu() {
```
Contributor commented:

Why is this function unsafe? At a minimum, there should be a Safety comment here. But there is also nothing about the function declaration that suggests that it can only be called from unsafe code. My understanding of the convention we have been using is that a function should only be declared unsafe if it is not possible to call it from safe code due to its parameters or return value (this was the position advocated by @00xc as I always understood it).

Contributor Author commented:

> Why is this function unsafe? At a minimum, there should be a Safety comment here.

I added a safety section.

> But there is also nothing about the function declaration that suggests that it can only be called from unsafe code. My understanding of the convention we have been using is that a function should only be declared unsafe if it is not possible to call it from safe code due to its parameters or return value (this was the position advocated by @00xc as I always understood it).

No, that's not how `unsafe` is used in Rust: a function must be marked as `unsafe` if (and usually only if) safe code using the function can cause undefined behavior (see the Rustonomicon). Whether or not a function must be marked as `unsafe` has very little to do with its signature and much more to do with what it actually does (you could make any unsafe function with any signature safe by replacing its body with `loop {}`). In this case, the caller must ensure that the `PerCpu` instance isn't used after it has been dropped, because doing so is UB. This isn't something the compiler can enforce, so we have to mark the function as `unsafe` and shift the responsibility of upholding this invariant to the caller. This isn't a convention specified by us in this project; those are the rules for writing unsafe code in Rust.
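To illustrate the point, here is a hypothetical, simplified sketch (all names are invented; this is not SVSM code): a function with an ordinary-looking signature that must still be `unsafe` because of a temporal contract the compiler cannot check.

```rust
use std::cell::Cell;

struct Handle {
    alive: Cell<bool>,
}

impl Handle {
    fn new() -> Self {
        Handle { alive: Cell::new(true) }
    }

    /// # Safety
    /// The caller must not use `self` after this call; in real code the
    /// backing storage would be released, so any later access would be UB.
    /// Nothing in the signature reveals this contract, which is why the
    /// `unsafe` marker is required.
    unsafe fn shutdown(&self) {
        self.alive.set(false);
    }
}

fn main() {
    let h = Handle::new();
    // SAFETY: `h` is not used again after this call (the assert below only
    // reads the flag to make the sketch testable).
    unsafe { h.shutdown() };
    assert!(!h.alive.get());
}
```

The signature alone looks perfectly callable from safe code; the obligation ("never touch this again") lives entirely in the documentation, which is exactly the situation described above.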

Contributor commented:

I agree with your perspective on the use of unsafe. However, I once submitted a PR to this project that declared a function unsafe for similar reasons, and received feedback from @00xc that the function should not be declared this way, for the reasons I included in my earlier comment. I'm happy with your approach now that you've added a safety comment, and will leave it to @00xc to comment on whether this use of unsafe is appropriate.

```rust
/// This function has all the safety requirements of `core::ptr::read` except
/// that data races are explicitly permitted.
#[inline(always)]
pub unsafe fn is_clear(src: usize, size: usize) -> bool {
```
Contributor commented:

This appears to be used only by test code. Should it therefore be within the test module so it is not accidentally referenced by non-test code?

Contributor Author commented:

> This appears to be used only by test code.

This function is also referenced by `SharedBox::is_clear`, which itself is used by `send_extended_guest_request`.

Contributor commented:

If this function is required for production code, then why is it written in assembly? In the interest of minimizing the use of unsafe code, it seems like we don't want to write anything in assembly unless it has to be done that way, and at a glance it appears that this function could be implemented in Rust instead of asm. If you have a strong justification for the use of assembly, then can you include that justification in the safety comments?

If performance is your primary concern, then there are many ways this loop could be improved (switching to SCASQ at a minimum, and better yet would be to use XMM). But if performance is not a primary concern, then avoiding assembly altogether would be best.

Contributor Author commented:

Performance is not my primary concern.

The problem is that Rust doesn't have atomic volatile read/write ops. I tried to hint at this in the safety section by mentioning that synchronization is not required. Using inline assembly was one of the work-arounds suggested in rust-lang/unsafe-code-guidelines#321.

Contributor commented:

Since you're going with unsafe anyway, why not take a raw pointer as the source, which you can then coerce into a pointer to AtomicU64 (or some other size based on the alignment of size). Then you have an atomic volatile read op in the form of AtomicU64::load(Ordering::Relaxed). I find this much better than using assembly, especially given that we want to port this to non-x86 platforms in the future (like Arm).
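A minimal sketch of this suggestion (the signature and names here are illustrative, not the actual SVSM code): treat the shared buffer as a sequence of `AtomicU64` words and load each with `Ordering::Relaxed`, so concurrent host writes yield some valid value rather than UB.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

/// Returns true if all bytes in the region are zero.
///
/// # Safety
/// `src` must be non-null, 8-byte aligned, and valid for reads of `size`
/// bytes; `size` must be a multiple of 8. Concurrent writes by another
/// agent (e.g. the host) are explicitly permitted.
unsafe fn is_clear(src: *const u64, size: usize) -> bool {
    (0..size / 8).all(|i| {
        // Coerce each word into an `AtomicU64` (same size and layout as
        // `u64`) and perform a relaxed atomic load.
        let word = &*src.add(i).cast::<AtomicU64>();
        word.load(Ordering::Relaxed) == 0
    })
}

fn main() {
    let zeros = [0u64; 4];
    assert!(unsafe { is_clear(zeros.as_ptr(), 32) });

    let dirty = [0u64, 3, 0, 0];
    assert!(!unsafe { is_clear(dirty.as_ptr(), 32) });
}
```

One design note: this keeps the access portable (no x86-specific assembly), at the cost of relying on the atomics machinery rather than volatile semantics, which is the trade-off debated in the rest of this thread.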

Contributor commented:

And note also that Ordering applies only to memory fences, controlling the order in which the microarchitecture is allowed to process loads and stores. It does not apply to any assumptions the compiler can make about multi-thread consistency.

Contributor Author @Freax13 commented Sep 25, 2024:

>> IIUC the compiler is allowed to just return 1 because it knows that the load was immediately preceded by a store to the same address with the value 1. This wouldn't be allowed for a volatile access, but for non-volatile atomics, AFAICT this is legal.
>
> Not in this case. Because atomics have Sync and Send behavior, the compiler cannot assume that another thread did not modify the data in between, and thus the load must be performed in case the value changed asynchronously. The whole point of the atomic load is to capture the data as it exists now, relative to other threads, so the load can never be omitted based on data elision.

Can you point to any standard or other source that says "atomic loads must capture the data as it exists now", or in other words, "omitting loads after a store is not allowed"? The only thing I was able to find so far is an old article from 2009 written by Herb Sutter that says this kind of optimization is allowed (see the last example here), though the rules for atomics might have changed since then :-/

> And note also that Ordering applies only to memory fences, controlling the order in which the microarchitecture is allowed to process loads and stores. It does not apply to any assumptions the compiler can make about multi-thread consistency.

I don't believe that's true; I believe the compiler is allowed to perform optimizations on atomic ops, including reordering. The C++ docs for std::memory_order mention compiler reordering:

> In particular, this may occur if D is completed before C in thread 2, either due to compiler reordering or at runtime.

(this quote talks about two relaxed atomic operations on the same thread)

Contributor commented:

> Can you point to any standard or other source that says "atomic loads must capture the data as it exists now", or in other words, "omitting loads after a store is not allowed"?

This appears to be very poorly specified. The LLVM atomics guide for unordered operations (https://llvm.org/docs/Atomics.html#unordered) is vague: it prevents certain optimizations with respect to stores, and it permits certain load optimizations that do not involve dropping loads, but it doesn't address this particular case as being either allowed or disallowed. I will note that the codegen portion of the atomics guide (https://llvm.org/docs/Atomics.html#atomics-and-codegen) states that on x86, "all atomic loads generate a MOV". I interpret this to mean that all atomic loads result in a memory access, not just those that weren't optimized away, but it appears to be subject to interpretation.

>> And note also that Ordering applies only to memory fences, controlling the order in which the microarchitecture is allowed to process loads and stores. It does not apply to any assumptions the compiler can make about multi-thread consistency.
>
> I don't believe that's true; I believe the compiler is allowed to perform optimizations on atomic ops, including reordering. The C++ docs for std::memory_order mention compiler reordering:

The LLVM guide appears to agree with your point of view, in that ordering requirements are permitted to affect the order of generated code and not just the memory fences.

If you really want to err on the side of safety, and assume the most unfavorable interpretation of the codegen guide, then it would be safe to preface your loop with a memory barrier (which probably doesn't exist as a Rust primitive and probably has to be implemented in platform-specific asm code). Regardless of how the language supports this, either a function call into asm or an explicit memory barrier will put the compiler on notice that cross-thread synchronization is expected.

Contributor Author commented:

> If you really want to err on the side of safety, and assume the most unfavorable interpretation of the codegen guide, then it would be safe to preface your loop with a memory barrier (which probably doesn't exist as a Rust primitive and probably has to be implemented in platform-specific asm code). Regardless of how the language supports this, either a function call into asm or an explicit memory barrier will put the compiler on notice that cross-thread synchronization is expected.

There's `core::sync::atomic::fence` and `core::sync::atomic::compiler_fence`. In our situation, we'd want to use `compiler_fence` because we're only interested in preventing the compiler from reordering our accesses. Unfortunately, the precise semantics of `compiler_fence` are unclear.
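For reference, a sketch of the fence idea being discussed (illustrative only, not the PR's code): `compiler_fence` constrains compiler reordering without emitting a CPU barrier instruction, whereas `fence` would additionally order the hardware.

```rust
use std::sync::atomic::{compiler_fence, AtomicU64, Ordering};

/// Reads a word from memory that another agent may write concurrently.
/// The compiler fences prevent the compiler from moving surrounding
/// memory accesses across the load; they compile to no machine code.
fn read_shared(word: &AtomicU64) -> u64 {
    compiler_fence(Ordering::SeqCst);
    let v = word.load(Ordering::Relaxed);
    compiler_fence(Ordering::SeqCst);
    v
}

fn main() {
    let w = AtomicU64::new(42);
    assert_eq!(read_shared(&w), 42);
}
```

Whether this actually guarantees the load is never elided is exactly the unresolved question in this thread; the sketch only shows the mechanics of the two primitives.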

Contributor Author commented:

I removed is_clear.

@@ -160,7 +162,28 @@ impl GhcbPage {

```rust
impl Drop for GhcbPage {
    fn drop(&mut self) {
```
Contributor commented:

Is there any scenario for which dropping a GHCB is not associated with a fatal termination of the SVSM? The process of restoring a GHCB page to the private state is fragile, and I believe it would be best to avoid any attempt to do so unless we were aware of a valid use case for this. If we cannot come up with one, then it would be simplest for GhcbPage::drop() simply to panic, since we shouldn't get here anyway.

Contributor Author commented:

> Is there any scenario for which dropping a GHCB is not associated with a fatal termination of the SVSM? The process of restoring a GHCB page to the private state is fragile, and I believe it would be best to avoid any attempt to do so unless we were aware of a valid use case for this. If we cannot come up with one, then it would be simplest for GhcbPage::drop() simply to panic, since we shouldn't get here anyway.

Yes: calling the `shutdown_percpu` function will cause the `GhcbPage` to be dropped. We do this to release the GHCB used by stage2 shortly before entering the SVSM kernel. That said, this code wasn't introduced by this PR; this PR just moved it around.

If we didn't need this, I'd totally be on your side; we shouldn't write fragile code when we don't need it in practice.

Contributor commented:

I'll take another look at this flow after your PR merges to see if we can be more robust here, but I accept your point that your PR isn't really changing the logic that exists today.

```rust
pub fn clear(&self) {
    // Clear valid bitmap
    self.valid_bitmap.set([0, 0]);
    self.valid_bitmap[0].store(0, Ordering::SeqCst);
```
Contributor @msft-jlange commented Sep 19, 2024:

The GHCB page is not manipulated across processors, so there is no reason for expensive memory barriers when modifying its contents. The only race conditions we might expect are between the SVSM environment and the host running on the same processor; this possibility for interruption means that atomic operations are necessary, but Ordering::Relaxed will be sufficient for all such operations. That is true everywhere throughout this file.

Contributor Author commented:

Sure, sounds reasonable.
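A sketch of the agreed change (the surrounding type is simplified here; only the field name comes from the diff above): clearing the valid bitmap with `Ordering::Relaxed` still yields atomic stores, which guard against interruption by the host on the same CPU, while avoiding the fencing cost of `SeqCst`.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Stand-in for the GHCB's valid bitmap: two atomic 64-bit words.
struct ValidBitmap([AtomicU64; 2]);

impl ValidBitmap {
    fn clear(&self) {
        // No cross-CPU ordering is required for the GHCB, so Relaxed
        // suffices; the stores remain atomic with respect to the host.
        self.0[0].store(0, Ordering::Relaxed);
        self.0[1].store(0, Ordering::Relaxed);
    }
}

fn main() {
    let bm = ValidBitmap([AtomicU64::new(u64::MAX), AtomicU64::new(1)]);
    bm.clear();
    assert_eq!(bm.0[0].load(Ordering::Relaxed), 0);
    assert_eq!(bm.0[1].load(Ordering::Relaxed), 0);
}
```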

```rust
}

/// Allocates a new HV doorbell page and registers it on the hypervisor
/// using the given GHCB.
pub fn allocate_hv_doorbell_page(ghcb: &GHCB) -> Result<&'static HVDoorbell, SvsmError> {
```
Contributor commented:

Why does this function not use SharedBox? It seems that SharedBox is designed to do exactly the sort of assignment and visibility management expected here, and unlike the GHCB, there is no chicken-and-egg problem because SharedBox only requires the existence of a GHCB, not of a doorbell page.

Contributor Author commented:

Good idea! I changed the function to use SharedBox.

Cell doesn't allow concurrent accesses. This is a problem because we
share the memory with the host and the host could write to the memory
while we're reading it. Use atomic accesses instead. Atomic accesses
can tolerate concurrent writes.

Signed-off-by: Tom Dohrmann <[email protected]>
This resolves a TOCTOU issue. Furthermore, we don't need to check the
entire content: if the certificate data is not empty, there will be
non-zero bytes in the first 24 bytes.

Signed-off-by: Tom Dohrmann <[email protected]>
SharedBox is a safe wrapper around memory pages shared with the host.

Signed-off-by: Tom Dohrmann <[email protected]>
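A conceptual sketch of what a "safe wrapper around host-shared memory" can look like (this is NOT the actual SharedBox API; all names and methods here are invented for illustration): the wrapper never hands out plain references to the shared bytes, only atomic word-level accesses, so concurrent host writes cannot cause UB in safe code.

```rust
use std::sync::atomic::{AtomicU64, Ordering};

// Stands in for a page shared with the host.
struct SharedWords {
    words: Vec<AtomicU64>,
}

impl SharedWords {
    fn new(len: usize) -> Self {
        SharedWords {
            words: (0..len).map(|_| AtomicU64::new(0)).collect(),
        }
    }

    /// Reads a word; a concurrent host write yields some valid u64,
    /// never UB, because the access is atomic.
    fn read(&self, i: usize) -> u64 {
        self.words[i].load(Ordering::Relaxed)
    }

    fn write(&self, i: usize, v: u64) {
        self.words[i].store(v, Ordering::Relaxed);
    }
}

fn main() {
    let s = SharedWords::new(4);
    s.write(2, 7);
    assert_eq!(s.read(2), 7);
    assert_eq!(s.read(0), 0);
}
```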
HVDoorbellPage was only used in one place, and leak was immediately
called on it. Given that we never need to free a doorbell page, let's
just implement this in a single function returning a static reference.

Signed-off-by: Tom Dohrmann <[email protected]>
This is better for a couple of reasons:
1. drop_in_place destroys the object rather than mutating it to release
   resources. The downside of merely mutating without destroying is that
   the object still has to remain in a valid state, which limits the
   shutdown code (for example, it can't release the memory associated
   with a PageBox).
2. After the object has been dropped, it can't be accessed anymore.
   This means that the shutdown code doesn't have to worry about later
   accesses like the previous code had to.
3. All resources are freed, not just the GHCB.

This also fixes a soundness issue: calling shutdown twice on the same
GHCB would result in a double-pvalidate bug.

Signed-off-by: Tom Dohrmann <[email protected]>
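An illustrative sketch of the drop_in_place approach described above (the types are stand-ins, not the PerCpu/GHCB code): destroying the object in place runs its Drop impl exactly once, and a second shutdown becomes a double-drop bug that the pattern makes hard to express.

```rust
use std::mem::ManuallyDrop;
use std::ptr;
use std::sync::atomic::{AtomicBool, Ordering};

// Records that Drop ran, so the sketch is testable.
static DROPPED: AtomicBool = AtomicBool::new(false);

struct Ghcb;

impl Drop for Ghcb {
    fn drop(&mut self) {
        // In the real code this is where the page would be restored to
        // the private state and its resources released.
        DROPPED.store(true, Ordering::Relaxed);
    }
}

fn main() {
    let mut slot = ManuallyDrop::new(Ghcb);
    // SAFETY: `slot` is valid and is never used again after this call.
    // Calling drop_in_place a second time would be a double-drop, which
    // is exactly the double-pvalidate-style bug the patch rules out.
    unsafe { ptr::drop_in_place::<Ghcb>(&mut *slot) };
    assert!(DROPPED.load(Ordering::Relaxed));
}
```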
This impl is unused. It is also unsound because we can never have
unique ownership over the GHCB as long as it is shared with the host.

Signed-off-by: Tom Dohrmann <[email protected]>
Now that the shutdown code is only called from the Drop impl, we might
as well move it in there. This also makes it impossible to call
shutdown more than once (or to call shutdown and then drop the
GhcbPage).

Signed-off-by: Tom Dohrmann <[email protected]>
@joergroedel added the in-review (PR is under active review and not yet approved) label and removed the wait-for-review (PR needs approval by reviewers) label Sep 26, 2024