Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for vectorized read with libc::readv #4084

Draft
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

shamb0
Copy link
Contributor

@shamb0 shamb0 commented Dec 10, 2024

Fixes #4048

Add POSIX readv() System Call Support to MIRI

This PR implements emulation support for libc::readv(), a POSIX vectored I/O operation that allows reading data into multiple buffers with a single system call. This enhancement addresses issue #4048.

@shamb0 shamb0 force-pushed the support-libc-readv-writev branch 2 times, most recently from 195ddfe to 144a986 Compare December 10, 2024 16:45
Copy link
Member

@RalfJung RalfJung left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll assume this should not be reviewed yet since it is marked as a draft. I just noticed this one small thing. ;)

I'd recommend to make a first PR with just readv. Smaller PRs are always easier to handle. Please write @rustbot ready and remove the "draft" status once you want a review.

let [fd, iov, iovcnt] = this.check_shim(abi, ExternAbi::C { unwind: false }, link_name, args)?;
let fd = this.read_scalar(fd)?.to_i32()?;
let iovcnt = this.read_scalar(iovcnt)?.to_i32()?;
this.readv(fd, iov, iovcnt as _, None, dest)?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please don't use as casts. Why are you casting here anyway?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @RalfJung,

Thank you for pointing this out—you're absolutely right that the casting is unnecessary. I appreciate your feedback.

As you noted, the PR is still in draft status, and there are some cleanups I plan to address. I shared it at this stage primarily to seek input, particularly regarding the issue we previously discussed here: Debugging Scalar Size Mismatch Error in Rust Miri.

Thank you again for reviewing and sharing your thoughts! :)

@RalfJung
Copy link
Member

As you noted, the PR is still in draft status, and there are some cleanups I plan to address. I shared it at this stage primarily to seek input, particularly regarding the issue we previously discussed here: Debugging Scalar Size Mismatch Error in Rust Miri.

Ah, that was not clear at all. Please note things like that in the PR description.

Anyway judging from Zulip it seems you made progress there. I'm going to wait until you ask for further input on this PR. :)

@shamb0 shamb0 force-pushed the support-libc-readv-writev branch from 144a986 to 279090c Compare December 17, 2024 09:47
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 17, 2024

Hi @RalfJung,

I've completed the initial implementation of readv() support and would appreciate your review. The changes focus on three main areas:

  1. Core Implementation

    • Added readv() emulation in src/shims/unix/fd.rs
    • Designed to comply with POSIX vectored I/O specifications
  2. Safety Enhancement

    • Introduced new deref_pointer_and_offset_vectored() function in src/helpers.rs
    • Implements specialized array bounds checking for vectored I/O operations
  3. Comprehensive Testing

    • Created test suite in tests/pass-dep/libc/libc-fs.rs covering:
      • Large buffer operations with pattern verification (64KB total)
      • Partial reads and EOF handling
      • Error condition validation
    • All tests pass with MIRIFLAGS=-Zmiri-disable-isolation

Looking forward to your feedback, particularly on the new array bounds checking approach and test coverage strategy. :)

@shamb0 shamb0 marked this pull request as ready for review December 17, 2024 10:03
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 17, 2024

@rustbot ready

@rustbot rustbot added the S-waiting-on-review Status: Waiting for a review to complete label Dec 17, 2024
@shamb0 shamb0 changed the title Add support for vectorized read/write with libc::readv and libc::writev Add support for vectorized read/ with libc::readv Dec 17, 2024
@shamb0 shamb0 changed the title Add support for vectorized read/ with libc::readv Add support for vectorized read with libc::readv Dec 17, 2024
src/helpers.rs Outdated
/// where we need to access multiple independent memory regions, such as when processing an array
/// of iovec structures. Unlike simple pointer arithmetic bounds checking, this implementation
/// understands and validates array-based access patterns.
fn deref_pointer_and_offset_vectored(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a very specific operation for a particular Linux/POSIX API. I don't think we need a global helper for this, so please move it into the only file where it is actually used.

It's also completely unclear to me why you need a complicated new check here. readv just iterates the array and does a read call for each element, right?

Copy link
Contributor Author

@shamb0 shamb0 Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • Thank you for your review comment regarding the placement and necessity of the deref_pointer_and_offset_vectored function. I'd like to explain the reasoning behind this implementation and agree with your suggestion about its placement.

  • The implementation of readv() requires handling a base pointer to an array of iovec structures (iov_ptr: &OpTy<'tcx>) along with a count (iovcnt: i32). When reconstructing array elements from this base pointer, I encountered limitations with the existing deref_pointer_and_offset() API, specifically triggering assertions that weren't suitable for our array-based access pattern.

  • You can find the complete MIRI backtrace demonstrating these limitations here: https://gist.github.com/shamb0/01491a8367c78404769b3e210ec860e0

image

image

  • While the boundary checking logic may appear complex, it's necessary for safely handling array reconstruction in this context. However, I agree that this functionality is specific to the readv implementation and doesn't need to be a global helper. I've moved the function to src/shims/unix/fd.rs where it's being used, making its purpose and context clearer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would suggest you just call deref_pointer_as on the iov_ptr. The key is to construct the right pointee type: make it an array of libc::iovec pointee type, with length iovcnt. You do this once before the loop.

Then inside the loop, you can use project_index to access the n-th element of the array.

fd_num: i32,
iov_ptr: &OpTy<'tcx>,
iovcnt: i32,
offset: Option<i128>,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems to be always None, so please remove the argument.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I’ve removed the offset parameter from the readv function signature as requested.

  • Initially, I had included it to align with the parameter structure of the existing read and write functions, given their similarity to file operations. Let me know if there’s anything else you’d like adjusted!"

image

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

offset will be required in the future for preadv, but let's go one step at a time :)

Comment on lines 270 to 304
// Early returns for empty or invalid cases
if iovcnt == 0 {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this justified by the docs? Also, what does the "or invalid" mean here?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  • I've updated the documentation to clarify the behavior of readv when iovcnt is 0. The updated documentation now explicitly covers:

    1. The function's return value semantics based on the POSIX specification
    2. The meaning of various error conditions and their corresponding return codes
    3. Special cases like empty buffer lists (iovcnt = 0)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't asking about your documentation, I was asking about the official documentation for readv. Often in cases like this we quite a sentence from the docs, and link to the docs, to explain why particular corner cases are handled the way they are.

trace!("readv: FD not found");
return this.set_last_error_and_return(LibcError("EBADF"), dest);
};
trace!("readv: FD mapped to {fd:?}");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please remove the debug tracing from the PR for review.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done

// We need temporary storage for each individual read operation's result
// Using an intermediate buffer helps handle error conditions cleanly
// We use i128 to safely handle both success (positive) and error (-1) cases
let read_dest = this.allocate(this.machine.layouts.i128, MiriMemoryKind::Machine.into())?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why you are allocating new storage here. What I would expect is that you call read once for each element of the vector, and directly put the results into the user-provided buffer. Yes that means we are using more than one syscall, but we don't really care about such details here -- the only thing that matters is the end-to-end behavior.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah it seems the docs say that the entire read must be atomic. Please fix the comment to explain why you are allocating a buffer here; currently you say something about error conditions but it's not clear what this means and it misses the main point.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've cleaned up the code and removed the problematic logic that attempted to handle reads individually.

The new implementation will focus on ensuring atomicity by performing a single read operation into a contiguous buffer, as required by POSIX readv() semantics.

let mut current_offset = offset;

// Process each iovec structure
for i in 0..iovcnt {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can't use a loop of reads, the read must be atomic. I thought that's why you are allocating a new buffer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've cleaned up the code and removed the problematic logic that attempted to handle reads individually.

The new implementation will focus on ensuring atomicity by performing a single read operation into a contiguous buffer, as required by POSIX readv() semantics.

this.read_scalar(&read_dest)?.to_i128()?
} else {
// Handle regular read case
fd.read(&fd, this.machine.communicate(), iov_base_ptr, iov_len, &read_dest, this)?;
Copy link
Member

@RalfJung RalfJung Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This unfortunately does not work if read blocks. The code after read will run too early, before the read actually completed.

Implementing this properly for blocking read is very complicated and will require some non-trivial refactoring of the file description trait.

Copy link
Contributor Author

@shamb0 shamb0 Dec 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for highlighting the critical issue regarding blocking reads and concurrent access. This was an oversight in my initial implementation that could have led to race conditions.

I've refined the implementation, with a focus on atomic operations. The key improvement is the introduction of a new read_buffer interface in the FileDescription trait:

impl FileDescription for FileHandle {
    fn read_buffer<'tcx>(
        &self,
        self_ref: &FileDescriptionRef,
        communicate_allowed: bool,
        buf: &mut [u8],
        dest: &MPlaceTy<'tcx>,
        ecx: &mut MiriInterpCx<'tcx>,
    ) -> InterpResult<'tcx> {
        // Implementation details
    }
}

The revised strategy for readv() now follows these steps:

  1. Allocates an intermediate byte buffer for the entire read operation
  2. Performs a single atomic read operation into this buffer
  3. Distributes the data from the intermediate buffer to the user's scattered iovec buffers

This approach ensures thread safety while maintaining POSIX compliance for atomic vectored I/O operations. I would appreciate your thoughts on this implementation and welcome any suggestions for further improvements.

Copy link
Member

@RalfJung RalfJung Dec 21, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is in fact pretty much exactly how FileDescription::read used to look, but we changed it to support blocking reads. Adding it back will require all FDs to implement read twice, which we should avoid.

The problem with read is that when it returns, the read may not actually have completed yet. The fix for that is known as "continuation passing style": we need to equip read with a callback that it will invoke when the read was actually completed. That way, all the logic for taking the data from the big buffer and distributing it over the vectored buffers can be done at the right time, and we don't have to duplicate so much of the logic.

For this we should do a bit of preparation, in a separate PR: we currently have the UnblockCallback type for callbacks to be invoked on unblocking; we will have to generalize this to a general MachineCallback so that we can also use it in read. UnblockCallback has two methods, unblock and timeout, but in general we want only one method. The MachineCallback type should be generic over the argument type of that method. Thread unblocking can then use an argument type that indicates whether this is a successful unblock or a timeout.

So I would suggest a series of 3 PRs:

  1. Generalize UnblockCallback to MachineCallback (will require fixes everywhere UnblockCallback is currently used, and adjustments to the callback! macro)
  2. Equip FileDescription::read with a callback that is invoked after completion of the read.
  3. Add readv, using that callback.

@RalfJung
Copy link
Member

You have CI failures. Please make sure CI is green before asking for review; that avoids reviewers having to spend time on things that CI can check automatically. If you don't understand why CI fails, ask for help.

@RalfJung RalfJung added S-waiting-on-author Status: Waiting for the PR author to address review comments and removed S-waiting-on-review Status: Waiting for a review to complete labels Dec 18, 2024
@shamb0 shamb0 force-pushed the support-libc-readv-writev branch 3 times, most recently from b99627c to d1e7023 Compare December 20, 2024 07:19
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 20, 2024

You have CI failures. Please make sure CI is green before asking for review; that avoids reviewers having to spend time on things that CI can check automatically. If you don't understand why CI fails, ask for help.

You're absolutely correct about the importance of ensuring all CI checks pass before requesting a review. Moving forward, I will:

  • Wait for CI builds to successfully complete across all supported HOST_TARGET configurations.
  • Use rustbot ready only after confirming a green CI status.

@shamb0 shamb0 force-pushed the support-libc-readv-writev branch 2 times, most recently from 102f366 to 222bac2 Compare December 21, 2024 06:21
@RalfJung
Copy link
Member

RalfJung commented Dec 21, 2024

That CI failure indicates that you need to rebase your branch over the latest Miri master branch, and then run ./miri toolchain. After that, a ./miri check should reproduce the issue on your machine so that you can fix it.

@shamb0 shamb0 force-pushed the support-libc-readv-writev branch 2 times, most recently from 4fe0a6d to 2033a3e Compare December 21, 2024 09:47
@shamb0 shamb0 force-pushed the support-libc-readv-writev branch 3 times, most recently from 0b30437 to 4f66570 Compare December 21, 2024 13:50
- This enables vectorized reads.

Signed-off-by: shamb0 <[email protected]>
@shamb0 shamb0 force-pushed the support-libc-readv-writev branch from 4f66570 to 3e6554f Compare December 21, 2024 14:26
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 22, 2024

Hi @RalfJung,

Thank you for the detailed design proposal outlining the three-phase implementation approach. Converting UnblockCallback to a more generic MachineCallback and improving read operation handling through continuation passing style is a robust solution.

While the current PR has addressed previous review comments with passing CI builds, your new design direction offers a better architectural path. I'd appreciate your guidance on whether to:

  1. Start fresh with a new PR series implementing the three-phase approach, or
  2. Evolve this PR to incorporate the suggested architectural changes

Looking forward to your direction on the preferred approach.

@shamb0
Copy link
Contributor Author

shamb0 commented Dec 22, 2024

@rustbot ready

@rustbot rustbot added S-waiting-on-review Status: Waiting for a review to complete and removed S-waiting-on-author Status: Waiting for the PR author to address review comments labels Dec 22, 2024
@RalfJung
Copy link
Member

I would suggest you convert this PR to a draft, and start working on the first and second PR outlined in my plan. Once they both landed, this PR can serve as the basis for step 3.

@RalfJung RalfJung added S-blocked Status: blocked on something happening somewhere else and removed S-waiting-on-review Status: Waiting for a review to complete labels Dec 22, 2024
@shamb0
Copy link
Contributor Author

shamb0 commented Dec 22, 2024

I would suggest you convert this PR to a draft, and start working on the first and second PR outlined in my plan. Once they both landed, this PR can serve as the basis for step 3.

Sounds Good, Thankyou :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
S-blocked Status: blocked on something happening somewhere else
Projects
None yet
Development

Successfully merging this pull request may close these issues.

unix: add support for libc::readv and libc::writev
3 participants