Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

rust: hrtimer: introduce hrtimer support #1061

Draft
wants to merge 2 commits into
base: staging/rust-pci
Choose a base branch
from
Draft
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
283 changes: 283 additions & 0 deletions rust/kernel/hrtimer.rs
Original file line number Diff line number Diff line change
@@ -0,0 +1,283 @@
// SPDX-License-Identifier: GPL-2.0

//! Intrusive high resolution timers.
//!
//! Allows scheduling timer callbacks without doing allocations at the time of
//! scheduling. For now, only one timer per type is allowed.
//!
//! # Example
//!
//! ```rust
//! use kernel::{
//! sync::Arc, hrtimer::{RawTimer, Timer, TimerCallback},
//! impl_has_timer, prelude::*, stack_pin_init
//! };
//! use core::sync::atomic::AtomicBool;
//! use core::sync::atomic::Ordering;
//!
//! #[pin_data]
//! struct IntrusiveTimer {
//! #[pin]
//! timer: Timer<Self>,
//! flag: AtomicBool,
//! }
//!
//! impl IntrusiveTimer {
//! fn new() -> impl PinInit<Self> {
//! pin_init!(Self {
//! timer <- Timer::new(),
//! flag: AtomicBool::new(false),
//! })
//! }
//! }
//!
//! impl TimerCallback for IntrusiveTimer {
//! type Receiver = Arc<IntrusiveTimer>;
//!
//! fn run(this: Self::Receiver) {
//! pr_info!("Timer called\n");
//! this.flag.store(true, Ordering::Relaxed);
//! }
//! }
//!
//! impl_has_timer! {
//! impl HasTimer<Self> for IntrusiveTimer { self.timer }
//! }
//!
//! let has_timer = Arc::pin_init(IntrusiveTimer::new())?;
//! has_timer.clone().schedule(200_000_000);
//! while !has_timer.flag.load(Ordering::Relaxed) { core::hint::spin_loop() }
//!
//! pr_info!("Flag raised\n");
//!
//! # Ok::<(), kernel::error::Error>(())
//! ```
//!
//! C header: [`include/linux/hrtimer.h`](srctree/include/linux/hrtimer.h)

use core::{marker::PhantomData, pin::Pin};

use crate::{init::PinInit, prelude::*, sync::Arc, types::Opaque};

/// A timer backed by a C `struct hrtimer`
///
/// # Invariants
///
/// * `self.timer` is initialized by `bindings::hrtimer_init`.
#[repr(transparent)]
#[pin_data(PinnedDrop)]
pub struct Timer<T> {
#[pin]
timer: Opaque<bindings::hrtimer>,
_t: PhantomData<T>,
}

// SAFETY: A `Timer` can be moved to other threads and used from there.
unsafe impl<T> Send for Timer<T> {}

// SAFETY: Timer operations are locked on C side, so it is safe to operate on a
// timer from multiple threads
unsafe impl<T> Sync for Timer<T> {}

impl<T: TimerCallback> Timer<T> {
/// Return an initializer for a new timer instance.
pub fn new() -> impl PinInit<Self> {
crate::pin_init!( Self {
timer <- Opaque::ffi_init(move |place: *mut bindings::hrtimer| {
// SAFETY: By design of `pin_init!`, `place` is a pointer live
// allocation. hrtimer_init will initialize `place` and does not
// require `place` to be initialized prior to the call.
unsafe {
bindings::hrtimer_init(
place,
bindings::CLOCK_MONOTONIC as i32,
bindings::hrtimer_mode_HRTIMER_MODE_REL,
);
}

// SAFETY: `place` is pointing to a live allocation, so the deref
// is safe. The `function` field might not be initialized, but
// `addr_of_mut` does not create a reference to the field.
let function: *mut Option<_> = unsafe { core::ptr::addr_of_mut!((*place).function) };

// SAFETY: `function` points to a valid allocation.
unsafe { core::ptr::write(function, Some(T::Receiver::run)) };
metaspace marked this conversation as resolved.
Show resolved Hide resolved
}),
_t: PhantomData,
})
}
}

#[pinned_drop]
impl<T> PinnedDrop for Timer<T> {
fn drop(self: Pin<&mut Self>) {
// SAFETY: By struct invariant `self.timer` was initialized by
// `hrtimer_init` so by C API contract it is safe to call
// `hrtimer_cancel`.
unsafe {
bindings::hrtimer_cancel(self.timer.get());
}
}
}
Comment on lines +111 to +121
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In the workqueue, ownership makes it impossible to run the destructor while the item is registered with a workqueue. Is that not the same here?

I'm concerned about this because if this is not the first field (in declaration order! not memory layout order), then other fields may already have been dropped, so by the time we call this method, triggering the timer is already a UAF.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you are right. The timer handler will hold a reference to the struct that is embedding the Timer, and if the Timer is not the first field, some of the fields could be dropped while the handler is holding this reference.

I'll try to think of a solution, thanks.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If being the first field is a solution, impl_has_timer could become an attribute macro that also adds this field. That's not foolproof and adds proc macro complexity.

Maybe impl_has_timer could add a Drop impl for whatever type it wraps?

This kind of comes back to the container_of problem.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reason for using Pin<&T> to point to the target is that I have the
following arrangement:

In blk-mq, requests are handed off to the driver as a pointer to a struct request. The first part of the pointee is the struct request and immediately
after that is a driver private area that the driver can use to store things:

 +-----------------------+
 |                       |
 |                       |
 |     Request           |
 |                       |
 |                       |
 +-----------------------+
 |                       |
 |    Private            |
 |                       |
 |                       |
 +-----------------------+

The memory for these is pre allocated and initialized before the driver starts
accepting requests. The null block driver keeps the following in the private
area:

#[pin_data]
struct Pdu {
    #[pin]
    timer: kernel::hrtimer::Timer<Self>,
}

When the driver is handling a request, it will obtain a reference to this place:
Pin<&Pdu> that is guaranteed to be valid until the request is completed.

There are three options I think:

  1. Use an owned pointer such as Box, Arc, Aref.
  2. Somehow implement drop on the container struct, for instance with a proc macro.
  3. Use a custom pointer type for this special use case - not sure how yet.

I would rather not go with 1 for my use case, because that would mean an extra
layer of indirection. I would have to store an Arc in my private area of the
request structure. For other uses it might be fine and the hrtimer
abstractions would work as they are. Just don't implement the pointer traits for
references.

For 2, we would loose the ability to eventually embed more than one timer in a
structure. I would like to keep the option to extend this patch like workqueue
to use an ID to distinguish between fields. Also, it might interfere with user
drop impls?

3 would mean unsafe code in the driver, and I am not sure how to exactly do it
yet. But maybe it could be the way.

Any input greatly appreciated :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If being the first field is a solution, impl_has_timer could become an attribute macro that also adds this field. That's not foolproof and adds proc macro complexity.

Maybe impl_has_timer could add a Drop impl for whatever type it wraps?

This kind of comes back to the container_of problem.

I think that it is not sufficient, if we want to support custom drop implementations, since one could have an invariant on the struct that is violated after drop is called, then when dropping even the first field, the invariant is violated.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any input greatly appreciated :)

I think I have an idea, but it is rather complex and needs a lot more thought:

  • have a Deconstruct trait with fn deconstruct(&self) that is called at most once (done by making it unsafe) and guaranteed to be called before any drop functions from objects higher in the "contains" hierarchy are called
  • call bindings::hrtimer_cancel inside of Timer::deconstruct
  • somehow only allow constructions of Timer where the above invariant can be upheld

we might need a new smart pointer for this.

This only works if the request struct that you mentioned is declared on the rust side though.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This only works if the request struct that you mentioned is declared on the rust side though.

It's a C side structure that is initialized and destroyed from C. In this case, the private part is initialized and destroyed by calling into a vtable where Rust pointers are installed.

The following is not directly related to this PR, but it clarifies my rationale for wanting to implement the timer trait on references. I think this whole situation actually comes from me trying to achieve something that is not possible. struct request is actually reference counted. But when the requests are issued to the drivers, the block layer holds a reference to the request, and thus they are guaranteed to not disappear until they are completed. So for performance reasons C drivers do not take out a ref count on the structure. They just promise to only access requests that are in flight (not completed). To do this they must trust drivers to never deliver invalid completions, because at some point in the process a driver will have to turn a completion id into a request reference. If the hardware delivers an invalid completion id, you would get a reference to the wrong request and possibly have a data race.

In Rust this is a problem because we have to provide a function that turns an integer into a request reference (for getting to the request in interrupt context when hardware reports the operation done). If you provide an invalid integer (tag) here, you would get a reference that violate invariants for shared references. In Rust it is not enough to say "we trust the driver to deliver correct completion ids", because it is still possible to write safe code that has UB by simply asking for a reference for a tag that you should not be able to get.

So, I either have to make that particular function (turning an id into a request reference) unsafe, and each driver must suffer a line of unsafe code to call this, or I would have to take a reference on the request to make sure it cannot go away. The latter has the side effect of making this timer issue disappear as well, but it might hurt performance.

In the interest of making this patch less convoluted, I think I will benchmark the cost of taking out a ref count on the request and see if that works out. Then we can skip this dance and implement the timer pointer trait for a reference counted smart pointer, probably ARef.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I solved this by using an owned reference for my use case.


/// Implemented by pointer types to structs that embed a [`Timer`]. This trait
/// facilitates queueing the timer through the pointer that implements the
/// trait.
///
/// Typical implementers would be [`Box<T>`], [`Arc<T>`], [`ARef<T>`] where `T`
/// has a field of type `Timer`.
///
/// Target must be [`Sync`] because timer callbacks happen in another thread of
/// execution.
///
/// [`Box<T>`]: Box
/// [`Arc<T>`]: Arc
/// [`ARef<T>`]: crate::types::ARef
pub trait RawTimer: Sync {
/// Schedule the timer after `expires` time units
fn schedule(self, expires: u64);
}

/// Implemented by structs that contain timer nodes.
///
/// Clients of the timer API would usually safely implement this trait by using
/// the [`impl_has_timer`] macro.
///
/// # Safety
///
/// Implementers of this trait must ensure that the implementer has a [`Timer`]
/// field at the offset specified by `OFFSET` and that all trait methods are
/// implemented according to their documentation.
///
/// [`impl_has_timer`]: crate::impl_has_timer
pub unsafe trait HasTimer<T> {
/// Offset of the [`Timer`] field within `Self`
const OFFSET: usize;

/// Return a pointer to the [`Timer`] within `Self`.
///
/// # Safety
///
/// `ptr` must point to a valid struct of type `Self`.
unsafe fn raw_get_timer(ptr: *const Self) -> *const Timer<T> {
// SAFETY: By the safety requirement of this trait, the trait
// implementor will have a `Timer` field at the specified offset.
unsafe { ptr.cast::<u8>().add(Self::OFFSET).cast::<Timer<T>>() }
}

/// Return a pointer to the struct that is embedding the [`Timer`] pointed
/// to by `ptr`.
///
/// # Safety
///
/// `ptr` must point to a [`Timer<T>`] field in a struct of type `Self`.
unsafe fn timer_container_of(ptr: *mut Timer<T>) -> *mut Self
where
Self: Sized,
{
// SAFETY: By the safety requirement of this trait, the trait
// implementor will have a `Timer` field at the specified offset.
unsafe { ptr.cast::<u8>().sub(Self::OFFSET).cast::<Self>() }
}
}

/// Implemented by pointer types that can be the target of a C timer callback.
pub trait RawTimerCallback: RawTimer {
/// Callback to be called from C.
///
/// # Safety
///
/// Only to be called by C code in `hrtimer`subsystem.
unsafe extern "C" fn run(ptr: *mut bindings::hrtimer) -> bindings::hrtimer_restart;
}

/// Implemented by pointers to structs that can the target of a timer callback
pub trait TimerCallback {
/// Type of `this` argument for `run()`.
type Receiver: RawTimerCallback;

/// Called by the timer logic when the timer fires
fn run(this: Self::Receiver);
}

impl<T> RawTimer for Arc<T>
where
T: Send + Sync,
T: HasTimer<T>,
{
fn schedule(self, expires: u64) {
let self_ptr = Arc::into_raw(self);

// SAFETY: `self_ptr` is a valid pointer to a `T`
let timer_ptr = unsafe { T::raw_get_timer(self_ptr) };

// `Timer` is `repr(transparent)`
let c_timer_ptr = timer_ptr.cast::<bindings::hrtimer>();

// Schedule the timer - if it is already scheduled it is removed and
// inserted

// SAFETY: c_timer_ptr points to a valid hrtimer instance that was
// initialized by `hrtimer_init`
unsafe {
bindings::hrtimer_start_range_ns(
c_timer_ptr.cast_mut(),
expires as i64,
0,
bindings::hrtimer_mode_HRTIMER_MODE_REL,
);
}
}
}

impl<T> kernel::hrtimer::RawTimerCallback for Arc<T>
where
T: Send + Sync,
T: HasTimer<T>,
T: TimerCallback<Receiver = Self>,
{
unsafe extern "C" fn run(ptr: *mut bindings::hrtimer) -> bindings::hrtimer_restart {
// `Timer` is `repr(transparent)`
let timer_ptr = ptr.cast::<kernel::hrtimer::Timer<T>>();

// SAFETY: By C API contract `ptr` is the pointer we passed when
// enqueing the timer, so it is a `Timer<T>` embedded in a `T`
let data_ptr = unsafe { T::timer_container_of(timer_ptr) };

// SAFETY: This `Arc` comes from a call to `Arc::into_raw()`
let receiver = unsafe { Arc::from_raw(data_ptr) };

T::run(receiver);

bindings::hrtimer_restart_HRTIMER_NORESTART
}
}

/// Use to implement the [`HasTimer<T>`] trait.
///
/// See [`module`] documentation for an example.
///
/// [`module`]: crate::hrtimer
#[macro_export]
macro_rules! impl_has_timer {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A small example here would be good, or link to the module docs for more information (since the macro will show up in the top-level docs)

($(impl$(<$($implarg:ident),*>)?
HasTimer<$timer_type:ty $(, $id:tt)?>
for $self:ident $(<$($selfarg:ident),*>)?
{ self.$field:ident }
)*) => {$(
// SAFETY: This implementation of `raw_get_timer` only compiles if the
// field has the right type.
unsafe impl$(<$($implarg),*>)? $crate::hrtimer::HasTimer<$timer_type> for $self $(<$($selfarg),*>)? {
const OFFSET: usize = ::core::mem::offset_of!(Self, $field) as usize;

#[inline]
unsafe fn raw_get_timer(ptr: *const Self) -> *const $crate::hrtimer::Timer<$timer_type $(, $id)?> {
// SAFETY: The caller promises that the pointer is not dangling.
unsafe {
::core::ptr::addr_of!((*ptr).$field)
}
}

}
)*};
}
1 change: 1 addition & 0 deletions rust/kernel/lib.rs
Original file line number Diff line number Diff line change
Expand Up @@ -35,6 +35,7 @@ extern crate self as kernel;
mod allocator;
mod build_assert;
pub mod error;
pub mod hrtimer;
pub mod init;
pub mod ioctl;
#[cfg(CONFIG_KUNIT)]
Expand Down