Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Portable Executor port to Rust #532

Open
wants to merge 15 commits into
base: main
Choose a base branch
from
Open

Portable Executor port to Rust #532

wants to merge 15 commits into from

Conversation

nickpdemarco
Copy link
Member

This PR reimplements stlab::priority_task_system in Rust.

Usage

The rust port is chosen with a cmake argument, as in:

cmake -S . -B ../BUILD -GNinja -DCMAKE_CXX_STANDARD=17 -DCMAKE_BUILD_TYPE=Release -DSTLAB_TASK_SYSTEM=experimental_rust

The build will require an installation of cargo; otherwise, it should work out of the box.

To test this implementation, build with cmake as above, and then run ctest from the build directory.

Implementation

For greater readability and easier-to-prove safety, the C++ implementation has been broken into four components:

DropJoinThreadPool

Like scoped_threadpool, but the scope is tied to an object's lifetime, rather than a function call.

CoarsePriorityQueue

A priority queue with three priorities.

NotificationQueue

A threadsafe CoarsePriorityQueue.

Waiter

A faithful copy of stlab::detail::waiter. It may be prudent to rewrite this in terms of the Rust-native thread::park

PriorityTaskSystem

Building on the above components, this struct contains the algorithm essential to the original C++ implementation. Namely (quoting the inline documentation):

/// A portable work-stealing task scheduler with three priorities.
///
/// This scheduler spins up a number of threads corresponding to the amount of parallelism available
/// on the target platform, namely, std::thread::available_parallelism() - 1. Each thread is
/// assigned a threadsafe priority queue. To reduce contention on push and pop operations, a thread 
/// will first attempt to acquire the lock for its own queue without blocking.
/// If that fails, it will attempt the same non-blocking push/pop for each other priority queue in
/// the scheduler. Finally, if each of those attempts also fail, the thread will attempt a blocking
/// push/pop on its own priority queue.
///
/// The `add_thread` API is intended to mitigate the possibility of deadlock by spinning up a new
/// worker thread that non-blockingly polls all of the system's priority queues, and then sleeps
/// until `wake()` is called.

@sean-parent
Copy link
Member

The usage instructions should be added to the readme file.

/// @tparam F function object type
/// @param f a function object.
/// @return the result of calling `execute`.
template <class F, typename = std::enable_if_t<std::is_invocable_r<void, F>::value>>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use is_nothrow_invokable_v<F> and you can eliminate the try/catch.


namespace rust {

/// @brief Asynchronously invoke f on the rust-backed executor.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our documentation system is Hyde based - Although Hyde 2.0 has some support for doxygen style comments I'm not sure how this will play. The Hyde 2.0 work is currently in my fork https://github.com/sean-parent/stlab-libraries.

@@ -91,6 +91,10 @@ if (STLAB_TASK_SYSTEM STREQUAL "libdispatch")
target_link_libraries(stlab INTERFACE libdispatch::libdispatch)
endif()

if (STLAB_TASK_SYSTEM STREQUAL "experimental_rust")
add_subdirectory( rustport )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we maybe call this directory just rust? I'm not sure to what extent something is a port once the port is complete :)

@@ -24,7 +24,7 @@ stlab_detect_thread_system(STLAB_DEFAULT_THREAD_SYSTEM)
set( STLAB_THREAD_SYSTEM ${STLAB_DEFAULT_THREAD_SYSTEM} CACHE STRING "Thread system to use (win32|pthread|pthread-emscripten|pthread-apple|none)")

stlab_detect_task_system(STLAB_DEFAULT_TASK_SYSTEM)
set(STLAB_TASK_SYSTEM ${STLAB_DEFAULT_TASK_SYSTEM} CACHE STRING "Task system to use (portable|libdispatch|windows).")
set(STLAB_TASK_SYSTEM ${STLAB_DEFAULT_TASK_SYSTEM} CACHE STRING "Task system to use (portable|libdispatch|windows|experimental_rust).")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this whole library is experimental I wonder if we should call this task system portable_rust.

@@ -99,6 +103,7 @@ elseif (STLAB_MAIN_EXECUTOR STREQUAL "qt6")
target_link_libraries( stlab INTERFACE Qt6::Core )
endif()

message(STATUS "stlab: Use Boost C++17 Shims: ${STLAB_USE_BOOST_CPP17_SHIMS}")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Leftover cruft? I think this variable was removed at one point.

@@ -123,6 +128,10 @@ if ( BUILD_TESTING )
stlab::development
stlab::stlab )

if (STLAB_TASK_SYSTEM STREQUAL "experimental_rust")
target_link_libraries( testing INTERFACE RustyDefaultExecutor )
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm confused that this is linked to testing instead of it being transitively brought in with stlab::stlab

HEADER_NAME "bindings.h"
)

add_library(RustyDefaultExecutor INTERFACE)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe RustPortableDefaultExecutor?

[lib]
crate-type = ["staticlib"]

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mabe remove the above line?

# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html

[dependencies]
once_cell = "1.18.0"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd love to understand why this is used. I tried looking at the once_cell docs but don't remember the difference between &T and Ref

#include <type_traits>
#include <utility>

namespace rust {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this maybe live in the stlab namespace somewhere?

@@ -0,0 +1,40 @@
#include "bindings.h"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the code in this file belong in stlab/concurrency/default_executor.hpp?

/// `ThreadsafeCFnWrapper` may not rely on thread-local state.
unsafe impl Send for ThreadsafeCFnWrapper {}

/// Enqueues a the execution of `f(context)` on the PriorityTaskSystem.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why doesn't this function return void?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And until it does return void, you need to describe the return value.


namespace rust {

/// @brief Asynchronously invoke f on the rust-backed executor.
Copy link
Contributor

@dabrahams dabrahams Dec 8, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// @brief Asynchronously invoke f on the rust-backed executor.
/// @brief Asynchronously invokes f on the rust-backed executor.

Descriptive, not prescriptive. But you also need to describe the result.

});
}

/// @brief Asynchronously invoke `f` on the rust-backed executor with priority `p`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// @brief Asynchronously invoke `f` on the rust-backed executor with priority `p`.
/// @brief Asynchronously invokes `f` on the rust-backed executor with priority `p`.

/// @param priority
/// @return the value returned by `execute_priority`.
template <class F, class = std::enable_if_t<std::is_invocable_r< void, F >::value> >
auto enqueue_priority(F f, Priority p) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name reads like something that is enqueueing a priority.

Suggested change
auto enqueue_priority(F f, Priority p) {
auto enqueue_at_priority(F f, Priority p) {

/// @tparam F function object type
/// @param f a function object.
/// @param priority
/// @return the value returned by `execute_priority`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is execute_priority?

IMO all these @-items are—or should be—redundant with the summary or signature you've written and should be omitted. I would only write these kinds of clauses when the summary and the signature can't say everything that needs to be said. You need to add a , returning _______ clause to the summary but I don't know what to suggest because of the opening question of this comment ;-)

use stlab::{Priority, PriorityTaskSystem};
mod stlab;

/// A static instance of the task system which is invoked through the `execute` functions below.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// A static instance of the task system which is invoked through the `execute` functions below.
/// A static instance of the task system that is invoked through the `execute` functions below.

s/which/that/; which is for after a comma and would make this mean the"the task system" (not the instance) is invoked.

Suggested change
/// A static instance of the task system which is invoked through the `execute` functions below.
/// An instance of the task system that is invoked through the `execute` functions below.

Don't repeat stuff like static that is obvious from the declaration, which must appear in the generated documentation.

Suggested change
/// A static instance of the task system which is invoked through the `execute` functions below.
/// A task system that is invoked through the `execute` functions below.

"An instance" is almost always vacuous.

Is this not “the” task system, or is it possible and useful to make another instance?

Suggested change
/// A static instance of the task system which is invoked through the `execute` functions below.
/// The task system.

or, if this is really internal to the module, you might go with:

Suggested change
/// A static instance of the task system which is invoked through the `execute` functions below.
/// The shared state of the task system.

mod stlab;

/// A static instance of the task system which is invoked through the `execute` functions below.
static TASK_SYSTEM: Lazy<Mutex<PriorityTaskSystem>> =
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be made const?


/// A static instance of the task system which is invoked through the `execute` functions below.
static TASK_SYSTEM: Lazy<Mutex<PriorityTaskSystem>> =
Lazy::new(|| Mutex::new(PriorityTaskSystem::new()));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wow, unary ||. Can you explain what that means?

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a lambda taking no arguments

Lazy::new(|| Mutex::new(PriorityTaskSystem::new()));


/// A function pointer paired with a context, akin to a C++ lambda and its captures.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a feeling that there's an abstraction here that you're obscuring with both the name and the doc. Why isn't this just, e.g., ThreadsafeLambda? Is it OK if the wrapped thing is threadsafe but impure? Just askin'.


/// Enqueues a the execution of `f(context)` on the PriorityTaskSystem.
///
/// Precondition: Neither `context` nor `fn_ptr` may rely on thread-local state.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
/// Precondition: Neither `context` nor `fn_ptr` may rely on thread-local state.
/// Precondition: Neither `context` nor `fn_ptr` relies on thread-local state.

But it's very unclear exactly exactly what that might mean. I can roughly guess for fn_ptr (the doc should be more precise) but for context it is a true mystery. What does it mean for a void pointer to rely on anything?

0
}

/// Enqueues a the execution of `f(context)` on the PriorityTaskSystem at the given `priority`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to describe the return value.

use std::cmp::Ordering;

/// A priority queue with three priorities. Elements with equal priorities are popped in FIFO order.
pub struct CoarsePriorityQueue<T> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
pub struct CoarsePriorityQueue<T> {
pub struct Stable3PriorityQueue<T> {

@@ -0,0 +1,120 @@
use std::cmp::Ordering;

/// A priority queue with three priorities. Elements with equal priorities are popped in FIFO order.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To me it feels like this description is a little backwards, and it's a FIFO queue with 3 priorities. 🤷‍♂️ I realize they are technically equivalent.

}

unsafe impl Send for Waiter {}
unsafe impl Sync for Waiter {}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should be unnecessary (Send and Sync are auto traits)

threads: Vec<JoinHandle<()>>,
// SAFETY: We store this pointer as a `*mut` so it can be dropped via Box::from_raw, but
// we only ever hand out immutable references to the pointee.
data: *mut T,
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the usage of this is safe as currently done. Note though when you create a reference from a pointer, you will infer whatever lifetime you want/need to that reference. This can cause use after free: https://play.rust-lang.org/?version=stable&mode=debug&edition=2021&gist=c117639943042e370cc9c4ea225e8412

Storing, and also sharing an Arc would make this a non-issue.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants