Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No bounded version of ws_deque and mpsc_queue? #52

Open
UnixJunkie opened this issue Jan 6, 2023 · 9 comments
Open

No bounded version of ws_deque and mpsc_queue? #52

UnixJunkie opened this issue Jan 6, 2023 · 9 comments

Comments

@UnixJunkie
Copy link

In a long-running program, things should not be able to grow indefinitely.
It would be nice if those data structures have a bounded counterpart.
I.e. you create the data structure with a maximum size parameter N (the data structure can contain
up to N elements maximum).
Once N is reached, all operations which try to grow the data structure should block until some space is available.
UNIX pipes have such a behavior IIRC.
Not only it prevents things from going out of hand, but it also allows some simple/natural synchronization
between parallel workers.

@lyrm
Copy link
Collaborator

lyrm commented Jan 11, 2023

Your point is valid and yes, we don't have these kind of data structures right now. We could propose an alternate version of ws_deque (see below) but for mpsc_queue with the current algorithm, trying to have a bounded size will significantly impact the performances as it would require to add an size atomic variable that is accessed for read and write at every push and pop, creating a lot of contention. That will significantly diminish the performances of the queue.

For the ws_deque, we may be able to provide a less inefficient solution, as the size of the structure is easily computable, We could than have the push operation raise an exception if the limit is reached. No blocking operation, however, since we want our function to be lock free.

@UnixJunkie
Copy link
Author

Blocking is not necessarily bad, it can be a mean of synchronization.
Example: single writer, many readers (parallel workers).
The writer blocking to write means all workers are busy and the queue is full.
If the queue was initially sized at 50*|workers|, there is no performance problem in blocking the writer.
While it is a good idea: it prevents from filling up memory when creating jobs is very fast
but completing one is computationally heavy.

@UnixJunkie
Copy link
Author

Note that I am aware of Domainslib.Chan, which have a make_bounded constructor.
So that's what I am currently using in parany: https://github.com/UnixJunkie/parany/blob/master/src/parany.ml

@lyrm
Copy link
Collaborator

lyrm commented Jan 12, 2023

I am not telling that blocking is bad but this lib is called lockfree and so it provides functions that are, by definition, non-blocking.

It is then the responsibility of the user to decide to block or add lock when they see fit. In this case, it would be very easy to do so by properly catching the exception. Also, it gives the user the possibility to act however they want when it happens.

@lyrm
Copy link
Collaborator

lyrm commented Jan 12, 2023

But I am guessing what you want is a passive wait. I will try to spend some times on it next week and see if I can propose something that will work without costing to much (at least for the ws_deque).

@UnixJunkie
Copy link
Author

This "lockfree" word is very misleading. I understand people are using hardware-supported instructions, I guess in the CPU then there is some kind of semaphore.

@UnixJunkie
Copy link
Author

Yes, I have one use-case where the only writer wants to passive-wait if the data structure is full.
This is a one writer to many readers communication scheme.

I have another use-case where the only reader wants to passive-wait if the data structure is empty.
This is a many writers single-reader communication scheme.

@Sudha247
Copy link
Collaborator

Sudha247 commented Jan 13, 2023

This "lockfree" word is very misleading

I don't agree that it's misleading. Lockfree programming has been around for a while and practised by folks doing concurrent programming even in other programming languages supporting parallelism. The Wikipedia page summarises the definition well. We've also linked the relevant literature and source papers for the data structures in this library, wherever applicable. That said, we could perhaps make this clear in our documentation.

Blocking is not necessarily bad, it can be a mean of synchronization.

You're right in that many a times one could have an easy and resonably performant solution with blocking algorithms. As @lyrm mentioned above, it's outside the scope of this library to host blocking structures. A better place for this is perhaps domainslib, or a new library for concurrent data structures under ocaml-multicore.

@polytypic
Copy link
Contributor

I developed a draft implementation of lock-free bounded queue, see #83. It is based on the Michael-Scott queue and keeps track of the length of the queue and a cache of the remaining capacity and avoids additional contention in the happy paths. Unless the capacity is made really small, performance should be good.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants