Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-linearizable behavior in cancel + awaitClose inside of produce #4149

Open
globsterg opened this issue Jun 4, 2024 · 1 comment
Open
Labels

Comments

@globsterg
Copy link
Contributor

Describe the bug

The cancel operation on the channel returned by the produce function, when used together with awaitClose, leads to a data race:

  • If cancel only manages to cancel the channel, awaitClose returns normally.
  • If cancel also cancels the job of the produce coroutine, awaitClose throws a CancellationException.

In most cases, the code after awaitClose inside produce will not execute if the channel gets cancelled, and it's possible that someone could start relying on this behavior, even though it is not guaranteed.

It looks like cancelling the coroutine first and cancelling the channel later inside cancel may fix this particular bug, but I don't understand if the current order of operations, too, has its upsides.

I do not know if this actually affects anyone, I discovered this analytically while working on #4148

Provide a Reproducer

In the current develop branch, add this to ProduceTest.kt:

@Test
fun produceAwaitCloseStressTest() = runTest {
    repeat(100) {
        coroutineScope {
            val c = produce<Int>(Dispatchers.Default) {
                try {
                    awaitClose()
                    println("Normal exit")
                } catch (e: Exception) {
                    println("Exception $e")
                    throw e
                }
            }
            launch(Dispatchers.Default) {
                c.cancel()
            }
        }
    }
}

I get both exceptions and (much more rarely) normal exits reported when I run this.

@globsterg globsterg added the bug label Jun 4, 2024
globsterg added a commit to globsterg/kotlinx.coroutines that referenced this issue Jun 4, 2024
@fzhinkin
Copy link

fzhinkin commented Jun 6, 2024

@globsterg, thanks for reporting the issue!

In most cases, the code after awaitClose inside produce will not execute if the channel gets cancelled, and it's possible that someone could start relying on this behavior, even though it is not guaranteed.

Indeed, docs state that no code after canceled awaitClose will be executed:

Therefore, in case of cancellation, no code after the call to this function will be executed.

dkhalanskyjb pushed a commit that referenced this issue Jul 30, 2024
* Reword the BufferOverflow KDoc for consistency in the entry list

Before, the description of `SUSPEND` was phrased in terms of what
will happen, while the rest of the entries were described in an
imperative form, that is, as commands as to what should happen.
Now, all entries are clarified using a descriptive form.

* Describe the situations in which BufferOverflow options are useful

* Expand the documentation for channel consumption functions

Added explanations of what exactly happens on each code path,
how these operators ensure that all elements get processed
eventually, and provided some usage examples.

* Specify the behavior of Channel.consumeEach on scope cancellation

* Extend the documentation for `ProducerScope.awaitClose`

Filed #4149

* Reword a misleading statement in the `produce` documentation

Currently, the documentation states that uncaught exceptions will
lead to the channel being closed.
"Uncaught exceptions" is a special thing in kotlinx.coroutines:
<https://kotlinlang.org/docs/exception-handling.html#coroutineexceptionhandler>
These are not just exceptions that are not wrapped in a try-catch,
these are exceptions that can not be propagated to a root coroutine
via structured concurrency.

Fixed the wording and added a test that shows that uncaught
coroutine exceptions are not handled in any special manner.

* Document `awaitClose` and `invokeOnClose` interactions

Turns out, only a single invocation of either `awaitClose` or
`invokeOnClose` is allowed in the lifetime of a channel.
Document that.

* Document how consuming operators handle failed channels

* Document cancelling the coroutine but not the channel of `produce`

* Don't use the magic constant 0 in default parameters of `produce`

Instead, use `Channel.RENDEZVOUS` so that a meaningful constant
is shown in Dokka's output.

* Fix an incorrect statement in `produce` docs

Currently, the docs claim that attempting to receive from a failed
channel fails. However, the documentation for `Channel` itself
correctly states that before `receive` fails, the elements that
were already sent will be processed first. Corrected this and
added a test demonstrating the behavior.

* Add samples to the `produce` documentation and restructure it
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants