You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
On aarch64 machines, go processes crash intermittently with a "bad g". This has only been seen on aarch64. This was uncovered in go integration tests (go_20) in test cases for "signalHandlerStatic" and "signalHandlerStaticStripped". By adding a go routine to constantly write to the console, these test now fail within a small number of iterations (on this branch). To see them ASAP, I'd recommend commenting out the other test cases and running the go_20 signalHandler tests in a loop to see the "bad g" failure.
We may not have a perfect understanding, but we believe that when a signal happens while we are on the switched stack (for pcre2), go tries to retrieve the "g" from the stack. It's not going to find "g" on the stack when we're currently on our own stack executing pcre2 code, so go crashes. Without the new go routine that writes to the console, the "bad g" happens very infrequently, but importantly "bad g" in the signalHandlerStatic tests has also been observed on branches that do not have the stack pool implementation here. To the best of our knowledge, we could see this any time a signal handler interrupts us while we're running our c code.
On aarch64 machines, go processes crash intermittently with a "bad g". This has only been seen on aarch64. This was uncovered in go integration tests (go_20) in test cases for "signalHandlerStatic" and "signalHandlerStaticStripped". By adding a go routine to constantly write to the console, these test now fail within a small number of iterations (on this branch). To see them ASAP, I'd recommend commenting out the other test cases and running the go_20 signalHandler tests in a loop to see the "bad g" failure.
We may not have a perfect understanding, but we believe that when a signal happens while we are on the switched stack (for pcre2), go tries to retrieve the "g" from the stack. It's not going to find "g" on the stack when we're currently on our own stack executing pcre2 code, so go crashes. Without the new go routine that writes to the console, the "bad g" happens very infrequently, but importantly "bad g" in the signalHandlerStatic tests has also been observed on branches that do not have the stack pool implementation here. To the best of our knowledge, we could see this any time a signal handler interrupts us while we're running our c code.
If this description is correct, we think a possible solution is to mimic how go knows if c code is currently running.
https://github.com/golang/go/blob/master/src/runtime/cgocall.go
https://github.com/golang/go/blob/master/src/runtime/asm_arm64.s
The text was updated successfully, but these errors were encountered: