OOM crash in ~30 seconds after Terminal.initTerminal invocation #436

Closed
SergeyVinyar opened this issue Mar 4, 2024 · 8 comments

Comments

@SergeyVinyar

SergeyVinyar commented Mar 4, 2024

Summary

For our not-yet-published app, we upgraded the SDK from 2.16.0-b1 to 3.2.1 and started seeing crashes about 30 seconds after invoking Terminal.initTerminal(...), with the following stack trace:

Fatal Exception: java.lang.OutOfMemoryError: Failed to allocate a 1445090328 byte allocation with 11213247 free bytes and 181MB until OOM, target footprint 22426495, growth limit 201326592
       at com.squareup.tape2.QueueFile$ElementIterator.next(QueueFile.java:549)
       at com.squareup.tape2.QueueFile$ElementIterator.next(QueueFile.java:514)
       at com.stripe.jvmcore.batchdispatcher.collectors.QueueFileCollector$peek$2.invokeSuspend(QueueFileCollector.kt:1855)
       at kotlin.coroutines.jvm.internal.BaseContinuationImpl.resumeWith(ContinuationImpl.kt:33)
       at kotlinx.coroutines.DispatchedTask.run(DispatchedTask.kt:108)
       at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1167)
       at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:641)
       at java.lang.Thread.run(Thread.java:923)

Often we can make a successful test payment with a card reader but sometimes the app falls into a loop:

  1. A user starts the app
  2. The app invokes Terminal.initTerminal(...) in Application.onCreate
  3. After 30 seconds - crash (no other Stripe SDK methods are invoked)

This repeats until the app is uninstalled and reinstalled.
Every exception in the loop reports the same allocation size (1445090328 bytes in this example).

Code to reproduce

Unfortunately, we couldn't find a scenario to reproduce it.

Android version

11, 12

Impacted devices (Android devices or readers)

Galaxy Tab A7 Lite
Galaxy Tab S6 Lite

SDK version

3.2.1
Also, when we got into this crash loop, we tried downgrading to 3.0.0 without removing the app from the device, and it still crashed.

Other information

I'm sure this is the same problem as in #434 but in our case we didn't use NFC. Galaxy Tab A7 Lite doesn't have an NFC reader at all.

I suspect the cause is that QueueFile (https://github.com/square/tape/blob/master/tape/src/main/java/com/squareup/tape2/QueueFile.java) is not thread-safe, while the stack trace shows it being used from coroutines, which may resume on a different thread after each suspension point. As a result, the temporary file that QueueFile creates sometimes gets corrupted.
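
To illustrate the corruption hypothesis, here is a minimal sketch of how a damaged length prefix could produce the huge allocation in the stack trace. It assumes the queue file stores each element behind a 4-byte big-endian length header. `readLengthHeader`, `isPlausibleLength`, and `MAX_ELEMENT_BYTES` are hypothetical names for this example, not part of tape or the Stripe SDK, and the 4096-byte remaining file size is made up.

```kotlin
import java.nio.ByteBuffer

// Hypothetical cap for this sketch; not a value taken from tape or the Stripe SDK.
const val MAX_ELEMENT_BYTES = 1024 * 1024 // 1 MiB

/** Reads a 4-byte big-endian length header, as a length-prefixed queue file would. */
fun readLengthHeader(header: ByteArray): Int {
    require(header.size == 4) { "length header must be exactly 4 bytes" }
    return ByteBuffer.wrap(header).int // ByteBuffer is big-endian by default
}

/** Returns true when a declared length is plausible given the bytes left in the file. */
fun isPlausibleLength(length: Int, bytesRemaining: Long): Boolean =
    length in 0..MAX_ELEMENT_BYTES && length.toLong() <= bytesRemaining

fun main() {
    // Under the big-endian assumption, these four bytes decode to the
    // allocation size reported in the crash (1445090328).
    val corrupted = byteArrayOf(0x56, 0x22, 0x54, 0x18)
    val declared = readLengthHeader(corrupted)
    println("declared element length: $declared bytes")

    // A naive reader would now allocate ByteArray(declared) and hit the OOM.
    // Checking first lets the caller discard or reset the file instead.
    val fileBytesRemaining = 4096L // assumed size of the rest of the file
    println("plausible: ${isPlausibleLength(declared, fileBytesRemaining)}")
}
```

Because the file persists across launches, a header corrupted once would be read again on every start. That would be consistent with the identical byte count in every crash and with the loop ending only after a reinstall.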

We invoke:

  • Terminal.initTerminal(...) on a default dispatcher
  • terminal.discoverReaders(...) on a default dispatcher and sometimes on an Android main thread
  • terminal.connectBluetoothReader(...) on an Android main thread (a user clicks "Connect")
  • terminalService.collectPayment(...) on a default dispatcher
@jasells

jasells commented Mar 4, 2024

> I'm sure this is the same problem as in #434 but in our case we didn't use NFC. Galaxy Tab A7 Lite doesn't have an NFC reader at all.

We have seen it at times with BT, but thought it was a dependency issue, as after updating some of the other packages, it seemed to go away.

This thread from the tape2 repo seems to indicate it is a multi-thread contention issue, I think? So why it comes and goes between builds, I'm not sure.

@SergeyVinyar Have you succeeded in creating a simpler repro app?

@jasells

jasells commented Mar 4, 2024

> 1. A user starts the app
> 2. The app invokes Terminal.initTerminal(...) in Application.onCreate
> 3. After 30 seconds - crash (no other Stripe SDK methods are invoked)

That also fits with #434, as we are calling Terminal.initTerminal(...) and then immediately connecting the NFC reader, so it looks like it is just the timing of ~30 sec after initTerminal()?

@SergeyVinyar
Author

> @SergeyVinyar Have you succeeded in creating a simpler repro app?

No, it's sporadic, and the app also needs a token from the backend to make a payment (and it looks like we don't hit this issue without a payment). I can't just make a small app that shows the bug.
I think that if we run all SDK method calls in coroutines on the same single-threaded dispatcher, it may help, but I haven't tried it yet.

@SergeyVinyar
Author

Something like:

val Dispatchers.Stripe: CoroutineDispatcher by lazy {
    Executors.newSingleThreadExecutor().asCoroutineDispatcher()
}

and then:

GlobalScope.launch(Dispatchers.Stripe) {
    Terminal.initTerminal(...)
}
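
A slightly fuller, runnable sketch of the same idea, assuming kotlinx.coroutines is on the classpath. `sdkCall` and `runSerialized` are hypothetical stand-ins rather than Stripe SDK functions; the sketch only demonstrates that every call routed through one single-threaded dispatcher runs on the same thread.

```kotlin
import java.util.concurrent.Executors
import kotlinx.coroutines.asCoroutineDispatcher
import kotlinx.coroutines.runBlocking
import kotlinx.coroutines.withContext

// Hypothetical stand-in for an SDK call; it only reports the thread it ran on.
fun sdkCall(name: String): String = "$name on ${Thread.currentThread().name}"

/** Runs each named call on one shared single-threaded dispatcher and returns the thread names used. */
fun runSerialized(calls: List<String>): Set<String> {
    // One named thread shared by all calls, so they can never overlap.
    val executor = Executors.newSingleThreadExecutor { r -> Thread(r, "stripe-sdk") }
    val dispatcher = executor.asCoroutineDispatcher()
    try {
        return runBlocking {
            calls.map { call ->
                withContext(dispatcher) {
                    println(sdkCall(call))
                    Thread.currentThread().name
                }
            }.toSet()
        }
    } finally {
        dispatcher.close() // also shuts down the underlying executor
    }
}

fun main() {
    // Every call reports the same thread name.
    println(runSerialized(listOf("initTerminal", "discoverReaders", "collectPayment")))
}
```

This only serializes the app's own calls. The stack trace points at code inside the SDK (QueueFileCollector), which may schedule its own work on other threads, so whether this avoids the crash is untested here.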

@jasells

jasells commented Mar 5, 2024

That sounds like a good idea... I'll try that too, as soon as I get some time. I'll post any updates to my issue thread, and I'll watch for anything from you in case you get to it first.

@ugochukwu-stripe
Contributor

Hi @SergeyVinyar, we rolled out a fix to mitigate this problem in 3.3.1. Could you retry with the latest release and let us know if you're still seeing this error?

@SergeyVinyar
Author

Hi @ugochukwu-stripe, thanks a lot. I'll check.

@SergeyVinyar
Author

We've done smoke testing. Everything works. Thank you!
