
feat: Transaction Support for Indexer runFunction Invocations #432

Closed
wants to merge 9 commits from the transactionSupport branch

Conversation

@darunrs (Collaborator) commented Nov 28, 2023

Currently, partial failure of database actions is permitted. This can cause non-deterministic behavior depending on the state of the user's database. To address this, we want to wrap all database calls in a transaction and commit the changes only if no errors occurred.

Users may want to handle errors themselves in some cases, so we only roll back the transaction if an uncaught error surfaces to the runner, i.e., one that was not caught by user code.

In order to support transactions, every query made during a single invocation of user code must go through the same client, and that client should be released after the commit. Because of this, we need to do away with pools, as we no longer expect any indexer to use more than one client at any point in time.
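
As a rough sketch of that flow (not the actual runner code; `runUserCode` is a placeholder for executing the indexer's code, and this assumes node-postgres):

```typescript
import { Client } from 'pg';

// Sketch only: every query from one invocation of user code goes through a
// single client, changes are committed only if the user code finishes without
// an uncaught error, and the connection is released afterwards.
async function runInTransaction (client: Client, runUserCode: () => Promise<void>): Promise<void> {
  await client.connect();
  try {
    await client.query('BEGIN');
    await runUserCode(); // all context.db calls are routed through `client`
    await client.query('COMMIT');
  } catch (error) {
    // An uncaught error surfaced to the runner: discard every write in this invocation.
    await client.query('ROLLBACK');
    throw error;
  } finally {
    await client.end(); // release the connection once the invocation is done
  }
}
```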

One problem is that the database can only support a limited number of active clients at a time, so frequent, busy transactions from competing indexers will eventually push the number of active connections high enough to induce throttling. We will therefore want to introduce a solution like PgBouncer soon. Running `SELECT sum(numbackends) FROM pg_stat_database;` in psql locally, I observed as many as 84 active database connections, which is close to my local DB's max of 97.

I tried my best to reproduce this behavior by backing up many blocks for all indexers, as well as running a 90k+ block historical backfill for an indexer that makes simple context.db invocations. I did observe a bit of throttling but did not observe any "too many clients" errors. This is likely because indexers no longer each have their own pool, but instead a single client they make connect calls with.

darunrs commented Nov 28, 2023

The latest commit has the test code I used to validate switching from a pool to a client. Running `yarn test dml` will run the DmlHandler tests, including the extra test I added at the top. You'll see it overloads the DB; at least it does on my machine. If you move the `DmlHandler.create` call outside the for loop, the problem goes away.
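
The failure mode that test exercises is roughly the following (sketched here against raw pg clients rather than the actual DmlHandler API):

```typescript
import { Client } from 'pg';

// Sketch of the failure mode: opening a new connection on every iteration
// (analogous to calling DmlHandler.create inside the loop) without releasing
// them eventually exceeds Postgres' connection limit, and connect() rejects
// with a "too many clients" error.
async function overloadConnections (connectionString: string): Promise<void> {
  const clients: Client[] = [];
  try {
    for (let i = 0; i < 100; i++) {
      const client = new Client({ connectionString });
      await client.connect();
      clients.push(client);
    }
  } finally {
    // Hoisting client creation out of the loop (one shared connection) avoids this.
    await Promise.all(clients.map(async (client) => await client.end()));
  }
}
```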

There were some problems with this approach:

  1. I felt there was no point in creating a pool instance for each indexer on launch if all it's going to do is manage a single client anyway.
  2. Every single indexer would get a pool. I couldn't find any documentation on a maximum pool count, but each pool consumes resources either way, and that overhead felt wasteful. If there is a maximum, we now have another boundary condition to worry about. That said, we could instead create a client upon Indexer creation, since a client is more lightweight, and simply call connect and release on it. I can experiment with that to confirm the approach has no problems.
  3. Since every indexer has a pool, the original case of 100 successive invocations of DmlHandler.create (not even simultaneous ones) would come to pass once we have enough active indexers. The issue is that instead of possibly waiting a while to connect, we get an unrecoverable error. That does give me an idea, though: concretely verify that my code does not produce that kind of error when enough connect invocations are made. Will test that.

So, given that, I decided to have each indexer create a client on creation but only connect with it and release it when needed; a minimal sketch of that lifecycle is below.
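
The class and method names here are illustrative only, not the actual runner code. Note that a node-postgres Client cannot reconnect after end(), so this sketch holds the connection config and opens a fresh Client on each connect:

```typescript
import { Client, type ClientConfig } from 'pg';

// Illustrative sketch: the indexer holds the connection config, lazily opens a
// short-lived client when an invocation needs the database, and releases it
// once the invocation finishes.
class IndexerDbConnection {
  private client: Client | null = null;

  constructor (private readonly config: ClientConfig) {}

  async connect (): Promise<Client> {
    this.client = new Client(this.config);
    await this.client.connect();
    return this.client;
  }

  async release (): Promise<void> {
    if (this.client !== null) {
      await this.client.end();
      this.client = null;
    }
  }
}
```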

darunrs commented Nov 28, 2023

We had a team discussion over the changes. Relevant points:

  1. This changes largely invisible, implicit behavior of all indexers. The scope is wide and the detectability is low, so a change like this deserves a lot of scrutiny before release.
  2. PgBouncer (or something like it) is more or less guaranteed as the next step. There's a strong case to simply wait and release this kind of change in its best possible form, which means implementing a PgBouncer solution.
  3. Integration tests are sorely lacking here, which makes it hard to build confidence that changes don't negatively impact things in ways a unit test may not catch.
  4. There's a trade-off between velocity and completeness in feature development: waiting for a fully featured changeset including PgBouncer versus releasing this now and shipping a PgBouncer update later.

@darunrs linked an issue Dec 1, 2023 that may be closed by this pull request
darunrs commented Mar 18, 2024

Will re-implement later.

@darunrs closed this Mar 18, 2024
@darunrs deleted the transactionSupport branch March 18, 2024 17:57
Successfully merging this pull request may close these issues.

Support Transactions for Indexer Code Execution