
RFC: Some refactoring ideas for Storage client library #1943

Draft: wants to merge 1 commit into base: main
Conversation

reuvenlax

Remove streamWriterToConnection and connectionToWriteStream maps, and instead store this data in the StreamWriter and ConnectionWorker objects themselves. This means that we no longer have to do a map lookup on every call to append().

Instead of using timestamps to determine which StreamWriter objects to send updated schema to, use a registration method. This way only StreamWriters that were created prior to a schema-update callback will get the updated schema (Note: this uses a static map, but could instead be done by updating the StreamWriter directly). I think this preserves the intended semantics from before, but needs a good look. Note: the timestamp approach isn't guaranteed to work, since it's possible for the time to stay the same between StreamWriter creation and the callback (Java does not guarantee that System.nanoTime() actually updates every nanosecond - it provides no guarantee on update frequency).
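A minimal sketch of the registration idea (all names here are hypothetical, not the library's actual API): each StreamWriter registers itself, and the schema-update callback pushes the new schema only to writers that were registered before the callback fired, avoiding the timestamp comparison entirely.

```java
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Sketch of the registration-based alternative to timestamp comparison.
// Writers registered before a schema update receive it; writers created
// afterwards were already built against the new schema.
final class SchemaUpdateRegistry {
  // writer id -> updated schema (stand-in for the static map mentioned above)
  private final Map<String, String> updatedSchemas = new ConcurrentHashMap<>();
  private final Set<String> registeredWriters = ConcurrentHashMap.newKeySet();

  void register(String writerId) {
    registeredWriters.add(writerId);
  }

  // Called from the schema-update callback: deliver the update to every
  // writer registered so far.
  void onSchemaUpdate(String newSchema) {
    for (String writerId : registeredWriters) {
      updatedSchemas.put(writerId, newSchema);
    }
  }

  String getUpdatedSchema(String writerId) {
    return updatedSchemas.get(writerId); // null if no update seen
  }
}
```

The same effect could be achieved by updating each StreamWriter directly instead of keeping a shared map, as the description notes.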

Some other things worth experimenting with in the future:

  • See if we can remove the global lock in ConnectionWorkerPool and instead use local locks in StreamWriter and ConnectionWorker. This might reduce lock contention, but would cause us to always grab two locks instead of one, which might use more CPU. It is unclear which approach is better.
  • Right now ConnectionWorkerPool always checks isOverwhelmed on every call to append and looks for a new stream in that case. If we're in a case where all streams are at their maximum, this might cause a lot of extra CPU usage at the time we can least afford it (when the worker is already overwhelmed!). We should consider throttling this - e.g. maybe we only move a StreamWriter to a new stream if it's been > 1 sec since the last time we checked.
  • As mentioned above, there are other ways of dealing with updatedSchema.
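The throttling idea from the second bullet could look something like the following sketch (hypothetical names, not the library's actual API): gate the expensive isOverwhelmed/find-new-connection logic behind a minimum interval since the last check.

```java
import java.util.concurrent.atomic.AtomicLong;

// Sketch of throttling the per-append overwhelmed check: only re-run the
// expensive scan if at least one second has elapsed since the last check.
final class OverwhelmedCheckThrottle {
  private static final long CHECK_INTERVAL_NANOS = 1_000_000_000L; // 1 second
  // Initialized so the very first call is allowed through.
  private final AtomicLong lastCheckNanos = new AtomicLong(-CHECK_INTERVAL_NANOS);

  // Returns true if the caller should run the overwhelmed check now.
  boolean shouldCheck(long nowNanos) {
    long last = lastCheckNanos.get();
    if (nowNanos - last < CHECK_INTERVAL_NANOS) {
      return false; // checked too recently; skip the expensive scan
    }
    // Under contention, only one racing caller wins the right to check.
    return lastCheckNanos.compareAndSet(last, nowNanos);
  }
}
```

In append() this would be consulted before calling isOverwhelmed, so that when every connection is already at its maximum we do not burn extra CPU rechecking on every single append.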

@reuvenlax reuvenlax requested a review from a team January 19, 2023 18:28
@reuvenlax reuvenlax requested a review from a team as a code owner January 19, 2023 18:28
@conventional-commit-lint-gcf

conventional-commit-lint-gcf bot commented Jan 19, 2023

🤖 I detect that the PR title and the commit message differ and there's only one commit. To use the PR title for the commit history, you can use Github's automerge feature with squashing, or use automerge label. Good luck human!

-- conventional-commit-lint bot
https://conventionalcommits.org/

@product-auto-label product-auto-label bot added size: m Pull request size is medium. api: bigquerystorage Issues related to the googleapis/java-bigquerystorage API. labels Jan 19, 2023
@reuvenlax reuvenlax changed the title Some refactoring ideas for Storage client library RFC: Some refactoring ideas for Storage client library Jan 19, 2023
@reuvenlax reuvenlax marked this pull request as draft January 19, 2023 18:29
@@ -170,6 +184,11 @@ String getWriterId(String streamWriterId) {
return connectionWorker().getWriterId();
}

public void register(StreamWriter streamWriter) {
Contributor

Can be removed?

Author

I don't understand the comment?

// TODO: What if we simply kept an atomic refcount in ConnectionWorker? We could also
// manage the refcount in the callback below to precisely track which connections are being
// used.
currentConnection.getCurrentStreamWriters().add(streamWriter);
Contributor

Is this lock good enough to protect currentConnection?

Author

we're using it to protect currentConnection.getCurrentStreamWriters()

In theory we could put a lock inside of currentConnection, which would give us more granular locking. However, this would also cause a lot more lock/unlock activity (e.g. every call to append would have to take at least two locks), so this change would need measurement to see if it is actually better.
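The more granular alternative mentioned here could be sketched as follows (hypothetical names): each connection guards its own writer set with its own lock instead of relying on the pool-wide lock, at the cost of the append path taking two locks rather than one.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of per-connection locking: the connection's own lock protects its
// writer set, so the pool-wide lock is no longer needed for this mutation.
final class ConnectionWorkerSketch {
  private final ReentrantLock lock = new ReentrantLock();
  private final Set<String> currentStreamWriters = new HashSet<>();

  void addWriter(String writerId) {
    lock.lock();
    try {
      currentStreamWriters.add(writerId);
    } finally {
      lock.unlock();
    }
  }

  int writerCount() {
    lock.lock();
    try {
      return currentStreamWriters.size();
    } finally {
      lock.unlock();
    }
  }
}
```

As the comment says, whether the reduced contention outweighs the extra lock/unlock traffic would need to be measured.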

lock.unlock();
}
});
ConnectionWorker currentConnection;
Contributor

In general, I like this idea. We can reuse the connection more for the same StreamWriter.

TableSchema getUpdatedSchema(StreamWriter streamWriter) {
if (getKind() == Kind.CONNECTION_WORKER) {
return connectionWorker().getUpdatedSchema();
} else {
return connectionWorkerPool().getUpdatedSchema(streamWriter);
Contributor

Does this break the promise that a StreamWriter only sees updates when there is a schema update? I think we should be fine using nanoTime since it is monotonic on the same machine: https://screenshot.googleplex.com/3Qeo9ouZEnehgMR

Author

I don't think nanoTime completely works. It is monotonic, but not strictly increasing - i.e. the current code is broken if the first update has the same nanoTime as the creation time, which is entirely possible.
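The failure mode being described can be illustrated with a tiny sketch (hypothetical names): the timestamp approach delivers an update only when it is strictly newer than the writer's creation time, so a writer created in the same System.nanoTime() "tick" as the update is wrongly skipped even though it existed before the update.

```java
// Illustration of why a strict timestamp comparison can miss an update:
// System.nanoTime() is monotonic but not strictly increasing, so equal
// timestamps are possible even when creation genuinely preceded the update.
final class TimestampCheck {
  // The timestamp approach: deliver only if strictly newer than creation.
  static boolean oldCheckDelivers(long writerCreationNanos, long schemaUpdateNanos) {
    return schemaUpdateNanos > writerCreationNanos;
  }
}
```

When both timestamps are equal, oldCheckDelivers returns false and the writer misses the update, which is exactly the gap the registration approach closes.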

// TODO: Do we need a global lock here? Or is it enough to just lock the StreamWriter?
lock.lock();
try {
currentConnection = streamWriter.getCurrentConnectionPoolConnection();
Contributor

Should we keep multiple (at least 2) connections in order to scale up, and avoid look into the global pool?

ConnectionWorker createdOrExistingConnection = null;
try {
createdOrExistingConnection =
createOrReuseConnectionWorker(streamWriter, currentConnection);
Contributor

I think we still need global lock here.

currentConnection = createdOrExistingConnection;
streamWriter.setCurrentConnectionPoolConnection(currentConnection);
// Update connection to write stream relationship.
// TODO: What if we simply kept an atomic refcount in ConnectionWorker? We could also
Contributor

A refcount would be error prone: a StreamWriter could be switching back and forth between connection workers, meaning one worker could end up recording a single stream writer multiple times if we used a refcount.

Author

I think that would be fine. The refcount removal would happen in the done callback (below in ApiFutures.transform), so it would know exactly which connection worker to decrement even if the stream writer has moved to a different stream.
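A minimal sketch of what this refcount scheme might look like (hypothetical names, not the library's actual API): each append increments the refcount on the specific connection it was sent over, and the done callback holds a reference to that same connection and decrements it, so the count stays correct even if the StreamWriter has since moved elsewhere.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Sketch of an atomic in-flight refcount on a connection worker. acquire()
// is called when an append is dispatched; release() is called from the
// append's done callback, which captured this exact connection instance.
final class RefCountedConnection {
  private final AtomicInteger inFlight = new AtomicInteger();

  int acquire() {
    return inFlight.incrementAndGet();
  }

  int release() {
    return inFlight.decrementAndGet();
  }

  int inFlightCount() {
    return inFlight.get();
  }
}
```

Because the decrement targets the captured connection instance rather than whatever connection the writer currently points at, a writer bouncing between workers does not corrupt any worker's count.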

3 participants