Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sync ephemeral presence data #207

Open
wants to merge 26 commits into
base: master
Choose a base branch
from
Open

Conversation

gkubisa
Copy link
Contributor

@gkubisa gkubisa commented Apr 26, 2018

This PR adds OT-aware ephemeral data sync as described by @nateps here.

How It Works

Server

The server-side changes are trivial because I reuse all the existing infrastructure for subscribing to docs and publishing ops. So, when you're subscribed to a document, you can publish and receive presence data. The server is essentially a PubSub system for presence data. Apart from that, its other important function is to broadcast null presence data on behalf of the client when it unsubscribes or disconnects, so that its peers know the client is gone.

The only issue I see with this solution is that if a server crashes or PubSub (eg Redis) fails, then the null presence would not be delivered and the peers would not know the client is gone. Considering that this is going to be rare in practice and user impact is likely minimal, I'm not sure, if it is worth the effort. Anyway, I think the solution is to simply expire peer presence data after some time, if they don't send regular heartbeat messages.

Client

Handling Current Presence Data

The current local (this client) and remote (peers) presence data is in doc.presence, which is also a part of the public API. Whenever any operation is applied to the local document, doc.presence is automatically transformed, so that for example a cursor would move forward by one character, if a user typed one character. In case of a hard rollback this data is wiped and syncing must start from scratch.

Publishing Local Presence Data

Publishing is delayed until the document is subscribed to and all pending writes have completed. When we're finally ready to publish, we can simply send the local data stored in doc.presence.

When publishing for the first time, or after subscribe, or reconnect, or hard rollback, a special flag (r - requestReplyPresence) is set on the message to ask all peers to publish their own presence data as soon as possible. This allows new clients to initialize. This system could be optimized a bit by sending the "reply" presence only to the client who requested it, instead of broadcasting. I left it out for now because I'm not sure if it is worth the effort.

Receiving Peer Presence Data

There are a few scenario to handle here...

  1. If we get an out-of-sequence message, we ignore it.
  2. If we get a null presence data from the server (version == null - peer disconnected or unsubscribed), we clear the relevant data in doc.presence.
  3. If we get presence data for a future version of the document, we try to fetch the missing ops first.
  4. If we get presence data for the current version of the document, we transform it against any inflight and pending ops and then store in doc.presence.
  5. If we get presence data for an older version of the document, we transform it against the old ops, inflight op and pending ops, and then store the result in doc.presence.

A few things to note:

  • transforming presence against create, del or no-op makes presence data null
  • full presence data is sent every time. It greatly simplifies the implementation, and should have minimal impact on performance, considering that presence data is expected to be small in terms of bytes.
  • I cache past ops for 1 minute to enable transformation of presence data, if necessary. Considering that presence data is supposed to be real-time, I think we can safely discard that which arrives for versions more than 1 minute old.
  • I cache received presence data for 1 minute, so that I can compare the sequence numbers of newly received presence data and reject out-of sequence messages. In practice this should eliminate the risk of race conditions without introducing a risk of a memory leak (caused by peers constantly sending presence data and disconnecting).

The problem was that unsubscribe re-added the doc to the connection.
Now the doc is removed from the connection after unsubscribe.
Additionally, we're no longer waiting for the unsubscribe response
before executing the callback. It is consistent with Query,
unsubscribe can't fail anyway and the subscribed state is updated
synchronously on the client side.
I had to add the --exit flag workaround to mocha.opts to make it exit
when tests are done. A better long-term solution would be to ensure
that nothing keeps node running after all tests are done, see
https://boneskull.com/mocha-v4-nears-release/#mochawontforceexit.
The problem was that unsubscribe re-added the doc to the connection.
Now the doc is removed from the connection after unsubscribe.
Additionally, we're no longer waiting for the unsubscribe response
before executing the callback. It is consistent with Query,
unsubscribe can't fail anyway and the subscribed state is updated
synchronously on the client side.
@coveralls
Copy link

coveralls commented Apr 26, 2018

Coverage Status

Coverage increased (+0.1%) to 96.629% when pulling 428c46a on Teamwork:sync-presence into 68bde00 on share:master.

lib/agent.js Outdated

Agent.prototype._presence = function(presence, callback) {
if (presence.seq <= this.maxPresenceSeq) {
return callback(new ShareDBError(4026, 'Presence data superseded'));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good practice to make the callbacks async no ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, thanks 👍

return {
a: 'p',
src: this.clientId,
seq: seq != null ? seq : this.maxPresenceSeq,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!= null checks for null AND undefined. !== null would check only for null.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

right sorry I never use equality operator

test/mocha.opts Outdated
@@ -1,3 +1,4 @@
--reporter spec
--check-leaks
--recursive
--exit
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Better close correctly everything, this could hide a problem. I felt into the trap already ;-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I completely agree. The existing tests did not close everything properly, so I just added this flag after upgrading mocha to get the old behaviour and keep the tests passing. In this PR I focused on adding presence data sync. Fixing tests to close everything properly is a separate issue to me.

return {
a: 'p',
src: this.clientId,
seq: seq != null ? seq : this.maxPresenceSeq,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

!= null checks for null AND undefined. !== null would check only for null.

@rawkode
Copy link

rawkode commented Jun 2, 2018

I'd love to see this merged to the project. Has there been any discussion regarding this functionality?

@curran
Copy link
Contributor

curran commented Jun 2, 2018

This looks good to me!

The only remaining issue IMO is the merge conflicts. @gkubisa Would you be able to resolve these merge conflicts? Thanks!

@rawkode
Copy link

rawkode commented Jun 3, 2018

@gkubisa If you don't have the time to rebase this, I'd be happy to pick it up

@gkubisa
Copy link
Contributor Author

gkubisa commented Jun 5, 2018

I resolved those merge conflicts now and the branch is ready to merge IMHO.

@curran
Copy link
Contributor

curran commented Jun 10, 2018

@nateps I'm curious what your initial thoughts are on this PR? Thank you for your time.

@gkubisa
Copy link
Contributor Author

gkubisa commented Jun 11, 2018

I've just open-sourced Teamwork/ot-rich-text to provide a concrete example of an OT type supporting presence data. We use it at Teamwork.com for real-time rich-text collaboration.

@nateps , would you be able to review/merge this PR some time soon, please?

@curran
Copy link
Contributor

curran commented Jun 12, 2018

@gkubisa It's awesome to see the great work!

I'm considering switching our dependency to @teamwork/sharedb, as this project is clearly suffering. This PR introduces such a useful feature, I'm surprised not a single maintainer has commented on it in 8 weeks.

@gkubisa
Copy link
Contributor Author

gkubisa commented Jun 12, 2018

I'm surprised not a single maintainer has commented on it in 8 weeks.

Yes, this is really disappointing. :-(

I'm considering switching our dependency to @teamwork/sharedb.

If you do switch, then you might also want to use the other related forks: https://www.npmjs.com/search?q=%40teamwork%2Fsharedb and https://www.npmjs.com/package/@teamwork/websocket-json-stream. The updated docs for OT types are here: https://github.com/teamwork/ot-docs.

Re the future of my fork, I'm going to add local undo/redo very soon. Afterwards I'll most likely just fix bugs and merge PRs.

@dcharbonnier
Copy link
Contributor

I'm in @gkubisa

@brainkim
Copy link

😭 why can’t we all just get along 😤

@alecgibson
Copy link
Collaborator

NB: We have a repo that ties presence into Quill here: https://github.com/reedsy/quill-cursors

Could be interesting to compare implementations.

@ericyhwang
Copy link
Contributor

Re presence for json0, it'd be nice to have full support eventually. That is a non-trivial chunk of work, as @curran points out.

I wonder if it's feasible to do a partial implementation initially, like only support presence for top-level text0 fields first and error on anything else. I haven't dove too deeply yet into this PR to know if that's advisable.

@gkubisa
Copy link
Contributor Author

gkubisa commented Nov 1, 2018

I wonder if it's feasible to do a partial implementation initially, like only support presence for top-level text0 fields first and error on anything else.

Given that json0 is so generic, perhaps presence on content embedded in json0 is all we could reasonably do? In that case json0 would just delegate presence handling to the appropriate embedded types. Probably the only presence transformations in json0 itself would be related to list operations, so that the correct embedded type would receive the presence later. Anyway, it seems very doable with some effort.

@curran
Copy link
Contributor

curran commented Nov 2, 2018

That makes sense. I'm interested in adding presence for text0 fields within nested json0 objects.

@curran
Copy link
Contributor

curran commented Apr 9, 2019

Hey check this out - @houshuang is working on JSON0 presence!

Here's the PR ottypes/json0#25

@houshuang
Copy link

In fact, that PR was submitted in error - I was trying to submit against a different upstream :) I thought I closed it, but I guess I didn't. Anyway we are absolutely happy to share this work, just not quite done yet (but feedback welcome!). Here's a quick demo video with Quill and presence/shared cursors. https://www.youtube.com/watch?v=CVjH56Q18cU

@curran
Copy link
Contributor

curran commented Apr 10, 2019

I started working on json0 presence here datavis-tech/json0#2

@curran
Copy link
Contributor

curran commented Apr 11, 2019

I asked @nateps et. al. "What's your take on presence?" during the last video call PR review meeting, and the ensuing conversation was completely new information to me. In summary, Nate expressed concern for scalability, and explained the problem as follows:

  • The server does not cache presence data.
  • Every time a new user joins and requests up-to-date presence, the server asks all other clients for their presence, resulting in a client -> server -> remote client -> server -> client round trip, for all remote clients.
  • This is fine for a few users, but if thousands of users join, performance would be problematic.
  • Because of this, namespacing the implementation internally would leave room for future implementations that have better performance characteristics.

I haven't dug into this PR enough to verify that all of this is true, but I just wanted to bump the conversation here with the latest thoughts from the Lever side.

My gut feeling is that the server could/should be made to cache presence data for all documents, probably in memory for a bounded period of time.

@gkubisa
Copy link
Contributor Author

gkubisa commented Apr 11, 2019

Nate's explanation is 100% valid. I implemented it the way it is because:

  • I assumed that the number of users editing the same document concurrently is not going to be huge in practice - max a few dozen. It is a valid assumption for the documents we manage in my company.
  • Caching and transforming presence on the server and reconciling it with the clients seemed very complex.
  • I had concerns about the server-side scalability of caching and transforming presence on the server.
  • I wanted to avoid stale presence data persisting on servers in case one server crashed and did not clean up the presence it managed.
  • I did not want to store presence as part of operations due to performance reasons.

@houshuang
Copy link

I started a write-up to better understand how this works here: https://docs.google.com/document/d/1lxBa74_sfkhaLg0s7S_8yIcboaNZvE6TNHtUYnhAlNc/edit# - note that this is intermingling a bit the concerns between the ShareDB presence mechanisms, how we are proposing to implement ot-json0 with subtypes etc. It already benefited from some clarifications from @gkubisa (thanks!)

@curran curran mentioned this pull request Apr 12, 2019
Copy link
Contributor

@curran curran left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I really would love to see this merged, so the larger community can develop presence work in downstream projects with confidence (e.g. OT types, editor bindings), I put in some time to do a thorough review. The thrust of my review is to narrow down the specific changes required in order to satisfy the original two requests from @nateps and @ericyhwang, which are as follows:

Requests for the current PR:

  • Add a ShareDB constructor option/flag for enabling presence. It looks like currently, some things like op caching could affect performance, even if you never call doc.submitPresence. It'd be good to not run that type of code if you're not using presence, like if none of your doc types don't support it.
  • Nate would like all the presence-related info to be nested underneath a top-level property, to make it easier to inspect what's related to presence. The current presence map might be doc.presence.currentDataByClient = {'src1': presenceData1, ...}.

An acknowledgement of the review's correctness/completeness from @ericyhwang and @nateps would be helpfup, and may inspire me to submit a follow-on PR that implements the changes, so that we can see this merged (I'm not sure if @gkubisa has the time at the moment to devote to this). Thanks everyone!

@@ -0,0 +1,9 @@
root = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This .editorconfig file seems extranious and IMO should be removed.

- "8"
- "6"
- "4"
script: "npm run jshint && npm run test-cover"
script: "ln -s .. node_modules/sharedb; npm run jshint && npm run test-cover"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updates to .travis.yml seem out of scope for this PR. The changes are good, but probably should be kept in a separate PR.

@@ -223,6 +228,9 @@ Unique document ID
`doc.data` _(Object)_
Document contents. Available after document is fetched or subscribed to.

`doc.presence` _(Object)_
Copy link
Contributor

@curran curran Apr 15, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is where the following comment from @nateps and @ericyhwang apply:

Nate would like all the presence-related info to be nested underneath a top-level property, to make it easier to inspect what's related to presence. The current presence map might be doc.presence.currentDataByClient = {'src1': presenceData1, ...}.

More concretely, the docs here could change from doc.presence to doc.presence.currentDataByClient, or whatever name is preferable.

The name doc.presence.current would feel familiar for React developers, as it's similar to ref.current.

// The current presence data
// Map of src -> presence data
// Local src === ''
this.presence = Object.create(null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: nesting, suggested change: doc.presence --> doc.presence.currentDataByClient

this.presence = Object.create(null);
// The presence objects received from the server
// Map of src -> presence
this.receivedPresence = Object.create(null);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re: nesting, suggested change: doc.receivedPresence --> doc.presence.received

this._sendOp();
}

if (this.subscribed && !this.inflightPresence && this.pendingPresence && !this.hasWritePending()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(ShareDB constructor presence flag)

var op = this.inflightOp;
var pending = this.pendingOps;
var callbacks = [];
if (this.inflightPresence) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could be better grouped under if(ShareDB constructor presence flag)

this._setType(null);
this.version = null;
this.inflightOp = null;
this.pendingOps = [];
this.cachedOps.length = 0;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

if(ShareDB constructor presence flag)


// *** Presence

Doc.prototype.submitPresence = function (data, callback) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

All of these methods could be definde only if(ShareDB constructor presence flag)

@@ -16,7 +16,7 @@
"expect.js": "^0.3.1",
"istanbul": "^0.4.2",
"jshint": "^2.9.2",
"mocha": "^3.2.0",
"mocha": "^5.1.1",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mocha upgrade should be separate PR.

@gkubisa
Copy link
Contributor Author

gkubisa commented Apr 15, 2019

Thanks for the review @curran . Unfortunately, I currently have no time to work on it. Feel free to continue the work on this feature in your own fork. :-)

@curran
Copy link
Contributor

curran commented Apr 16, 2019

Thanks for letting me know @gkubisa . I may be able to continue this work.

I did get presence working in a fork of json0, continuing the work started by @houshuang .

Here's a demo app: https://github.com/datavis-tech/json0-presence-demo

presence

selections

@curran
Copy link
Contributor

curran commented Apr 17, 2019

@gkubisa Are there any presence-related changes in the Teamwork fork of ShareDB that are not in this PR? I'm starting in on the work of a new PR from this one, and wanted to make sure it's not missing anything. Thanks!

@curran curran mentioned this pull request Apr 17, 2019
@gkubisa
Copy link
Contributor Author

gkubisa commented Apr 17, 2019

Great that you're taking it on. This branch should contain all presence-related changes, @curran .

@curran
Copy link
Contributor

curran commented Apr 17, 2019

Excellent. Thanks for confirming! Work has commenced in #286 .

@curran
Copy link
Contributor

curran commented Jul 22, 2020

Suggest to close as stale. Other presence PR has landed #322

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

10 participants