Pass user information with the op to remote clients. #513

Closed
wants to merge 0 commits into from


maxfortun

Adds feature [#512]

Use case: Visually track change attribution across remote clients.

[#224] allows storing metadata, which can contain a userId, but stops short of pushing that metadata out to remote clients.

Proposed solution:

On the server side:

Retain backend constructor options in order to pass around options.opsOptions to configure metadata behavior.

Include message.m in agent's _sendOp based on backend.options.opsOptions.metadata configuration.

Include op.m in SubmitRequest based on backend.options.opsOptions.metadata configuration.

On the client side:

Emit op.m with every op related event.

this.emit('op', componentOp.op, source, op.src, op.m);

lib/agent.js Outdated
@@ -260,7 +260,9 @@ Agent.prototype._sendOp = function(collection, id, op) {
if ('op' in op) message.op = op.op;
if (op.create) message.create = op.create;
if (op.del) message.del = true;

if(op.m && this.backend.options && this.backend.options.opsOptions && this.backend.options.opsOptions.metadata) {
Collaborator

I'm not so sure about this all-or-nothing approach to toggling metadata. There's a very good chance there's some metadata you may or may not want to send back to users for security reasons.

This may be collection-dependent. It may even be doc-dependent, I suppose. Or perhaps consumers should be able to project the metadata, etc.

My gut feel for this is that it should be settable in the middleware like suppressPublish or saveMilestoneSnapshot, so that consumers can be as granular as they like about metadata.

I'd probably expect the API to look something like:

backend.use('commit', (context, next) => {
  // Can set some arbitrarily complex condition
  if (context.collection === 'foo') {
    // Only give us the `bar` property on the metadata
    context.opMetadataProjection = {bar: true};
  } else {
    // Otherwise, context.opMetadataProjection defaults to `null`,
    // and we send back no metadata
  }

  next();
});
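
A minimal sketch of how such a projection might be applied to an op's metadata before it is sent to remote clients. The helper name `projectOpMetadata` is hypothetical and not part of ShareDB; the sketch assumes the projection is either `null` or an object whose truthy keys name the metadata properties to keep.

```javascript
// Hypothetical helper, not part of ShareDB: returns only the metadata
// properties named in `projection`, or undefined when nothing is requested.
function projectOpMetadata(metadata, projection) {
  if (!metadata || !projection) return undefined;

  var projected = {};
  Object.keys(projection).forEach(function(key) {
    // Copy a property only if it is requested and actually present
    if (projection[key] && Object.prototype.hasOwnProperty.call(metadata, key)) {
      projected[key] = metadata[key];
    }
  });
  return projected;
}

// Stored metadata keeps everything; remote clients would only see `bar`
var stored = {ts: 1630000000000, userId: 'u1', bar: 'visible'};
console.log(projectOpMetadata(stored, {bar: true})); // { bar: 'visible' }
console.log(projectOpMetadata(stored, null)); // undefined
```

Defaulting to `undefined` keeps the existing behaviour (no metadata sent) unless a middleware opts in.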

Author

I see what you mean. Let me rework my pull request with this concept. Thank you.

Author

One question: is there a way to reach context from Agent.prototype._sendOp? If not, we'd need to set context.agent.opMetadataProjection.

Collaborator

Why? I'm not sure I'm clear on the scope here: are we trying to get metadata to the client for newly-submitted ops, or for all ops?

Author

The confusion may be due to my lack of familiarity with the code. Please forgive me. It looks like, in order to generate an outbound projection inside Agent.prototype._sendOp, it needs access to the projection configuration set in context.opMetadataProjection. Are you suggesting that we always set message.m = Object.assign({}, op.m); inside Agent.prototype._sendOp and handle the actual projection inside commit/submit?

Collaborator

Yup, that's exactly what I'm suggesting. I think that should hopefully solve your issue? Let me know if I've misunderstood.

Author

@alecgibson, you just simplified my code tenfold. Thanks. It now looks like this:

wss.on('connection', function(ws, req) {
    let metadata = {};
    try {
        let userInfo = JSON.parse(Buffer.from(req.headers["x-oidc-user-info"], 'base64').toString('ascii'));
        metadata.username = userInfo.preferred_username;
    } catch(e) {
        console.log("Failed to parse user info", req.headers["x-oidc-user-info"], e);
    }
    console.log("wss connected", metadata);

    backend.use('submit', (context, next) => {
        Object.assign(context.op.m, metadata);
        next();
    });

    let stream = new WebSocketJSONStream(ws);
    backend.listen(stream);
});
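
The header parsing in the handler above can be isolated into a small function, sketched here. `parseUserInfoHeader` is a hypothetical name. The sketch assumes, as in the snippet, that the `x-oidc-user-info` header carries base64-encoded JSON with a `preferred_username` field, and it decodes as UTF-8 rather than ASCII so that non-ASCII usernames survive.

```javascript
// Hypothetical helper: decode a base64-encoded JSON header into op metadata.
// Returns an empty object when the header is missing or malformed.
function parseUserInfoHeader(headerValue) {
  if (typeof headerValue !== 'string') return {};
  try {
    var userInfo = JSON.parse(Buffer.from(headerValue, 'base64').toString('utf8'));
    return userInfo && userInfo.preferred_username ?
      {username: userInfo.preferred_username} :
      {};
  } catch (e) {
    return {};
  }
}

// Usage with a header built the way a proxy might build it (assumption)
var header = Buffer.from(JSON.stringify({preferred_username: 'zoë'}), 'utf8').toString('base64');
console.log(parseUserInfoHeader(header)); // { username: 'zoë' }
console.log(parseUserInfoHeader('not json')); // {}
```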

lib/backend.js Outdated
@@ -22,6 +22,8 @@ function Backend(options) {
emitter.EventEmitter.call(this);

if (!options) options = {};
this.options = options;
Collaborator

We probably shouldn't do this, because:

  • it's inconsistent with previous behaviour
  • consumers might expect that updating properties in options will lead to altered behaviour on other options (which it won't)

// NB: If we need to add another argument to this event, we should consider
// the fact that the 'op' event has op.src as its 3rd argument
-this.emit('before op batch', op.op, source);
+this.emit('before op batch', op.op, source, op.m);
Collaborator

If we do this, we should probably add op.src for consistency with the 'op' event arguments. Same goes for all the others like this.

@@ -620,28 +621,28 @@ Doc.prototype._otApply = function(op, source) {
if (transformErr) return this._hardRollback(transformErr);
}
// Apply the individual op component
-this.emit('before op', componentOp.op, source, op.src);
+this.emit('before op', componentOp.op, source, op.src, op.m);
Collaborator

I regret not having wrapped op.src in some sort of "other" metadata object. Maybe it's time to do that now so we don't keep inflating this list of arguments indefinitely.

Author

Would you be open to this signature, where the 3rd param is op.src and the 4th param is the whole op? This way, whoever needs anything from the op has access to it.

this.emit('before op', componentOp.op, source, op.src, op);

Collaborator

I think @ericyhwang had suggested something similar to this in the past. I'm not necessarily against it, although I'd potentially still bury it in a wrapper object in case we need more arguments.

Author

I have adjusted the PR to align emit signatures and wrap the op.
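
A minimal sketch of the event shape discussed in this thread, using Node's EventEmitter as a stand-in for a ShareDB Doc. It assumes the signature proposed here: the op components, the local source, `op.src`, then a wrapper object holding the whole op. `FakeDoc` and `applyRemoteOp` are hypothetical names, not ShareDB APIs.

```javascript
var EventEmitter = require('events');

// Stand-in for a Doc; it only demonstrates the emit signature
function FakeDoc() {
  EventEmitter.call(this);
}
FakeDoc.prototype = Object.create(EventEmitter.prototype);
FakeDoc.prototype.constructor = FakeDoc;

// Hypothetical method: emits 'op' with op.src as the 3rd argument and a
// wrapper object as the 4th, so later additions do not need a 5th argument
FakeDoc.prototype.applyRemoteOp = function(op, source) {
  this.emit('op', op.op, source, op.src, {op: op});
};

var doc = new FakeDoc();
doc.on('op', function(components, source, src, context) {
  // Metadata is reached through the wrapper object
  console.log(src, context.op.m && context.op.m.username);
});
doc.applyRemoteOp(
  {src: 'client-2', op: [{p: ['tricks'], oi: ['fetch']}], m: {username: 'max'}},
  false
);
// prints: client-2 max
```

Keeping `op.src` in third position leaves existing listeners untouched, while the wrapper leaves room for further fields.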

@@ -587,9 +587,10 @@ Doc.prototype._otApply = function(op, source) {
);
}


Collaborator

(Any reason in particular for extra whitespace?)

Author

Hmm... How did this get in here?

Collaborator

@alecgibson left a comment

Can we also get some tests for this, please?

@@ -171,7 +171,6 @@ SubmitRequest.prototype.commit = function(callback) {
var op = request.op;
op.c = request.collection;
op.d = request.id;
op.m = undefined;
Collaborator

This isn't quite the API I'd suggested. By default, all ops will be returned with metadata, which breaks our current default behaviour. I think we should specify a mapping in middleware, which we apply here.

Author

Let's talk this through. Current default behavior is no metadata. What we want is to add metadata when explicitly requested in middleware, like so:

backend.use('submit', (context, next) => {
  Object.assign(context.op.m, {foo: 'bar'});
  next();
});

The only metadata set in the entirety of the code is here.

SubmitRequest.prototype._addOpMeta = function() {
  this.op.m = {
    ts: this.start
  };
  if (this.op.create) {
    // Consistently store the full URI of the type, not just its short name
    this.op.create.type = ot.normalizeType(this.op.create.type);
  }
};

Considering that we always remove metadata, and never really use it, maybe the only thing we need to do to maintain full backward compatibility is to get rid of the this.op.m.ts assignment in SubmitRequest.prototype._addOpMeta? It doesn't seem to be used anywhere. And if someone needs a ts, they can explicitly assign it in middleware?

backend.use('submit', (context, next) => {
  Object.assign(context.op.m, {foo: 'bar', ts: Date.now()});
  next();
});

This also means that no special mapping logic is required, since the assignment in the middleware serves the same purpose.
Thoughts?

Author

Actually, I was wrong. Found ts being used here.

function filterOpsInPlaceBeforeTimestamp(ops, timestamp) {
  if (timestamp === null) {
    return;
  }

  for (var i = 0; i < ops.length; i++) {
    var op = ops[i];
    var opTimestamp = op.m && op.m.ts;
    if (opTimestamp > timestamp) {
      ops.length = i;
      return;
    }
  }
}

Need to think a bit more...
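
To make the dependency on `m.ts` concrete, the function quoted above can be exercised in isolation. The function body is copied from the snippet; the sample ops are made up for illustration.

```javascript
// Copied from the snippet above
function filterOpsInPlaceBeforeTimestamp(ops, timestamp) {
  if (timestamp === null) {
    return;
  }

  for (var i = 0; i < ops.length; i++) {
    var op = ops[i];
    var opTimestamp = op.m && op.m.ts;
    if (opTimestamp > timestamp) {
      ops.length = i;
      return;
    }
  }
}

// Made-up ops with timestamps: everything from the first op newer than
// the cutoff onwards is dropped
var withTs = [{v: 1, m: {ts: 100}}, {v: 2, m: {ts: 200}}, {v: 3, m: {ts: 300}}];
filterOpsInPlaceBeforeTimestamp(withTs, 150);
console.log(withTs.length); // 1

// Without m.ts the comparison `undefined > 150` is always false,
// so nothing is filtered
var withoutTs = [{v: 1, m: {}}, {v: 2, m: {}}, {v: 3}];
filterOpsInPlaceBeforeTimestamp(withoutTs, 150);
console.log(withoutTs.length); // 3
```

In other words, if `ts` were no longer assigned by default, this filter would silently stop truncating anything.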

Collaborator

@alecgibson, Aug 26, 2021

The other thing that m is currently used for is attaching metadata to ops to be persisted in the database. For example, consider this middleware:

backend.use('commit', (context, next) => {
  context.op.m.userId = context.agent.custom.userId;
  next();
});

The current behaviour is that this metadata is attached to the op, and then submitted to the database (if you're using sharedb-mongo, you'll see this persisted in your o_XXX collection as an m property, alongside the aforementioned ts). This metadata is then scrubbed before sending it to remote clients (the op.m = undefined line you've removed).

So although ShareDB internals don't expressly use m that much, consumers do (myself included!), so broadcasting this metadata to other clients is a breaking change, and not necessarily desired.

I don't understand what's wrong with the projection approach I'd suggested?

Author

Looks like we can't avoid adding projection logic. Your initial suggestion seems to be the best way to go. Just pushed another commit. Presently the usage would look something like this:

backend.use('connect', (context, next) => {
  Object.assign(context.agent.custom, {foo: 'bar'});
  next();
});

backend.use('submit', (context, next) => {
  Object.assign(context.op.m, context.agent.custom);
  context.opMetadataProjection = {foo: true};
  next();
});

As for tests, I will try to figure out how to add them as soon as we finalize the direction we are heading in.
Is this in line with your thinking?

Author

@alecgibson, just to clarify, it was not my intention to suggest that something is wrong with your approach. I am not familiar enough with the code to even think in such terms. Please forgive my ignorance. My goal was to find the minimal backward-compatible change that would get the functionality working. As I spent more time diving through the code, it became clear that adding the opMetadataProjection feature is that minimal backward-compatible change. It was your initial suggestion that led me in that direction and gave me a better understanding of the inner workings. Thank you.

Collaborator

No worries at all; was just checking that it does actually solve your problem or if I'd missed something.

I'm happy with this approach in general (since I suggested it 😛), but let's confirm next week; we have a weekly Pull Request meeting on Tuesdays, so we'll discuss there and get back to you.

Author

Hi @alecgibson , did you have a weekly PR meeting on Tuesday?

Collaborator

@maxfortun we had to cancel the meeting this week because of have, but I've asked @ericyhwang to take a look when he gets a chance and we can discuss directly on the PR

Author

Thank you.

continue;
}

if (typeof doProject === 'function') {
Collaborator

I'm not entirely sure we need this bonus behaviour for a first implementation, unless you have some special use-case you need to tackle? Probably best to keep the API footprint as small as possible.

Author

Let's keep the footprint small. Updated the commit.

lib/backend.js Outdated
@@ -197,7 +197,7 @@ Backend.prototype.trigger = function(action, agent, request, callback) {
if (err) return callback(err);
var fn = fns.shift();
if (!fn) return callback();
-fn(request, next);
+fn(request, callback);
Collaborator

Can we move this change to a separate PR — I haven't got the headspace to think about it right now (hopefully a bit more time next week), but I think:

  • it's not conceptually part of this change
  • it might need further discussion, and I don't want the two changes to hold one another up

Author

I am at conferences this and next week, so will not have time to refactor:( Will go through this right after.

Author

@maxfortun, Sep 13, 2021

Hi @alecgibson, I am a bit puzzled, as I am not sure how to create 2 separate PRs from a single fork. Any ideas?
In the meantime I can revert this change if we can proceed with the rest of the PR.

Collaborator

You should be able to just create a second branch on your fork, and raise a PR from there. You don't need to raise PRs from master.

Author

@alecgibson , thank you. Moved into a separate PR [#517].

Contributor

@ericyhwang left a comment

Overall approach sounds fine to me.

Please add some tests for the projection - you can run the test suite locally with npm test.

We recommend adding your tests inside test/client/submit.js, which tests the codepath you're modifying.

}

// Full projection
return request.op.m;
Contributor

Did you specifically need the ability to get all metadata?

If it's not something you need at the moment, I'd leave this out for now, returning undefined as the fallback case.

Requesting just what's needed is better, so if we leave out a "get-everything" option, that removes the temptation to request all the metadata.

@maxfortun
Author

@ericyhwang I added a few tests to test/client/submit.js. Hopefully these are acceptable.

}
-this.emit('op batch', op.op, source);
+this.emit('op batch', op.op, source, {op: op});
Collaborator

Suggested change
-this.emit('op batch', op.op, source, {op: op});
+this.emit('op batch', op.op, source, op.src, {op: op});

doc = connection.get('dogs', 'fido');
doc.create({name: 'fido'}, function() {
doc.on('op', function(op, source, src, context) {
expect(context.op.m).equal(undefined);
Collaborator

I'm assuming this is acceptable for your use case?

});
});

it('concurrent changes', function(done) {
Collaborator

I'm not sure I understand what we're trying to test here?

We also seem to be missing a simple test that remote clients receive the projected metadata (the whole point of this PR!).

@maxfortun
Author

@alecgibson, I am trying to replicate the test failures on my local machine, but the tests keep passing. Any idea how I can run the tests in exactly the same environment as this repo does?

@alecgibson
Collaborator

@maxfortun

  1. Make sure you're on the latest LTS Node to match the CI environment (14.18.1)
  2. Remove node_modules and package-lock.json
  3. Reinstall with npm install
  4. Run the tests with npm test

@maxfortun
Author

@alecgibson that is precisely what I did several times:

$ node --version                                                                           
v14.18.1

$ npm install
<...snip...>
added 375 packages from 588 contributors in 12.972s

31 packages are looking for funding
  run `npm fund` for details

$ npm test

> [email protected] test /home/fortunm/projects/sharedb
> mocha
<...snip...>
      metadata projection
        ✓ passed metadata to connect
        ✓ passed metadata to submit
        ✓ received local op without metadata
        ✓ concurrent changes (204ms)
<...snip...>
  615 passing (4s)

I tried reproducing test failures but they keep passing... Is there some sort of a timeout that I am exceeding because of the concurrent test?

@maxfortun
Author

git clone https://github.com/share/sharedb.git
cd sharedb
gh pr checkout 513
npm i
npm test

Decided to do a clean test. This one is on mac. Still passes.

@alecgibson
Collaborator

@maxfortun sorry about the confusion: the build is a little weird. It's not actually the sharedb test command that's failing, it's the sharedb-mongo suite that's failing.

The sharedb-mongo build relies on tests exported by sharedb. Because of this, we actually also run the sharedb-mongo tests as part of the build to catch any potentially breaking changes to downstream dependencies like our database drivers.

In order to replicate the failure, you'll have to clone sharedb-mongo, and test against your local branch by running:

cd sharedb-mongo
npm install path/to/sharedb
npm install
npm test

I've tested locally and can confirm the failure. It's failing because you get done() called multiple times, which is happening because of an error triggering a rollback, and therefore firing your 'op' listener twice:

op [ { p: [ 'tricks' ], oi: [ 'fetch' ] } ]
        ✓ received local op without metadata
rollback Error: Already closed
    at wrapErrorData (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:263:13)
    at Connection.handleMessage (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:191:11)
    at StreamSocket.socket.onmessage (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:142:18)
    at /Volumes/MacData/git/work/reedsy/sharedb/lib/stream-socket.js:60:12
    at processTicksAndRejections (internal/process/task_queues.js:77:11) {
  code: 5101,
  data: {
    a: 'op',
    c: 'dogs',
    d: 'fido',
    v: 1,
    seq: 2,
    x: {},
    op: [ [Object] ]
  }
}
op [ { p: [ 'tricks' ], od: [ 'fetch' ] } ]
  1) db
       client submit
         metadata projection
           received local op without metadata:
     Error: done() called multiple times
      at Doc.<anonymous> (/Volumes/MacData/git/work/reedsy/sharedb/test/client/submit.js:1280:15)
      at Doc._otApply (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/doc.js:643:10)
      at Doc._rollback (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/doc.js:967:12)
      at Doc._handleOp (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/doc.js:322:19)
      at Connection.handleMessage (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:245:20)
      at StreamSocket.socket.onmessage (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:142:18)
      at /Volumes/MacData/git/work/reedsy/sharedb/lib/stream-socket.js:60:12
      at processTicksAndRejections (internal/process/task_queues.js:77:11)

You can just use the op callback instead of the event:

        var connection = this.backend.connect(undefined, metadata);
        var doc = null;
        connection.on('connected', function() {
          expect(connection.agent.custom).eql(metadata);
          doc = connection.get('dogs', 'fido');
          doc.create({name: 'fido'}, function() {
            doc.submitOp([{p: ['tricks'], oi: ['fetch']}], {source: 'trainer'}, function(error) {
              if (error) return done(error);
              expect(context.op.m).equal(undefined);
              done();
            });
          });
        });

And in that case, you'll then see that the root cause is actually that your assertion is failing:

  1) db
       client submit
         metadata projection
           received local op without metadata:
     Uncaught TypeError: Cannot read property 'm' of undefined
      at /Volumes/MacData/git/work/reedsy/sharedb/test/client/submit.js:1279:33
      at /Volumes/MacData/git/work/reedsy/sharedb/lib/util.js:74:7
      at Array.forEach (<anonymous>)
      at Object.exports.callEach (/Volumes/MacData/git/work/reedsy/sharedb/lib/util.js:72:13)
      at Doc._clearInflightOp (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/doc.js:1016:21)
      at Doc._opAcknowledged (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/doc.js:940:8)
      at Doc._handleOp (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/doc.js:332:10)
      at Connection.handleMessage (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:245:20)
      at StreamSocket.socket.onmessage (/Volumes/MacData/git/work/reedsy/sharedb/lib/client/connection.js:142:18)
      at /Volumes/MacData/git/work/reedsy/sharedb/lib/stream-socket.js:60:12
      at processTicksAndRejections (internal/process/task_queues.js:77:11)
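
The "done() called multiple times" failure described above can be reproduced without ShareDB. In this sketch, an EventEmitter stands in for the doc, the second emit stands in for the rollback applying an inverse op, and `done` is a simple counter rather than Mocha's callback. All names are hypothetical.

```javascript
var EventEmitter = require('events');

var fakeDoc = new EventEmitter();
var doneCalls = 0;
function done() {
  doneCalls++;
}

// An 'op' listener fires for every applied op, including the inverse op
// applied during a rollback
fakeDoc.on('op', function() {
  done();
});

// Stand-in for submitOp: the callback runs exactly once per submission
function fakeSubmitOp(op, callback) {
  fakeDoc.emit('op', op); // local apply
  fakeDoc.emit('op', [{p: op[0].p, od: op[0].oi}]); // simulated rollback
  callback(new Error('Already closed'));
}

var callbackCalls = 0;
fakeSubmitOp([{p: ['tricks'], oi: ['fetch']}], function() {
  callbackCalls++;
});

console.log(doneCalls); // 2
console.log(callbackCalls); // 1
```

Asserting inside the submit callback therefore runs once per submission, and the callback's error argument exposes the underlying failure directly.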

@maxfortun
Author

@alecgibson, that sharedb-mongo explanation did the trick. Thank you.

@coveralls

Coverage Status

Coverage decreased (-0.04%) to 97.331% when pulling 28bf120 on maxfortun:master into 3e636d0 on share:master.
