Arbitrary node labeling #613
This is an exciting feature!
|
It's certainly possible. Downsides that I see: The worker's owner wouldn't know which extensions were implemented until the last minute. Hypothetical example: NC supports a skip logic worker, and a protocol includes that functionality on the fifth stage, but does not use custom node labeling. All node-consuming interfaces see that a worker is implemented, so they ask for custom labels needlessly. I suppose we could cache that information somewhere. We may also want to introduce other message types (for example, to negotiate what external data is passed in), so having a single-purpose worker seems cleaner to me. However, I do think we should make the plumbing reusable/generic, which I haven't done in the prototype here. e.g., we shouldn't end up with a separate 'loadWorker' epic for each worker type. Would that take care of the issue?
Maybe! I struggled with this; here are the reasons I went with what you see.
|
Design around the single-node map function
- Add a workerAgent to multiplex messages
- Make the user functions ignorant of Worker boilerplate
- Remove externalData for now; see #413
- Formally make Node a container; avoid sending label props through ancestors
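To illustrate what multiplexing messages over a single worker can look like, here is a minimal sketch. It is not the code from this PR: the class shape, the `messageId` field, and the `{ value, error }` response format are all assumptions made for the example.

```javascript
// Hypothetical sketch of a message-multiplexing agent; all names are illustrative.
class WorkerAgent {
  constructor(worker) {
    this.worker = worker;
    this.nextId = 0;
    this.pending = new Map(); // messageId -> { resolve, reject }
    // Route every response back to the request that produced it.
    this.worker.onmessage = (event) => this.handleResponse(event.data);
  }

  // Send a request and return a promise for the matching response.
  sendMessage(payload) {
    const messageId = this.nextId;
    this.nextId += 1;
    return new Promise((resolve, reject) => {
      this.pending.set(messageId, { resolve, reject });
      this.worker.postMessage({ messageId, payload });
    });
  }

  // Settle the pending request that matches the response's messageId.
  handleResponse({ messageId, value, error }) {
    const request = this.pending.get(messageId);
    if (!request) return; // unknown or already-settled message
    this.pending.delete(messageId);
    if (error) {
      request.reject(new Error(error));
    } else {
      request.resolve(value);
    }
  }
}
```

With something like this, each node component could await `agent.sendMessage({ node, network })` without knowing about the other requests in flight on the same worker.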
Chrome/Android supports worker-src as of v59; Android 8.0 uses v58. Legacy child-src didn't seem to work.
b3b762e to 0a11d1a
I've rebased onto master & pushed the following updates:
The demo labeler has changed to exercise non-node prop changes. For testing:
|
Would it make sense to pass in the registry (i.e. displayVariable) and have this worker always do labeling? |
In that case, I guess we'd have a 'factory' labeler that provided default labeling when the protocol doesn't include a nodeLabel script? I do like the idea of having a single mechanism to label nodes, but there's enough overhead here (and this operation is common enough) that it feels wrong. Performance, scalability, battery use, etc. are all a little worse. (It's worth noting that node rendering happens more frequently when a worker is used, since any change in the network can cause the label to change.) It's a cost worth paying for flexibility & safety, but it's not the common case. That's just my gut feeling. Also, one nice property with the current setup is that if the worker errors for any reason, that node will fall back to synchronous labeling. So I don't think we'd be able to lose the existing (synchronous) code path. But I'm not sure; if you think that's the right approach, I'm happy to implement it. |
I investigated the "Slowness initializing" issue. Specifically, what I'm seeing upon refreshing a page which renders nodes is a ~250ms delay between the node backgrounds rendering and the labels rendering. (This is only on page load; stage switching, etc. is fine.) ~99% of the latency of setting the worker can be attributed to the To solve the issue, we could ensure that the external script is loaded before the protocol, so that the user would see the spinner during that time. However, that would delay the loading slightly for all interviews (assuming some don't use external scripts). Further, if the first interface doesn't render nodes (e.g., it has an intro screen), or the app is loaded before handing to an interviewee, then this is a non-issue for a typical interview flow. My inclination is to do nothing, for that reason. |
Preloading assets might be a useful feature to have elsewhere in the app, so I think it's at least worth exploring as a separate issue? |
I'm happy to work on it here as well. What would the other use cases be for preloading assets? In the case above, I can pretty easily preload the scripts before protocol.json, and nothing else needs to change. If we want to be able to preload a file that's defined inside protocol.json, we might need to have some separate piece of state that the UI can use to know that "everything I need is loaded." Or else have the interface know everything it's looking for in advance. (i.e., show the spinner until both the protocol is loaded and the additional assets are loaded.) Thoughts? |
Some possible other uses might be large datasets or other workers? It's a speculative feature though, so perhaps it should be developed if/when it's needed. |
If the custom labeling script is present, but returns nothing (and no error), can we fall back to the label function? For example, on the name generator to test input types, nodes have no label at all because nodes created with that form have no |
Also, I can't load the protocols in Chrome; android and electron work well. On starting the app:
And then after trying to return to an in progress interview or create a new one:
|
Yes, that makes sense. I'll treat empty responses as errors & document it that way. I assume an empty label is never desired. Right now, it's still possible to get 'undefined', but we could hypothetically catch that with validation, which we couldn't really do with the custom script. |
Love this, Bryan. It's a very elegant solution to a problem you identified which I hadn't really considered. Tested on all platforms and devices, and seems to be working well, with the exception of the two issues you know about:
I only really have two concerns: async labels popping in as a potentially sketchy UX, and the scaling performance of the functions for generating labels themselves. On the first, there's not a great deal we can do, except try to minimize label recalculation. Better to keep the solution generic for now, and optimize later. On the second, this is partly a concern because we make nodes responsible for their own label (like colors), which while I think is the best design choice, results in labels such as "[property] + [globally incrementing counter]" requiring lots of iteration, which might be expensive. Do you think it might be worth adding some stress-test type scenarios to the dev protocol? |
Restore support for browser development (dev-server)
I think from the app side, we need to be mindful of using keys appropriately so that we're not using the same component to render different data. I fixed the one place I saw. Otherwise, the script author needs to be mindful of consistency and [reasonable] performance.
To evaluate scalability, I used the 'mock nodes' feature, and was able to run this with 100-200 nodes without much issue. There are some places we can minimize unnecessary re-renders, which I'm working on elsewhere; the custom labeler didn't seem to be a large issue. I don't think we're expecting more than that in an app-generated network, but please let me know if that's wrong. External data is another possible source of scalability problems (including with the 'global incrementing counter' case). I removed externalData for now (partly because of #413). In the abstract, I would expect to support a separate messaging & caching layer for external data. There's a single worker for all nodes, so a script author could fetch that data once, cache any counters as needed, and then run other calculations as is happening now. Once that's implemented, it would make sense to add a reference to a large external data source from the dev protocol for testing. |
I've pushed the following updates:
|
Fall back to the static getLabel function in this case.
97eb146 to a1ebe33
Updates look good to me, although last_name is missing from the variable registry when I try to export. |
|
I opened #621 to cover preloading as a later enhancement. |
WIP for initial discussion.
This prototype allows a protocol author to write JavaScript to generate labels for nodes, and a protocol user to run the code without needing to fully trust it. See #610; #472.
Implementation
The user code is written in vanilla JS/ES and packaged with the protocol as node-label-worker.js. When the protocol is loaded (e.g., when an interview starts), the contents of that script are read and made available from a local URL in redux state.
The script is executed by a Web Worker, which is initialized once and can be re-messaged as props change.
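As a rough sketch of that setup (not the PR's actual code), the script text can be wrapped in a Blob, exposed through a local URL, and used to start a worker a single time. The function names and the `source` parameter are hypothetical; only `Blob`, `URL.createObjectURL`, and `Worker` are standard APIs.

```javascript
// Assumption: `source` is the text of node-label-worker.js read from the protocol.
function createWorkerUrl(source) {
  const blob = new Blob([source], { type: 'application/javascript' });
  return URL.createObjectURL(blob); // a local "blob:" URL
}

// Start the worker once; later prop changes reuse it via postMessage.
function startLabelWorker(source) {
  if (typeof Worker === 'undefined') return null; // no Web Worker support here
  return new Worker(createWorkerUrl(source));
}
```

Keeping the URL (rather than the worker) in state would let the app create the worker lazily, the first time a node needs a label.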
The Web Worker provides a sandboxed environment: it cannot access the DOM, the scope of the React app, or anything that Electron provides. A worker communicates with the app through serializable messages.
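Inside the worker, the boilerplate can be reduced to turning one serializable request into one serializable response. The sketch below assumes a `{ messageId, payload }` request and a `{ messageId, value | error }` response; that message shape and both function names are illustrative, not taken from this PR.

```javascript
// Hypothetical user-supplied labeler: maps a node (plus the network) to a string.
function nodeLabeler(node, network) {
  return `${node.name} (${network.nodes.length})`;
}

// Turn one incoming request into one serializable response.
function handleRequest(request, labeler) {
  const { messageId, payload } = request;
  try {
    const value = labeler(payload.node, payload.network);
    return { messageId, value };
  } catch (err) {
    // Errors are reported as plain strings so the response stays serializable.
    return { messageId, error: String(err && err.message ? err.message : err) };
  }
}

// Only wire up the handler when actually running inside a worker scope.
if (typeof importScripts === 'function') {
  self.onmessage = (event) => {
    self.postMessage(handleRequest(event.data, nodeLabeler));
  };
}
```

Because the handler never throws, a buggy labeler produces an error response for that one node instead of an uncaught exception in the worker.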
I've restricted the app's CSP a bit here to prevent cross-origin requests as well. The CSP is respected as long as the data is read in & served from the local (blob) URL. This restricts the script beyond the standard webworker, but removes the risk of data exfiltration, and encourages the worker to be lightweight.
From a script author's point of view, they are implementing the following mapping function (actual inputs may change):
f(node, network, externalData) => label
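For illustration, a labeler with this signature might look like the sketch below. The attribute names `name` and `close_friend`, the fallback text, and the choice of emoji are assumptions for the example; the real node shape may differ.

```javascript
// Hypothetical labeler; the field names (`name`, `close_friend`) are assumptions.
// `network` and `externalData` are accepted to match the signature but unused here.
function label(node, network, externalData) {
  const base = node.name || 'Unnamed';
  // Append an emoji when the node is marked as a close friend.
  return node.close_friend ? `${base} 😀` : base;
}
```

A function like this has no side effects and returns a plain string, which keeps the result serializable.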
Behavior
The development protocol contains some examples that demonstrate some use cases from #610. The uncommented mapping function will append an emoji to the node name based on the node's close_friend property.
If a protocol does not include the custom labeling script, then we fall back immediately to the original label function.
If a web worker errors, then the node falls back to the original label function.
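The fallback rules can be summarized in a small decision function. This is a sketch under assumptions: `getLabel` here is a stand-in for the app's original label function, the `{ value, error }` result shape is hypothetical, and treating empty responses as errors follows the suggestion made in the conversation rather than confirmed behavior.

```javascript
// Hypothetical stand-in for the app's original (synchronous) label function.
function getLabel(node) {
  return node.nickname || node.name || '';
}

// Pick the label to render, given the (possibly failed) worker result.
// `result` is assumed to be { value } on success or { error } on failure.
function resolveLabel(node, result, hasCustomScript) {
  // No custom script, no response yet, or a worker error: use the original label.
  if (!hasCustomScript || !result || result.error) {
    return getLabel(node);
  }
  const value = result.value;
  // Treat empty or non-string responses as errors and fall back.
  if (typeof value !== 'string' || value.trim() === '') {
    return getLabel(node);
  }
  return value;
}
```

Keeping the synchronous path as the default means a node always has something to render while the worker response is pending.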
Assumptions / Design
I initially prototyped this by operating on an array of nodes at once, but that has some drawbacks:
Known issues
There are a few risks from a misbehaving script: