Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Ensure uids exist for all nodes in app #615

Merged
merged 12 commits into from
Aug 1, 2018
Merged

Ensure uids exist for all nodes in app #615

merged 12 commits into from
Aug 1, 2018

Conversation

bryfox
Copy link
Contributor

@bryfox bryfox commented Jul 25, 2018

Resolves #415.

This standardizes the behavior of primary keys for nodes in the app:

  • Each node imported from external data are given a unique identifier, unless it already has one
    • UIDs are stable across imports as long as input data is stable (UID == digest of contents)
    • This is done up front, at least for now (may need to change with Refactor externalData in protocol #413)
  • If external data defines PK values, they are assumed to be unique.
  • A protocol author may define a different key for each data source, using "nodes_primary_key". See the example of studentId in the dev protocol.
  • The convention (currently; see definition in modules/network) is to use the uid prop as an identifier. Nodes always have this property defined internally, regardless of the external data's primary key.
  • Search interface can now rely on UIDs for result display

Things to test

  • Add nodes from different sources, e.g.
    • previousInterview has existing uids
    • schoolPupils uses the same IDs, but uses a custom PK ("studentId")
    • HIVServices and venue_list do not define any primary keys
  • In the dev protocol, delete UIDs from "previousInterview". The first node generator should behave well (e.g., not contain an infinite number of the same node as you drag nodes out).
  • Changing the PK prop ("uid") used internally by the app shouldn't matter
  • Reloading the app should not change results shown in Search or other node generators
  • View generated data by downloading the export file at the end

Edit: this also fixes #605, as UIDs should override anything in additionalAttributes (which cannot be unique).

New nodes are no longer given an `id` prop. Edges in the network
refer to the primary key (uid).
Eventually, this will be user-configurable.

Actions and UI-specific meta continue to use `uid`.
Clarify use of key in CardList, and arbitrary ID key in SessionList.
Drag/drop "meta" is currently set directly from node props; these always
represent nodes, so can make use of the PK directly.
@@ -737,212 +742,213 @@
]
},
"schoolPupils": {
"nodes_primary_key": "studentId",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should we stick with the camel case convention?

@wwqrd
Copy link
Contributor

wwqrd commented Jul 27, 2018

I see that we have chosen to make separate uids the responsibility of the user, but I think it's too opaque.

If:

{ studentId: 1, primaryKey: 'studentId' } == { teacherId: 1, primaryKey: 'teacherId' }

Then the user simply won't see the other node once one is added to the network.

@bryfox
Copy link
Contributor Author

bryfox commented Jul 27, 2018

Good point; that would certainly be unexpected.

I thought the primary intent for specifying external PKs was to be able to identify objects that should be treated as equal between different data sets. In which case we do still want to support something like: { subjectId: 1, primaryKey: 'subjectId' } == { id: 1, primaryKey: 'id' }, where subjectId=1 and id=1 should be treated as the same.

Perhaps the API for describing this is confusing? Would something like this be better?

mergeDataOnPrimaryKeys: { previousInterview: "uid", students: "studentId" },
externalData: { previousInterview: {}, students: {} }

@bryfox
Copy link
Contributor Author

bryfox commented Jul 27, 2018

Also, based on your example, I think I should no longer default to any PK. i.e., we won't automatically assume "uid" is unique.

@rebeccamadsen
Copy link
Contributor

I think ExportData needs to use the primary key. It's looking for id right now and falling back to index because it's not there.

lines 149-150 should change to

    if (dataElement[NodePK]) {
     domElement.setAttribute('id', dataElement[NodePK]);

I haven't tried all the scenarios yet. I think probably the excludeList sent to generateKeys and addElements should include the NodePK value instead of id.

If I find anything else I'll comment again. So far it's working well though!

@wwqrd
Copy link
Contributor

wwqrd commented Jul 30, 2018

We could update the panels/comparison function to use an arbitrary property on the node? - That might even mean that PKs aren't needed?

Translate uid to id attribute in xml
@bryfox
Copy link
Contributor Author

bryfox commented Jul 30, 2018

@rebeccamadsen thanks, I pushed those fixes. I agree excludeList needs to include UID now; I added a specific test for that. I also removed 'id' from the edges, which is never generated by the app; that's the cause of the other test's snapshot diff.

@wwqrd what is the panels/comparison function?

@jthrilly
Copy link
Member

We could update the panels/comparison function to use an arbitrary property on the node? - That might even mean that PKs aren't needed?

PKs are definitely needed, particularly for external data which has a lot of functionality we haven't made yet.

I haven't tried all the scenarios yet. I think probably the excludeList sent to generateKeys and addElements should include the NodePK value instead of id.

This is reminding me very much of the discussion we had about private vs public properties. At the time (https://github.com/codaco/Network-Canvas/pull/443#issuecomment-381155177) I suggested putting all "private" nc properties inside something like _private on the node. This would include things like the uuid, the property for creation order, and any additional properties that are convenient for the app/interfaces but that we don't need to export and that aren't actual properties of the nodes in a graph theoretic sense. Can anyone think of a reason not to do this?

@bryfox
Copy link
Contributor Author

bryfox commented Jul 31, 2018

Based on a conversation with @jthrilly I've removed support for defining existing PKs in external data ("separate uids", above), and no longer assume that something labeled id or uid is a PK. We may add that convenience once we've better defined requirements. (Original use case was mentioned in #397).

When using the development protocol, you'll once again see seemingly-duplicated nodes between NG Closeness and NG Classmates.

I've also renamed the internal PK to _uid for now (and updated the Action props for clarity). Isolating model data can happen as part of related issues (#446, #587, #596).

@rebeccamadsen
Copy link
Contributor

These updates look okay to me.

@wwqrd
Copy link
Contributor

wwqrd commented Aug 1, 2018

RE:

@wwqrd what is the panels/comparison function?

In hindsight I think I may have been making the same point as @rebeccamadsen?

@jthrilly
Copy link
Member

jthrilly commented Aug 1, 2018

I created #620 to discuss the idea of _private on nodes and edges.

@bryfox bryfox merged commit 7388bc3 into master Aug 1, 2018
@bryfox bryfox deleted the feature/ensure-uids branch August 1, 2018 20:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
4 participants