choice randomization: better approximation of JR behaviour, fixes #49 #241
🦋 Changeset detected. Latest commit: 6628199. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
This is great. I really appreciate how the commentary here tells the why story!
As discussed a bit in Slack, I think a couple of adjustments would make the change clearer. And would either make the need for some of the commentary moot, or make the remaining commentary more useful.
1. There's a clear "JavaRosa compatibility" responsibility here. While it is inherently coupled to the `seededRandomize` implementation, it is also very specifically a mapping to a more general concept: `longValue`. I think it would help immensely for future understanding of what's going on here if we make that an explicit function, with the same name.

2. In general, I've found liberal use of JSDoc comments (i.e. `/** ... */`) really helpful. The comment style provides support for all sorts of editor functionality.

   Here, we'd get a lot of benefit from inline linking support (i.e. `{@link $URL_OR_REFERENCE}` and/or `{@link $URL_OR_REFERENCE | more specific title}`). In particular I think a permalink to the JavaRosa `resolveRandomSeed` method would be useful, and probably also links to the pertinent issues. This ties directly to point 1: giving the JavaRosa-compatible thing a name corresponding to the Java thing it emulates also gives a clear place for that JSDoc to reference it and clarify the nuances it's addressing.

   I think a few tweaks to this comment would be pretty much perfect.

3. We can eliminate the divergences from JavaRosa by using BigInt values for several of these cases. This, combined with their usage in a clearly articulated `longValue` JR-equivalent, would also eliminate the need for commentary on those cases. In my local exploration of this, what I found made the most sense with the least fuss was to change `type Int = number` to `type Int = bigint | number`, then have the `longValue` equivalent also produce that `Int` type. Pretty much everything else that would need to change falls out of that (i.e. any mixed-type operators producing fractional values do the appropriate explicit `Number()` casts to preserve those mathematical semantics).

   It'd also be useful for the `Infinity`/`-Infinity` cases to be bound to constants with clear names. Insofar as there's still benefit to commentary on those, JSDoc on those constants is a good place.
Aside from making some of the intent clearer here, I suspect we may find there are other edge cases where we want to cordon off JavaRosa-compat/Java-isms in a general and reusable way. Even if that seems like a premature abstraction, doing it in this case is a direct, 1:1 linkable reference to the existing abstraction we'll be emulating.
Edit: oh, and this definitely feels like it deserves a changeset.
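The three suggestions above could be sketched roughly as follows. This is only an illustration, not the code that landed in the PR: the constant names `JAVA_LONG_MAX` and `JAVA_LONG_MIN` are hypothetical, and the function assumes the goal is to mirror Java's `double`-to-`long` narrowing (NaN becomes 0, fractions truncate toward zero, out-of-range values clamp to the `long` bounds).

```typescript
/** Either a native number or a bigint, as proposed in the review. */
type Int = bigint | number;

/** Largest Java `long` (2^63 - 1). Java narrows `Infinity` to this value. */
const JAVA_LONG_MAX = BigInt('9223372036854775807');

/** Smallest Java `long` (-2^63). Java narrows `-Infinity` to this value. */
const JAVA_LONG_MIN = BigInt('-9223372036854775808');

/**
 * Hypothetical emulation of Java's `Double.longValue()`.
 * This sketch always returns a bigint, which is a valid {@link Int}.
 */
const longValue = (value: number): Int => {
  if (Number.isNaN(value)) return BigInt(0);
  if (value === Infinity) return JAVA_LONG_MAX;
  if (value === -Infinity) return JAVA_LONG_MIN;

  // Math.trunc yields an integer-valued double, which BigInt() accepts.
  const truncated = BigInt(Math.trunc(value));

  // Clamp finite values that fall outside the 64-bit signed range.
  if (truncated > JAVA_LONG_MAX) return JAVA_LONG_MAX;
  if (truncated < JAVA_LONG_MIN) return JAVA_LONG_MIN;
  return truncated;
};
```

Binding the bounds to named, documented constants keeps the `Infinity`/`-Infinity` handling self-explanatory at the point of use.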
This needs updates for:
Drafting!
Done!
Thanks! I think this is really close. Most of my remaining feedback is around code clarity (naming, separation of responsibilities, accessibility and applicability of comments).
function toBigIntHash(text: string): bigint {
  // hash text with sha256, and interpret the first 64 bits of output
  // (the first and second int32s ("words") of CryptoJS digest output)
  // as a BigInt. Thus the entropy of the hash is reduced to 64 bits, which
  // for some applications is sufficient.
  // The underlying representations are big-endian regardless of the endianness
  // of the machine this runs on, as is the equivalent JavaRosa implementation
  // at https://github.com/getodk/javarosa/blob/ab0e8f4da6ad8180ac7ede5bc939f3f261c16edf/src/main/java/org/javarosa/xpath/expr/XPathFuncExpr.java#L718-L726
/**
 * Hash text with sha256, and interpret the first 64 bits of output (the first
 * and second int32s ("words") of CryptoJS digest output) as a BigInt. Thus the
 * entropy of the hash is reduced to 64 bits, which for some applications is
 * sufficient. The underlying representations are big-endian regardless of the
 * endianness of the machine this runs on, as is the
 * {@link https://github.com/getodk/javarosa/blob/ab0e8f4da6ad8180ac7ede5bc939f3f261c16edf/src/main/java/org/javarosa/xpath/expr/XPathFuncExpr.java#L718-L726 | equivalent JavaRosa implementation}.
 */
const toBigIntHash = (text: string): bigint => {
As a JSDoc comment, this allows the same documentation to be accessed at the call site.
Switching to an arrow function is somewhat a nit, but it's generally preferable to avoid unnecessary `function` functions as they have confusing behavior. (Maybe that's also a thing we could lint?)
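The "first and second int32 words as a BigInt" step that the doc comment describes can be illustrated in isolation. The sketch below leaves out the SHA-256 hashing itself and only shows the word-combination step. It assumes the digest exposes signed 32-bit words (as the comment says of CryptoJS output) and that a signed 64-bit result, like a Java `long`, is wanted; the helper name `wordsToInt64` is hypothetical and does not appear in the PR.

```typescript
/**
 * Combine two signed 32-bit words into one signed 64-bit bigint, treating
 * `high` as the most significant word (big-endian word order).
 */
const wordsToInt64 = (high: number, low: number): bigint => {
  // `>>> 0` reinterprets a signed int32 as its unsigned 32-bit bit pattern.
  const highBits = BigInt(high >>> 0);
  const lowBits = BigInt(low >>> 0);

  // Place the first word in the upper 32 bits and the second in the lower 32.
  const unsigned = (highBits << BigInt(32)) | lowBits;

  // Reinterpret the 64-bit pattern as two's complement (assumption: signed).
  return BigInt.asIntN(64, unsigned);
};
```

Because the word order is fixed in the arithmetic rather than read from memory, the result does not depend on the endianness of the host machine.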
ea5c499 removes the `function` keyword.
As for multiline comments: I don't like them. My editor is not supremely ergonomic with them, especially with the decorative `*` in front of each line. That prefix, anyway, diminishes the advantages of multiline comments: now one has to prefix each line with `*` instead of `//`, PLUS still manage the actual comment start and end markers. How is that a win over just plain simple `//` line comments, I wonder?
Github is also not super smart with them, look at the "keyword" syntax highlighting it applied to the diff just above! So I don't like to use that comment style myself but if someone else does, they're welcome to ;-)
As for JSDoc links, I don't like them. They move the description of the link to after the link (cf. Markdown). So then to read what the link is doing there, what it's for, I first need to scan to the end of a long URL. The hypothetical usability gain is that if you have an IDE that is smart with specifically JSDoc comments, you can click the link? Copy-pasting isn't so bad and anyway most things — my editor, my terminal — already make http(s)-URLs clickable (or ctrl-clickable). Not worth the disruption of the natural text reading flow to me, but I won't complain if someone else makes {@link https://asdsadsoewofihewofbcewnco.ewewrewrewrewconlnzc.cwefewr.few.cpoqwjeansls | these kind of links}, I just won't emit them myself ;-)
Less is more!
I'll try to make the case for JSDoc. If you're open to reconsidering here, that would be excellent. If not, I think we should come back to this discussion as a team.
JSDoc comments are a standard designed to encode structured documentation about any symbol they're attached to.
The editor support alone goes well beyond linking to URLs. For example, the ability to reference documentation across modules is invaluable. Linking to other symbols (both within and across modules) is also invaluable, both as a navigation tool and because the links can be kept up to date as those symbols change.
Beyond editor support, being a standard for structured documentation and association with symbols, JSDoc can be used for documentation output. We are already using this to generate documentation for `@getodk/xforms-engine`. I'd quite like to expand that to other use cases.
I share your distaste for some of the syntax minutiae of JSDoc, and `@link` is particularly weird (I suspect this is because inline tags are relatively rare). But that distaste doesn't outweigh the overwhelming benefits of an extensible documentation standard which is widely adopted in tooling we already use. It is also widely adopted throughout this project, and across the ecosystem; which is to say, it is both locally and globally idiomatic.
I also noticed GitHub's odd presentation in a couple diff suggestions in this PR. It's worth noting that:
- That's not representative of how GitHub presents JSDoc in complete source
- It's not representative of how GitHub presents JSDoc in diffs broadly
- GitHub's syntax highlighting is notoriously inconsistent across various views
- The highlighting is applied to an incomplete (i.e. syntactically invalid) chunk of code, which likely exacerbates potential issues
Lastly, I am sensitive to poor authoring ergonomics. I'm a bit surprised to hear that your editor doesn't make adding/editing JSDoc comments easier than single line comments, as that's my experience in the editors I'm familiar with. If this is a major hangup, I'd be happy to help look into ways to make the authoring experience nicer for you.
I'll make the change hoping that the benefits will become apparent at some point 😆
Updated!
return seededRandomize(nodes, seed);
if (seedExpression === undefined) return seededRandomize(nodes);
const seed = seedExpression.evaluate(context);
const asNumber = seed.toNumber(); // TODO: There are some peculiarities to address: https://github.com/getodk/web-forms/issues/240
I'm not sure this comment belongs here. It isn't specific to this cast, it's specific to casting to XPath `number` throughout. Fine to leave since we have an issue tracking it, but we'll probably just find it went stale some time after we address the issue.
This is intended for someone reading the randomization code when trying to figure out why WF and JR still produce different sort orders. If it goes stale (when the issue is resolved) then following the link to the issue will make that apparent. I don't see a big problem.
let finalSeed: number | bigint | undefined;
if (Number.isNaN(asNumber)) {
  // Specific behaviors for when a seed value is not interpretable as numeric.
  // We still want to derive a seed in those cases, see https://github.com/getodk/javarosa/issues/800
  const seedString = seed.toString();
  if (seedString === '') {
    finalSeed = 0; // special case: JR behaviour
  } else {
    // any other string, we'll convert to a number via a digest function
    finalSeed = toBigIntHash(seedString);
  }
} else {
  finalSeed = asNumber;
}
Doesn't "special case: JR behavior" apply to all of this?
Not really. Some of the behaviour is in the odk spec. The "zero-length-string becomes 0" behaviour was surprising though.
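For readers following this thread, the branch order under discussion can be restated as a small pure function, which makes the three cases easy to exercise in isolation. This is a sketch under assumptions, not the PR's code: `deriveSeed` and its parameters are hypothetical names, the XPath number and string conversions are taken as inputs rather than performed here, and the hash is injected so the example stays self-contained.

```typescript
type Seed = number | bigint;

/**
 * Hypothetical restatement of the seed branches in the diff above.
 * `asNumber` and `asString` stand in for the XPath number and string
 * conversions of the evaluated seed expression.
 */
const deriveSeed = (
  asNumber: number,
  asString: string,
  hashText: (text: string) => bigint
): Seed => {
  // Numeric seeds pass through unchanged.
  if (!Number.isNaN(asNumber)) return asNumber;

  // Zero-length string: the JavaRosa-specific special case, seed 0.
  if (asString === '') return 0;

  // Any other non-numeric text is converted via the digest function.
  return hashText(asString);
};

// Stand-in hash for illustration only; the PR uses a SHA-256 based digest.
const stubHash = (text: string): bigint => BigInt(text.length);
```

For example, `deriveSeed(NaN, '', stubHash)` takes the zero-length-string branch, while `deriveSeed(NaN, 'abc', stubHash)` falls through to the hash.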
Co-authored-by: eyelidlessness <[email protected]>
Closes #49
I have verified this PR works in these browsers (latest versions):
Some related problems remain to be solved:
This brings what Web Forms does more in line (barring #240) with what JavaRosa does, and as such fixes the immediate problem of #49.
I felt it was worth it to be verbose with the comments here, so check those out.
This story is not over yet. Depending on whether we deem it OK to change the seed derivation algo, I'd like to make it value type/length agnostic and would just hash the input in its textual form and derive a seed from that hash - see getodk/javarosa#800. And in that case this code will need to be altered again.