150bis revised proposal for fn:ranks #1062

michaelhkay · 2024-03-05T12:05:50Z

This proposal is an amended/alternative proposal for the fn:ranks function, taking into account the work done on the original issue #150 and the PR #1027 and the comments raised. Acknowledgements to the original author for the idea and for a lot of good work on examples etc.

It amends the previous proposal as follows:

(a) the signature and the semantics are aligned with fn:sort. This adds some functionality (multiple sort keys, ascending/descending) and also removes some complexity (two different collations for comparing input items and result items)

(b) the style of exposition is changed editorially for consistency with other functions

… separate ranking

…ise.

dnovatchev · 2024-03-05T17:27:42Z

This proposal is an amended/alternative proposal for the fn:ranks function, taking into account the work done on the original issue #150 and the PR #1027 and the comments raised. Acknowledgements to the original author for the idea and for a lot of good work on examples etc.

It amends the previous proposal as follows:

(a) the signature and the semantics are aligned with fn:sort. This adds some functionality (multiple sort keys, ascending/descending) and also removes some complexity (two different collations for comparing input items and result items)

(b) the style of exposition is changed editorially for consistency with other functions

I find both of these "amends" a step backwards from the original proposal.

What was intentionally pretty simple to understand, and call, function - is now loaded with huge and unnecessary luggage, like multiple key() functions and multiple collations.

Despite this awful increase in complexity, some functionality present in the original fn:ranks proposal has been eliminated - users do want to have distinct items within a rank - and this is not supported in this "amended" proposal.

For me the original fn:ranks and the function proposed here are two entirely different functions, that should not be mixed together. I firmly believe that it is better to go with the simpler one - easier to use and understand and still giving the user the option of getting distinct results in the ranks, which the new proposal eliminated completely.

If someone wants to suppress any good proposal coming from other people, then why we have this so called "community group" at all?

michaelhkay · 2024-03-05T17:45:11Z

I think the proposal is simpler than yours because almost all the semantics are copied and reused from fn:sort: it adds capability while reducing the number of words in the spec, and that's generally a good indicator.

I took out the functionality involving multiple collations and different comparisons for ordering and equality comparison because after strenuous efforts, I simply couldn't find a good way to take advantage of it.

Don't take it personally to have your ideas taken in a different direction by other people. It happens to me all the time, and the result is usually better than what I would have achieved on my own. That's precisely why we have a community group, to throw ideas around until a consensus crystalises. It's always good when a requirement gets creative input and alternative proposed solutions from different people.

dnovatchev · 2024-03-05T19:32:24Z

Don't take it personally to have your ideas taken in a different direction by other people. It happens to me all the time, and the result is usually better than what I would have achieved on my own. That's precisely why we have a community group, to throw ideas around until a consensus crystalises. It's always good when a requirement gets creative input and alternative proposed solutions from different people.

I am not "taking it personally", but I have a few questions:

Why should people who want to have distinct items within a rank be prevented from specifying this on a function call?
Why complicate the reader's understanding of this function (difficulties in understanding may result in the user's decision not to use the function at all) with:

sequence of many key functions vs. a single function
sequence of many collations vs. a single collation
multiple directions of ordering vs. specifying this within the key function
unnatural order of the arguments, forcing the user always to provide on a call at least one collation (even if there is no need for a collation) before being able to specify the key function.

These are the changes that I see in the new proposal and these are the immediate questions that they raise. In case these issues are eliminated, I would have nothing against the proposal - this is a question of providing the best and simplest to use functionality - not "taking it personally".

michaelhkay · 2024-03-05T20:10:05Z

In résponse to your questions:

Why should people who want to have distinct items within a rank be prevented from specifying this on a function call?

It may well be my limited understanding, but I haven't seen use cases that explain to me why this functionality is needed. It wasn't clearly identified in the original issue #150, and seems to have crept in later through a kind of "feature creep". Your proposal has two examples that use this functionality, but they only provide input and output, no motivating statement of requirements. In the original exposition of the issue, you pointed to functions available in MS SQL and in particular to the DENSE_RANK() function. As far as I can see the capability delivered in my version of fn:ranks is functionally equivalent to what DENSE_RANK() does. But I have no experience with MS SQL, and the web page you pointed to is not well written, so I may well have missed something.

Why complicate the reader's understanding of this function with...:

My belief is that the similarity of the function to fn:sort will make it easier for users to grasp how to use the function, rather than making it more difficult. There is less to learn because their existing knowledge and experience is transferable.

In addition, we discovered from user experience that the 3.1 version of fn:sort lacked features that users needed; I don't see any obvious reason why they should be needed in fn:sort and not in fn:ranks. I see that the MS SQL DENSE_RANK() function does indeed allow multiple sort keys, with the option for any of them to be descending.

I also see (on some closer reading) that DENSE_RANK() allows separate keys for partitioning, and for sorting within a partition. As far as I can see, such use cases can be satisified by using fn:partition and fn:sort in combination. But if you think there is a requirement that cannot be satisfied this way, then please share it, so we can work on it.

dnovatchev · 2024-03-05T21:13:05Z

In résponse to your questions:

Why should people who want to have distinct items within a rank be prevented from specifying this on a function call?

It may well be my limited understanding, but I haven't seen use cases that explain to me why this functionality is needed. It wasn't clearly identified in the original issue #150, and seems to have crept in later through a kind of "feature creep". Your proposal has two examples that use this functionality, but they only provide input and output, no motivating statement of requirements. In the original exposition of the issue, you pointed to functions available in MS SQL and in particular to the DENSE_RANK() function. As far as I can see the capability delivered in my version of fn:ranks is functionally equivalent to what DENSE_RANK() does. But I have no experience with MS SQL, and the web page you pointed to is not well written, so I may well have missed something.

In SQL everything is a set. Thus every record (row) returned from a query is distinct from all others, so there is no need for further disambiguation. On the other side we still don't have the set datatype in XPath and this is why we still need to rely on functions like fn:distinct-values and kill many trees by having to repeat in the rules of most of our functions that "the order of the results is implementation-defined" - which is totally eliminated when we have and use the set type.

Why complicate the reader's understanding of this function with...:

My belief is that the similarity of the function to fn:sort will make it easier for users to grasp how to use the function, rather than making it more difficult. There is less to learn because their existing knowledge and experience is transferable.

I quite disagree here. If the sort function is rather incomprehensible and difficult to grasp, and we base the ranks function on it, then the ranks function would be equally incomprehensible and difficult to grasp.

In addition, we discovered from user experience that the 3.1 version of fn:sort lacked features that users needed; I don't see any obvious reason why they should be needed in fn:sort and not in fn:ranks. I see that the MS SQL DENSE_RANK() function does indeed allow multiple sort keys, with the option for any of them to be descending.

The proposed function ranks is not a blind copy of any of the MS SQL ranking functions. It is rather an intentional attempt to provide the maximum simple and easiest to use such function to the user.

I also see (on some closer reading) that DENSE_RANK() allows separate keys for partitioning, and for sorting within a partition. As far as I can see, such use cases can be satisified by using fn:partition and fn:sort in combination. But if you think there is a requirement that cannot be satisfied this way, then please share it, so we can work on it.

Not a requirement that cannot be satisfied - just that all this becomes too-complicated for a sane, everyday developer.

The main idea in fn:ranks is that in its simplest form it will (in the most frequent case) be called specifying just two arguments: the $input and a single $key function. If a user wants to do something more sophisticated, they can always specify this within the key function.

dnovatchev and others added 25 commits February 17, 2024 14:18

fn:ranks

ebbd656

fixed the result of Example 2

6d2f47a

Added Example 3 - World Cup Group C final standings

4a9ffc7

fixed the definition of Example 3 (3 points for a win)

5515804

Example - rplaced a period with a comma

2d991ca

fix in Example 3 removed the $ sign at the start of ranks.

1db1358

Reflected Michael Kay's comments

55ed089

Expanded the Summary

87f9d3f

Reflected Michael Kay's comments and added a second collation and a note

7d61d7c

Removed the word singleton

8756a29

Remuved a comma from the commented rreturn

a574c57

Added Example 4 - Synonyms translated to the same Swedish word form a…

668767b

… separate ranking

Fixed Example 4

1450e5d

Expanded the rules with a non-formal definition

393f9a6

Refined the rules

69ceb1c

Now using fn:compare to form each array-result

23b7f06

Refined the Notes

6fa30c3

compare() is now calles with $collation-key

c651d7e

Made more precise formal definition. Added 2 more examples

1f61d49

Eliminated incorrect use of $colation-input. Made the rules more prec…

1f61de8

…ise.

Minot lexical-spatial adjustments

c0f18da

Minor lexical-spatial adjustments - 2

d126ff0

Formal definition refactoring

66513d9

Merge branch 'qt4cg:master' into master

7c7eb25

150 Revised proposal for fn:ranks

3b7fbab

michaelhkay mentioned this pull request Mar 5, 2024

150 fn:ranks #1027

Closed

Fix typo in example

9ee4c59

ChristianGruen added the Blocked PR is blocked (has merge conflicts, doesn't format, etc.) label May 21, 2024

ChristianGruen changed the title ~~150bis - revised proposal for fn:ranks~~ 150bis revised proposal for fn:ranks Sep 14, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

150bis revised proposal for fn:ranks #1062

150bis revised proposal for fn:ranks #1062

michaelhkay commented Mar 5, 2024

dnovatchev commented Mar 5, 2024

michaelhkay commented Mar 5, 2024 •

edited

Loading

dnovatchev commented Mar 5, 2024 •

edited

Loading

michaelhkay commented Mar 5, 2024 •

edited

Loading

dnovatchev commented Mar 5, 2024

150bis revised proposal for fn:ranks #1062

Are you sure you want to change the base?

150bis revised proposal for fn:ranks #1062

Conversation

michaelhkay commented Mar 5, 2024

dnovatchev commented Mar 5, 2024

michaelhkay commented Mar 5, 2024 • edited Loading

dnovatchev commented Mar 5, 2024 • edited Loading

michaelhkay commented Mar 5, 2024 • edited Loading

dnovatchev commented Mar 5, 2024

michaelhkay commented Mar 5, 2024 •

edited

Loading

dnovatchev commented Mar 5, 2024 •

edited

Loading

michaelhkay commented Mar 5, 2024 •

edited

Loading