style_guide ideas by KOLANICH #7

KOLANICH · 2017-10-18T00:08:31Z

In the wake of kaitai-io/kaitai_struct_formats#68

GreyCat · 2017-10-20T12:10:27Z

Uhh, there's a lot of stuff at once, and it's actually hard to cherry-pick them, as they're a single monolithic commit. I'll try to comment in-line.

GreyCat · 2017-10-20T12:11:48Z

ksy_style_guide.adoc

@@ -249,28 +248,36 @@ guesswork.

 * For simple non-repeated fields, use a simple singular form —
  i.e. `width`, `header`, `transaction_id`, `file`.
-* For repeated files (i.e. with `repeat: something`), use plular form
+* For a sequence (i.e. with `repeat: something`), use plular form


Good catch!

GreyCat · 2017-10-20T12:13:35Z

ksy_style_guide.adoc

+* Repeated fileds which cannot be packed into a sequence should
+  have `id`s containing a number in the end. Numbers may have a
+  visually-obvious structure, like "the first digit is major, the second
+  one is minor".


This is pretty vague and can be interpreted in multiple ways. An example would probably help. Right now, it's not clear (1) whether foo_1 or foo1 fits this, (2) how one starts the numbering and how it should progress, (3) what a "visually-obvious structure" is.

KOLANICH · 2017-10-22

(1) I use foo0.
(2), (3) At the KSY author's disposal. There are different situations and we want to be flexible.

GreyCat · 2017-10-23

Um, yet again, the whole point of having a standard is not to be flexible, but to be rigid and provide no-brains-needed solutions, which are also easy to code into machine checks.

KOLANICH · 2017-10-23

> the whole point of having a standard is not to be flexible, but to be rigid and provide no-brains-needed solutions, which are also easy to code into machine checks.

If we want some checks, IMHO it's better to extend the KS language instead of introducing a convention that makes the names ugly.

These numbers are not for checks, but for the developer's convenience.

GreyCat · 2017-10-20T12:24:16Z

ksy_style_guide.adoc

-  "magic values"), use `magic` name, or, if there are several of them,
-  `magic1`, `magic2`, etc.
+  "magic values"), use `signature` or `magic` name, or, if there are
+  several of them, like `signature` or `magic0`, `magic1`, etc.


Ok, so, basically:

1. You specify "signature or magic", and don't specify any way to choose between the two. Given that it's essentially the same stuff, I see no point in separating this into two names and adding a headache for ksy developers who have to choose between two proposed alternatives.
2. You want every numbered thing to start counting from 0. Any reasons to do so, i.e. any major examples of someone doing that?

KOLANICH · 2017-10-22

1. At the KSY author's disposal. I personally prefer "signature" because it clearly states that this field is used for identification.
2. Just taste. If we start arrays from index 0 in most languages, why not follow the same convention here?

GreyCat · 2017-10-20T12:32:32Z

ksy_style_guide.adoc

+  particular, number of repetitions of some other structure), use either
+  `_count` suffix or `count_of_` (what sounds better in your opinion)
+  prefix and a plular form — i.e. `count_of_questions`, `blocks_count`,
+  `nodes_count`


Again, I don't think it's a good idea. The standard must be clear and avoid ambiguities whenever possible. There is no point in making a ksy developer choose between several possible alternatives, especially by very subjective means (i.e. "sounds better"). The whole point of having a hard standard is to make it possible to use automated machinery to process stuff, and it's very hard to program a converter from `num_*` to `*_count` or `count_of_*`.

Also, both sound kinda quirky. Stuff like "blocks count", "nodes count" is plain wrong from an English point of view. It should be "block count", "node count", etc.; see this SE entry for examples. "Count of X" sounds like a person's title to me (i.e. count of Champagne, count of Monte Cristo, count of Barcelona).

KOLANICH · 2017-10-22

> There is no point in making a ksy developer choose between several possible alternatives, especially by very subjective means (i.e. "sounds better").

No point, but I guess we want to have readable names.

> The whole point of having a hard standard is to make it possible to use automated machinery to process stuff, and it's very hard to program a converter from `num_*` to `*_count` or `count_of_*`.

Two regexes with one capturing group each, plus one logical operation, are enough to recognize and parse such names.
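
As a rough illustration of that claim, here is a minimal Python sketch. It is not code from this PR: `counted_entity` and `to_num_prefix` are hypothetical names, and rewriting into `num_*` is only one possible direction for such a converter.

```python
import re

# Hypothetical patterns for the two spellings proposed in the diff:
# a `_count` suffix and a `count_of_` prefix, one capturing group each.
SUFFIX_FORM = re.compile(r"^(\w+)_count$")
PREFIX_FORM = re.compile(r"^count_of_(\w+)$")


def counted_entity(field_id):
    """Return the name of the thing being counted, or None if the id is not a count."""
    # The single logical operation: try one pattern, fall back to the other.
    match = PREFIX_FORM.match(field_id) or SUFFIX_FORM.match(field_id)
    return match.group(1) if match else None


def to_num_prefix(field_id):
    """Rewrite either spelling into the `num_*` form; leave other ids untouched."""
    entity = counted_entity(field_id)
    return f"num_{entity}" if entity is not None else field_id


for name in ["blocks_count", "count_of_questions", "width"]:
    print(name, "->", to_num_prefix(name))
```

The sketch only recognizes the names; it says nothing about whether the plural or singular form is the better English, which is the other half of the disagreement.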

> Also, both sound kinda quirky. Stuff like "blocks count", "nodes count" is plain wrong from an English point of view. It should be "block count", "node count", etc.; see this SE entry for examples.

You are right. But you can also think about this as the count property of blocks, written with a `_` instead of a `.`, since `.` is not allowed.

> "Count of X" sounds like a person's title to me (i.e. count of Champagne, count of Monte Cristo, count of Barcelona).

IMHO it's completely OK, since there aren't many counts, dukes and baronets in variable names.

GreyCat · 2017-10-23

You're kind of contradicting yourself:

> No point, but I guess we want to have readable names.

> But you can also think about this as ...

Either you want "normal English" names, and then you can't have "blocks count" and "count of blocks"; or you allow some special naming, and then "num_blocks" is about as readable as your versions. Apps Hungarian notation recommends "nBlocks" for these purposes, and even that is understandable by the majority of people.

KOLANICH · 2017-10-23

I especially dislike the num prefix because of its redundancy and ambiguity. It says that the field is a number, but we already see that from the type. It says nothing about the purpose of that number. cnt would be better here. And again, I'm against prefixes and unreadable tokens.

> Hungarian notation

is evil.

GreyCat · 2017-10-20T12:47:26Z

ksy_style_guide.adoc

-  entry), `len_blocks` (total length of whole `blocks` array, made of
+  (in bytes or some other fixed units), use `_size` suffix and name of
+  that data structure — i.e. `block_size` (length of a single `block`
+  entry), `blocks_size` (total length of whole `blocks` array, made of


So, it boils down to:

* Prefix vs suffix
* Long or short spelling

In my opinion, "prefix" is better and definitely more widespread, e.g. in the majority of languages that use prefixes to differentiate parameters vs instance members, or in the Apps variety of Hungarian notation. A prefix does not try to sound like "proper English", and it avoids lots of ambiguities and spelling problems (like that "count" stuff I've demonstrated above). For example, there are tricky nouns in English (like "water" or "money" or "news" or "advice" or "hair"), which make proper English phrasing difficult.

Last, but not least, I definitely prefer the shorter form of spelling. Given that we have to make "one size fits all or most" identifiers, we're not going to have full-length Java-style identifiers like "NormalBucketDistributionProgressionOfLongIntegerConstantsForHadoopNodeFactory": (1) there are languages that have certain ID length limits, (2) a vast majority of people I've interviewed do not like such long identifiers very much, (3) it's actually hard to work with them without an IDE with expression autocompletion and stuff, and we don't have one (and I'm not very keen on having an IDE as the only way to work with the language).

So, all in all, a clear 3-character prefix looks like a good trade-off to me: it's unambiguous, it makes it very easy to detect what an attribute is about by always looking at the beginning, and it fits the concise C-like style well.

KOLANICH · 2017-10-22

> In my opinion, "prefix" is better and definitely more widespread, e.g. in the majority of languages that use prefixes to differentiate parameters vs instance members, or in the Apps variety of Hungarian notation.

It doesn't mean that we should do this too.

The good part of a postfix is that, since we read left-to-right, all the properties of the same member look similar to each other, especially when aligned. With a prefix they look too different. Also keep in mind the argument about the dot (`.`).

> (1) there are languages that have certain ID length limits,

Valid.

> (2) a vast majority of people I've interviewed do not like such long identifiers very much

Neither do I. But in the case of short enough identifiers, which is the majority of them, I prefer to have them readable. It is only a guide, not a strict requirement.

> (3) it's actually hard to work with them without an IDE with expression autocompletion and stuff

Valid. BTW some text editors, like Notepad++, have autocompletion that completes any token.

GreyCat · 2017-10-20T12:48:53Z

ksy_style_guide.adoc

  `block` entries).
+* Fields of unknown/undetermined purpose, i.e. unfinished reverse
+  engineering work SHOULD either NOT TO HAVE `id` or HAVE an
+  `id` matching the `/unkn(?:own)?(_\w+)?\d*/` regular expression.


This is, again, not how a style guide should be written. It's pointless to leave so many choices to the ksy developer. It's better to decide on this once and for all, and it will be easy for everyone: to remember, to follow, to understand, to write automated tools for.

KOLANICH · 2017-10-22

1. Having an id for a property of unknown meaning is useful when doing reverse engineering, especially if you have more than one such property in the same struct.
2. But if a developer doesn't need that id, why force him to have it?
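
To make the proposed rule concrete, here is a minimal Python sketch of how the regular expression from the diff behaves on a few ids. The pattern is copied from the diff; matching the whole id (via `fullmatch`) is an assumption, since the proposed text does not say whether the expression is anchored, and `is_unknown_id` and the sample ids are hypothetical.

```python
import re

# Pattern copied from the proposed style-guide text.
# Assumption: it must match the entire id, so fullmatch() is used below.
UNKNOWN_ID = re.compile(r"unkn(?:own)?(_\w+)?\d*")


def is_unknown_id(field_id):
    """Return True if the id follows the proposed 'unknown field' convention."""
    return UNKNOWN_ID.fullmatch(field_id) is not None


# Hypothetical sample ids, not taken from any real .ksy file.
for candidate in ["unknown", "unkn", "unknown1", "unkn_flags2", "unk1", "reserved"]:
    print(candidate, is_unknown_id(candidate))
```

Note that `\w` already covers digits, so once the optional `_\w+` group matches, the trailing `\d*` has nothing left to do; the numeric suffix only matters for ids without an underscore part.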

GreyCat · 2017-10-20T12:50:41Z

ksy_style_guide.adoc

+* In the case of multiple enums with the same name it's usually
+  better to append the integer value rather than a sequence
+  number to its `id`.
+


I had some problems understanding what you meant here. An example would probably help.

KOLANICH · 2017-10-22

https://github.com/kaitai-io/kaitai_struct_formats/blob/master/scientific/nt_mdt/nt_mdt.ksy#L632L634

GreyCat · 2017-10-23

Yeah, that's a good idea. But we should describe it in more detail, with an example, and think of some other choices to be made here. For example, should it be unknown31 or unknown_31? unknown_31 or unknown_1f or unknown0x1f?

KOLANICH · 2017-10-23

> For example, should it be unknown31 or unknown_31? unknown_31 or unknown_1f or unknown0x1f?

No 0x, 0o and 0b prefixes, no underscore; the number part is the same as in the left part.
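
Read literally, that rule could be sketched as follows. This is only an illustration under assumptions: "the left part" is taken to mean the enum key as written in the .ksy file, `unknown` is used as the base name because it appears in the examples in this thread, lowercasing the digits is an extra assumption, and `unknown_enum_id` is a hypothetical helper rather than part of any Kaitai Struct tool.

```python
import re

# Radix prefixes that the rule says must not appear in the id.
RADIX_PREFIX = re.compile(r"^0[xob]", re.IGNORECASE)


def unknown_enum_id(key_literal):
    """Build an enum member id from the key exactly as written in the .ksy file.

    The digits are kept as written (no base conversion), the radix prefix is
    dropped, and no underscore is inserted between the name and the number.
    """
    digits = RADIX_PREFIX.sub("", key_literal.strip()).lower()
    return "unknown" + digits


for key in ["31", "0x1f", "0b101"]:
    print(key, "->", unknown_enum_id(key))
```

Under this reading the same value gets a different id depending on how the key was spelled (`31` vs `0x1f`), which may or may not be what the rule intends.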

arekbulski · 2018-04-07T16:24:16Z

I will carefully review this PR and attempt to solve any existing issues.

Co-Authored-By: Arkadiusz Bulski <[email protected]>

This was referenced Oct 18, 2017

KSY style guide kaitai-io/kaitai_struct#140

Open

added xm kaitai-io/kaitai_struct_formats#68

Merged

GreyCat reviewed Oct 20, 2017

arekbulski mentioned this pull request Jan 22, 2018

Documentation updates remain uncommited kaitai-io/kaitai_struct#323

Closed

2 tasks

arekbulski self-assigned this Apr 7, 2018

KOLANICH mentioned this pull request Oct 23, 2018

Add bitcoin transaction kaitai-io/kaitai_struct_formats#102

Merged

GreyCat mentioned this pull request Jun 17, 2019

Add PHP serialized value and phar archive format kaitai-io/kaitai_struct_formats#173

Merged

style_guide ideas by KOLANICH

00ed64a

Co-Authored-By: Arkadiusz Bulski <[email protected]>

KOLANICH force-pushed the style_guide_ideas branch from 1834ea9 to 00ed64a Compare January 25, 2021 21:06
