How to: Avoid Pitfalls
TOC
- Keywords
- `nan`, `NaN`, `inf`, `Inf`, `infinite` and `null`
- `foo.bar` vs `.foo.bar`
- Cartesian Products
- Generator Expressions in Assignment Right-Hand Sides
- Backtracking (`empty`) in Assignment RHS Expressions and Reductions
- Multi-arity Functions and Comma/Semi-colon Confusability
- `index/1` is byte-oriented but `match/1` is codepoint-oriented
- If A and B are arrays, `B|.[A]` is the same as `B|indices(A)`
- Overriding Operator Definitions
The fact that jq has keywords such as `if` and `end` has various implications, some of which may not be obvious. In particular:
- in jq 1.6 and earlier, keywords cannot be used in the abbreviated syntax for specifying key-value pairs, e.g. `{foo}` for `{"foo": .foo}`
- in jq 1.6 and earlier, keywords cannot be used to form $-variable names
The full list of reserved keywords is currently:
and
as
break
catch
def
elif
else
end
foreach
if
import
include
label
module
or
reduce
then
try
(The list of keywords for any particular version of jq can be derived from the lexer.l file, the “master” version of which is https://github.com/stedolan/jq/blob/master/src/lexer.l)
`nan` is a jq value representing IEEE NaN, but it prints as `null`.
`NaN` is recognized in JSON text and is also understood to represent IEEE NaN.
Use `isnan` to test whether a jq value is identical to IEEE NaN.
Here are some illustrative examples:
$ echo NaN | jq .
null
$ echo nan | jq .
parse error: Invalid literal at line 2, column 0
$ echo NaN | jq isnan
true
$ jq -n 'nan | isnan'
true
Similar comments apply to the jq value `infinite`, and the admissible values `inf` and `Inf`:
$ echo Inf | jq isinfinite
true
$ echo inf | jq isinfinite
true
$ jq -n 'infinite | isinfinite'
true
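One reason to reach for `isnan` rather than an equality test is that, as in IEEE arithmetic, NaN does not compare equal to itself. A minimal sketch, assuming only that a `jq` binary is on the PATH:

```shell
# nan prints as null, isnan detects it, and nan == nan is false
jq -nc '[nan, (nan | isnan), (nan == nan)]'
# prints: [null,true,false]
```

Because `nan` prints as `null`, inspecting the output alone cannot distinguish the two; `isnan` has to be applied before the value is serialized.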
`foo.bar` is short for `foo | .bar` and means: call `foo` and then get the value at the `"bar"` key of the output(s) of `foo`.
`.foo.bar` is short for `.foo | .bar` and means: get the value at the `"foo"` key of `.` and then get the value at the `"bar"` key of that.
One character, big difference.
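The difference is easiest to see when a function and an input key share the same name. In the sketch below, the function `foo` and the input object are hypothetical and exist only for illustration:

```shell
# foo (no leading dot) calls the user-defined function;
# .foo (leading dot) looks up the "foo" key of the input.
jq -nc 'def foo: {bar: "from the function"};
        {foo: {bar: "from the input"}} | [foo.bar, .foo.bar]'
# prints: ["from the function","from the input"]
```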
jq is geared to produce Cartesian products at the drop of a hat. For example, the expression `(1,2) | (3,4)` produces four results:
3
4
3
4
To see why:
$ jq -n '(1,2) as $i | (3,4) | "\($i),\(.)"'
"1,3"
"1,4"
"2,3"
"2,4"
Generator expressions in assignment RHS expressions are likely to surprise users. Compare `(.a,.b) = (1,2)` to `(.a,.b) |= (.+1,.*2)`.
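To make the first half of that comparison concrete, here is a small sketch (the input object is made up for illustration). With `=`, the right-hand side is a generator, so the whole assignment is performed once per generated value and every path on the left receives that same value:

```shell
# One output object per RHS value; both .a and .b get the same value each time.
jq -nc '{a: 0, b: 0} | (.a,.b) = (1,2)'
# prints:
# {"a":1,"b":1}
# {"a":2,"b":2}
```

By contrast, `(.a,.b) |= (.+1,.*2)` emits a single object, and which of the values generated on the right-hand side is kept appears not to have been consistent across jq releases, so it is safest not to rely on it.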
`.a=empty` and `.a|=empty` behave differently:
null | .a = empty #=> the empty stream
null | .a |= empty #=> null
In reductions, care should be exercised when including `empty` in the body. For example, one might reasonably expect that:
reduce 1 as $x (2; empty)
would produce `2`, but in fact it produces `null` in most versions of jq, including jq 1.5 and earlier, as well as the current “master” version as of 2018.
WARNING: Expressions of the form `A | .[] |= E` where A is an array and E can evaluate to `empty` should in general be avoided. Their behavior is inconsistent between versions of jq, and jq version 1.6 will often evaluate them incorrectly. For example, using jq 1.6:
jq -n '[0,1,2] | .[] |= if . == 0 then empty else . end'
yields:
[1,2,null]
`foo(a,b)` is NOT the same as `foo(a;b)`. If `foo/1` and `foo/2` are both defined, then if you write `foo(a,b)` intending to call the two-argument function, you'll silently get the wrong behavior.
For example, `foo(1,2)` is a call to `foo/1` with a single argument consisting of the expression `1,2`, while `foo(1;2)` is a call to `foo/2` with two arguments: the expressions `1` and `2`.
One character, big difference.
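A short sketch of the confusion, using hypothetical definitions of `foo/1` and `foo/2` chosen only so that the two calls produce visibly different results:

```shell
# foo/1 collects its single argument into an array;
# foo/2 builds an object from its two arguments.
jq -nc 'def foo(x): [x];
        def foo(x; y): {a: x, b: y};
        foo(1,2), foo(1;2)'
# prints:
# [1,2]
# {"a":1,"b":2}
```

Neither call raises an error, which is why the mistake is easy to miss.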
Given strings as input, the `index` family of filters (`index`, `rindex`, `indices`) returns byte-oriented offsets. For codepoint-oriented offsets, one can use the array-oriented versions of these filters, or `match/1` or `match/2`, or the definition of `myindex` given below.
For example:
$ jq -cn '"aéb" | [., index("b")]'
["aéb",3]
$ jq -cn '"aéb" | [., (explode|index("b"|explode))]'
["aéb",2]
$ jq -cn '"a\u00e9b" | [., index("b")]'
["aéb",3]
$ jq -cn '"a\u00e9b" | match("b").offset'
2
# codepoint-oriented version of `index/1` for strings
# e.g. ("”#a" | myindex("#a")) yields 1
def myindex($string):
  ($string|length) as $sl
  | if $sl > length
    then null
    else
      explode as $x
      | ($string|explode) as $s
      | first(range(0; 1 + length - $sl) as $i
              | select($x[$i: $sl+$i] == $s) | $i) // null
    end;
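As a quick check of `myindex`, the sketch below repeats the definition so that it can be run on its own, and applies it to the same `"aéb"` string used earlier (written with a `\u` escape to sidestep terminal encoding issues). The search strings `"b"` and `"z"` are arbitrary choices for illustration:

```shell
# "b" is the third codepoint (offset 2); "z" does not occur, so null is returned.
jq -nc '
def myindex($string):
  ($string|length) as $sl
  | if $sl > length
    then null
    else
      explode as $x
      | ($string|explode) as $s
      | first(range(0; 1 + length - $sl) as $i
              | select($x[$i: $sl+$i] == $s) | $i) // null
    end;
"a\u00e9b" | [myindex("b"), myindex("z")]'
# prints: [2,null]
```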
If A and B are JSON arrays, then `B|.[A]` asks for the sorted array of ALL the indices, $i, such that `.[0:$i] + A` is an initial subarray of B. This has implications for `B|index(A)` as well.
Examples:
jq -nc '[0,1,2,3,4,1,2] | .[[1,2]]'
[1,5]
jq -nc '[0,1,2,3,4,1,2] | index([1,2])'
1
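The relationship between `.[A]`, `indices(A)`, `index(A)` and `rindex(A)` can be seen side by side in one invocation. This is only a sketch using the same sample array as above:

```shell
# .[[1,2]] and indices([1,2]) agree; index and rindex pick the first and last offsets.
jq -nc '[0,1,2,3,4,1,2] | [.[[1,2]], indices([1,2]), index([1,2]), rindex([1,2])]'
# prints: [[1,5],[1,5],1,5]
```

The practical consequence is that an array argument is treated as a subarray to search for, not as a single element; to look for an element that is itself an array, it needs to be wrapped in another array.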
Overriding operator definitions is possible but probably ill-advised, if for no other reason than that the results can be surprising because of compile-time constant-folding.
Consider, for example, what happens when we override `+` as follows:
def myplus($a;$b): _plus($a;$b);
def _plus($a;$b): [ myplus($a;$b) ];
We might expect that the expression `1+2` would now evaluate to `[3]` but, because the constant-folding occurs before the new definition becomes effective, it will instead evaluate to `3`.