rsonpath-specific behavior
We try to implement the JSONPath spec as closely as possible. There are
currently two major differences between rsonpath’s JSONPath and the standard.
Nested descendant segments
The standard semantics of the descendant segment lead to duplicated results,
and a potentially exponential blowup in execution time and output size.
In rsonpath we diverge from the spec to guarantee duplicate-free results:
$ rq '$..a..a' --json '{ "a": { "a": { "a": 42 } } }'
{ "a": 42 }
42
In standard semantics the value 42 would be matched twice [1].
Unicode
Currently rsonpath compares JSON keys bytewise, meaning that labels using
Unicode escape sequences will be handled incorrectly.
For example, the key "a" can be equivalently represented by "\u0061".
$ rq '$["a"]' --json '{ "a": 42 }'
42
The above results should be the same if either or both of the "a" characters
were replaced with the "\u0061" Unicode escape sequence, but rsonpath does
not support it at this time. This limitation is known and tracked at
#117.
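The difference between decoded and bytewise comparison can be illustrated with a small Python sketch. It is only a model of the problem, not rsonpath's actual matching code; the helper `bytewise_has_key` is a hypothetical name introduced for this illustration.

```python
import json

# The same JSON document twice: once with a plain key, once with the key
# written as a Unicode escape. The doubled backslash keeps the escape
# literal in the Python string, so the JSON text really contains \u0061.
plain = '{ "a": 42 }'
escaped = '{ "\\u0061": 42 }'

# A JSON parser decodes the escape, so both documents are equal.
decoded_equal = json.loads(plain) == json.loads(escaped)


def bytewise_has_key(raw: str, key: str) -> bool:
    """Hypothetical naive check: search for the quoted key as raw bytes."""
    return f'"{key}"'.encode("utf-8") in raw.encode("utf-8")


print(decoded_equal)                   # True
print(bytewise_has_key(plain, "a"))    # True
print(bytewise_has_key(escaped, "a"))  # False: the raw bytes differ
```

A comparison that works on raw bytes sees two different keys, even though both documents describe the same JSON value.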
[1] The reason behind this is a bit subtle. The standard defines the result
as the concatenation of two lists: the results of executing the rest of the
query after a descendant segment, and the results of recursively executing
the entire query. So the inner "a" key in the example is matched first when
evaluating the outermost one, and then again when evaluating the middle "a".
We consider this to be counter-intuitive and undesirable.
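To make the duplication concrete, here is a minimal Python sketch that models the list-concatenation semantics for the query `$..a..a` on the example document. It is an illustration under simplifying assumptions (only name selectors, nodes identified by their path from the root), not rsonpath's implementation. The functions `descendants`, `descendant_name`, and `dedup` are names made up for this sketch.

```python
import json


def descendants(node, path=()):
    """Yield (path, value) for the node itself and every value nested in it."""
    yield path, node
    if isinstance(node, dict):
        for key, child in node.items():
            yield from descendants(child, path + (key,))
    elif isinstance(node, list):
        for index, child in enumerate(node):
            yield from descendants(child, path + (index,))


def descendant_name(nodes, name):
    """Model `..name`: for each input node, visit it and all of its
    descendants, and select the member called `name` wherever it exists.
    The per-node result lists are concatenated, so duplicates survive."""
    out = []
    for path, node in nodes:
        for dpath, value in descendants(node, path):
            if isinstance(value, dict) and name in value:
                out.append((dpath + (name,), value[name]))
    return out


def dedup(nodes):
    """Keep only the first occurrence of every path, preserving order."""
    seen = set()
    unique = []
    for path, value in nodes:
        if path not in seen:
            seen.add(path)
            unique.append((path, value))
    return unique


doc = json.loads('{ "a": { "a": { "a": 42 } } }')
root = [((), doc)]

standard = descendant_name(descendant_name(root, "a"), "a")
print([value for _, value in standard])         # [{'a': 42}, 42, 42]
print([value for _, value in dedup(standard)])  # [{'a': 42}, 42]
```

The first `..a` selects all three "a" nodes. The second `..a` is then run from both the outermost and the middle one, and each run reaches the innermost 42. Removing repeated paths gives the two results shown in the `rq` example above.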