This is part 2 of a series of posts on Using HTTP APIs on the command line:

What is jq?

jq is a command-line JSON stream editor. It’s like awk for JSON.

It is one of my favorite command line tools, it is great at one thing, parsing JSON and composes well with other command line utilities (curl, cat, awk, etc).

It has an excellent manual on the web (most of which is also in the man jq page).

There is also an online “playground” called jqplay where you can try out jq input with expressions to see what the output would be without installing.

Installation

Quick install instructions:

OSX, use homebrew

brew install jq

Debian/Ubuntu Linux:

sudo apt-get install jq

Fedora:

sudo dnf install jq

Full instructions, including downloadable binaries for all platforms, are on the jq site.

jq basics

simple value extraction

jq has a concise mini-programming language for selecting and transforming JSON. The most basic expression is the . path selector, which simply selects the root of the current document. By itself it will work as a pretty-printer:

echo '{"foo": "bar", "baz": [1, 2, 3]}' | jq '.'
{
  "foo": "bar",
  "baz": [
    1,
    2,
    3
  ]
}

You can add the name of a field to narrow what jq will select:

echo '{"foo": "bar", "baz": [1, 2, 3]}' | jq '.foo'
"bar"

A nested value can be accessed with dot-notation, just like many programming languages:

echo '{"a": {"b": {"c": {"d": "value"}}}}' | jq '.a.b.c.d'
"value"

Output Options

Compact Output (reverse-pretty printing) with -c

This example uses a heredoc to give jq pretty-printed JSON and having it compact it to a single line:

input:

cat <<EOF | jq -c '.'
{
  "foo": "bar",
  "baz": [
    1,
    2,
    3
  ]
}
EOF

output:

{"foo":"bar","baz":[1,2,3]}

Raw String Output with -r

By default, if the result is a string, jq will emit it with double-quotes around it:

echo '{"foo": "bar", "baz": [1, 2, 3]}' | jq '.foo'
"bar"

If you want those quotes stripped off (maybe for piping to another command), you can use -r for raw string output:

echo '{"foo": "bar", "baz": [1, 2, 3]}' | jq -r '.foo'
bar

Output Sorted JSON Keys with -S

If you want the JSON that is emitted to have sorted keys (maybe for diffing/comparing to other output), you

echo '{"z": "fourth", "a": "first", "y": "third", "b": "second"}' | jq -S '.'
{
  "a": "first",
  "b": "second",
  "y": "third",
  "z": "fourth"
}

Formatting/Constructing Output

jq’s selectors (like . and .foo) will emit whatever matches their expression, but those selectors can be surrounded with valid JSON to create remixed JSON output.

Creating JSON Objects

You can also create remixed JSON objects from your input. Here we are moving the fields in the user object to the top-level:

cat <<EOF | jq '{first: .user.first, last: .user.last, username: .username}'
{
  "user": {
    "first": "Ted",
    "last": "Naleid"
  },
  "username": "tnaleid"
}
EOF
{
  "first": "Ted",
  "last": "Naleid",
  "username": "tnaleid"
}

Creating Arrays

if you want to turn a couple of JSON object values into an array, just use [] in your expression:

echo '{"first": "Ted", "last": "Naleid"}' | jq -c '[.first]'
["Ted"]

You can use multiple selectors in your expression to have multiple values in your output:

echo '{"first": "Ted", "last": "Naleid"}' | jq -c '[.first, .last]'
["Ted","Naleid"]

Other Selectors

Array Selectors

Each element in an array can be emitted with [], this will “unwrap” the array:

echo '[1, 2, 3]' | jq '.[]'
1
2
3

It can also be done on arrays that are nested:

echo '{"ids":[1, 2, 3]}' | jq '.ids[]'
1
2
3

If you know you want a particular index or slice of an array, you can pass values to the [].

To get the first element out of an array pass a 0 (it is zero-based):

echo '[1, 2, 3]' | jq '.[0]'
1

Negative offsets can be used to get elements from the end of an array whose length you are unsure of:

echo '[1, 2, 3, 4, 5, 6, 7]' | jq '.[-1]'
7
echo '[1, 2, 3, 4, 5, 6, 7]' | jq '.[-3]'
5

Array slices can be created by giving the start and end element you want, here we get the “tail” of the array by getting the 2nd element (1 as it is zero-based) to the last element in the array:

echo '[1, 2, 3, 4, 5, 6, 7]' | jq -c '.[1:-1]'
[2,3,4,5,6]

If you want specific values, you can specify those with comma separated indexes:

echo '[1, 2, 3, 4, 5, 6, 7]' | jq '.[1, 3, 5]'
2
4
6

keys Selector

If you want to a list all of the keys in an object, use the keys selector:

echo '{"z": "fourth", "a": "first", "y": "third", "b": "second"}' | jq -c 'keys'
["a","b","y","z"]

Composing Filters with |, , and ()

Just like many shells, jq has a “pipe” mechanism that allows the results of a selector to be piped to another jq function. Every result from the expression on the left of the | is fed into the expression on the right.

The expression .a.b.c is equivalent to the expression .a | .b | .c:

echo '{"a": {"b": {"c": "value"}}}' | jq '.a.b.c'
"value"
echo '{"a": {"b": {"c": "value"}}}' | jq '.a | .b | .c'
"value"

A , is used to separate filters, and the results so far will be composed with each filter:

echo '{"a": {"b": {"c1": "value", "c2": "other"}}}' | jq '.a | .b | .c1, .c2'
"value"
"other"

Parenthesis can be used to group filters together:

echo '{"a1": {"b": {"c": "value"}}, "a2": "other"}' | jq '(.a1 | .b | .c), .a2'
"value"
"other"

using select to filter results

If you only want particular values, you can give a boolean expression to select and it will return only the matches:

echo '[1, 2, 3, 4]' | jq '.[] | select(. > 2)'
3
4

It returns the full matching object that it was given:

echo '[{"a": 1}, {"a": 2}, {"a": 3}, {"a": 4}]' | jq -c '.[] | select(.a > 2)'
{"a":3}
{"a":4}

This can be useful if the thing you’re testing isn’t the actual result you want:

echo '[{"a": 1, "b": "bad"}, {"a": 4, "b": "good"}]' | jq -r '.[] | select(.a > 2) | .b'
good

Using ? for values that might not exist

Some expressions will work fine with some input:

echo '{"ids": [{"id": 123}, {"id": 456}]}' | jq '.ids[].id'
123
456

But will blow up if the chain of values doesn’t exist:

echo '{}' | jq '.ids[].id'
jq: error (at <stdin>:1): Cannot iterate over null (null)

This can be gotten around using the ? to tell jq that the value might be null and to ignore it:

echo '{}' | jq '.ids[]?.id'

Values with Varying Types

Sometimes, when an API has a breaking change, a value that was a string has become an integer, if you want to ensure that you only find things that are integers, use the type in a select:

echo '[{"id": 123}, {"id": "456"}]' | jq '.[].id | select(type == "string")'
"456"
echo '[{"id": 123}, {"id": "456"}]' | jq '.[].id | select(type == "number")'
123

You can use | pipes inside the select as part of your boolean expression. This will modify what select evaluates, but will not modify what gets selected:

echo '[{"id": 123}, {"id": "456"}]' | jq '.[] | select(.id | type == "number")'
{
  "id": 123
}

Valid types are number, boolean, array, object, null, string. There are also shortcut versions of these (and others) using the plural of the type:

echo '[{"id": 123}, {"id": "456"}]' | jq '.[].id | numbers'
123

.. Recursive Decent Value Selector

The .. selector will recursively descend down the currently selection and emit each:

echo '{"a": {"b": {"c": "value"}}}' | jq '.a | ..'
{
  "b": {
    "c": "value"
  }
}
{
  "c": "value"
}
"value"

This is most useful for combining with the select which can be used to filter output and search for matching structures:

cat <<EOF | jq -c '.. | select(.target? != null)'
{
  "object": {
     "target": "fromobject"
  },
  "values": [
    {
      "target": "fromarray"
    },
    {
      "ignored": "omitted"
    }
  ]
}
EOF
{"target":"fromobject"}
{"target":"fromarray"}

Create JSON from stdin that isn’t JSON

use -R to do raw input and then you can do this:

seq 3 | jq -R '{id: ., body: ({id: .} | @json)}'
{
  "id": "1",
  "body": "{\"id\":\"1\"}"
}
{
  "id": "2",
  "body": "{\"id\":\"2\"}"
}
{
  "id": "3",
  "body": "{\"id\":\"3\"}"
}

For more advanced versions where you want to potentially split up the arguments, you can use -Rn and input/inputs

seq 3 | jq -Rn 'inputs | {id: .}'
{
  "id": "1"
}
{
  "id": "2"
}
{
  "id": "3"
}

Working with Dates

An ISO-8601 date string can be turned into a unix time:

echo '{"created_date_time":"2017-08-17T19:02:51Z"}' | jq '.created_date_time | fromdate'
1502996571

given a stream of events that have created_date_time field like this:

{"id":"169","created_date_time":"2016-01-01T00:00:00Z"}
{"id":"170","created_date_time":"2017-08-17T19:02:51Z"}
{"id":"433","created_date_time":"2018-01-01T00:00:00Z"}

If we want to find all the ones that are less than or equal to “2017-08-17T19:02:51Z”, we can use this:

cat <<EOF | jq -c 'select((.created_date_time | fromdate) <= ("2017-08-17T19:02:51Z" | fromdate))'
{"id":"169","created_date_time":"2016-01-01T00:00:00Z"}
{"id":"170","created_date_time":"2017-08-17T19:02:51Z"}
{"id":"433","created_date_time":"2018-01-01T00:00:00Z"}
EOF
{"id":"169","created_date_time":"2016-01-01T00:00:00Z"}
{"id":"170","created_date_time":"2017-08-17T19:02:51Z"}

You can also use variables in jq like this to reuse a value:

cat <<EOF | jq -c '(.created_date_time | fromdate) as $created | select($created <= 1512996571 and $created > 1500000000)'
{"id":"169","created_date_time":"2016-01-01T00:00:00Z"}
{"id":"170","created_date_time":"2017-08-17T19:02:51Z"}
{"id":"433","created_date_time":"2018-01-01T00:00:00Z"}
EOF
{"id":"170","created_date_time":"2017-08-17T19:02:51Z"}

Debugging complex expressions with debug

Using debug in your pipeline expression will emit what the current value of something is and can be useful for figuring out what is causing errors:

This example shows us why the select that is looking for numbers we are picking 123 but not "456":

echo '[{"id": 123}, {"id": "456"}]' | jq '.[] | select(.id | debug | type == "number")'
["DEBUG:",123]
{
  "id": 123
}
["DEBUG:","456"]

Output in CSV/TSV format

It can be useful to emit values other than JSON. The first step is to emit the results as an array:

cat <<EOF | jq '[.id, .value]'
{"id":"169","value": 1}
{"id":"170","value": 2}
{"id":"433","value": 3}
EOF
[
  "169",
  1
]
[
  "170",
  2
]
[
  "433",
  3
]

Then, pipe that array either through the @tsv or @csv formatter to get valid output (also use -r to get the “raw” output):

cat <<EOF | jq -r '[.id, .value] | @csv'
{"id":"169","value": 1}
{"id":"170","value": 2}
{"id":"433","value": 3}
EOF
"169",1
"170",2
"433",3
cat <<EOF | jq -r '[.id, .value] | @tsv'
{"id":"169","value": 1}
{"id":"170","value": 2}
{"id":"433","value": 3}
EOF
169	1
170	2
433	3

I’ve often turned jq output into tab-separated output for later processing with other unix command tools like awk, sort, uniq etc

Working with Objects That Want to be Arrays

Sometimes, you’re given an object that has keys and values that you want to work with as if they are an array of key/value pairs, the to_entries operator can help:

cat <<EOF | jq -r '.states | to_entries[] | [.key, .value.at] | @tsv'
{
  "states": {
    "running": {
      "at": "2017-12-14T18:08:11Z"
    },
    "stopped": {
      "at": "2018-01-01T00:08:11Z"
    }
  }
}
EOF
running	2017-12-14T18:08:11Z
stopped	2018-01-01T00:08:11Z

Further Resources

There are many features and uses of jq that I haven’t documented here, I’ve only scratched the surface of what it can do. Here are some further links for reading:

Using jq across many http requests

If you want to see how you can map over many http requests and process the results with jq check out the next blog post in this series.