How to handle JSON like a boss with jq
JSON data is ubiquitous, constantly flowing between web services. But when you have a largish blob of the stuff, how do you inspect its structure or quickly extract the piece you need?
Enter the handy little utility jq. jq is a tool to query, filter, and reshape JSON: your JSON Swiss Army knife.
Let's dive into how it works. As our example JSON data we’ll be using the list of IP address ranges for AWS services, which AWS publishes as ip-ranges.json.
Here is a sample from the head of the ip-ranges.json file:
{
  "syncToken": "1563369545",
  "createDate": "2019-07-17-13-19-05",
  "prefixes": [
    {
      "ip_prefix": "18.208.0.0/13",
      "region": "us-east-1",
      "service": "AMAZON"
    },
    {
      "ip_prefix": "52.95.245.0/24",
      "region": "us-east-1",
      "service": "AMAZON"
    },
One of the simplest things we can do with jq is access a property using the dot (.) operator. Let's get the creation date of the file:
jq .createDate ip-ranges.json
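To try this without downloading the real file, here is a minimal sketch. It assumes jq is installed and uses a trimmed stand-in for ip-ranges.json built from the sample above (the closing brackets are added so that it parses). It also shows the -r (raw output) flag, which prints a string without its JSON quotes.

```shell
# Trimmed stand-in for ip-ranges.json; only the fields shown in the sample.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "syncToken": "1563369545",
  "createDate": "2019-07-17-13-19-05",
  "prefixes": [
    { "ip_prefix": "18.208.0.0/13", "region": "us-east-1", "service": "AMAZON" },
    { "ip_prefix": "52.95.245.0/24", "region": "us-east-1", "service": "AMAZON" }
  ]
}
EOF

# By default jq emits JSON, so a string result keeps its double quotes.
created=$(jq '.createDate' "$sample")
echo "$created"

# With -r the bare string is printed, which is handier in shell scripts.
created_raw=$(jq -r '.createDate' "$sample")
echo "$created_raw"

rm -f "$sample"
```

The quoted form is what you want when feeding the result to another JSON tool; the raw form is what you want in a shell variable.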
Another handy feature is that jq pretty-prints its output, so the identity filter . can be used alone to prettify a JSON file.
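As a small sketch of that round trip, assuming jq is installed: the one-line object below is a made-up fragment in the style of the sample, and -c (compact output) is the flag that undoes the pretty-printing.

```shell
# A compact, single-line JSON object (illustrative fragment).
blob='{"region":"us-east-1","service":"AMAZON"}'

# The identity filter "." passes the input through unchanged;
# jq's default formatting spreads it over indented lines.
pretty=$(printf '%s\n' "$blob" | jq '.')
echo "$pretty"

# -c asks for compact output, one JSON value per line.
compact=$(printf '%s\n' "$pretty" | jq -c '.')
echo "$compact"
```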
Let's try something more fun, and more “query”-like: let's create a list of all AWS regions:
jq '[.prefixes[].region, .ipv6_prefixes[].region] | unique' ip-ranges.json
Let's break that down. The .prefixes part makes sense: it is another property access like in the first sample. .prefixes is an array of objects, which we then iterate over with [], which should remind one of JSON’s own array syntax. For each object in the array we then pull out the .region key. Then things get cooler. We have two streams of regions that we would like to combine, which can be done with the comma (,) operator. We then end up with one long stream of values, which can be converted back into an array by surrounding the comma expression with another [].
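The three steps can be watched one at a time on a tiny input. This is a sketch, assuming jq is installed; the file below is a stand-in, and its single ipv6_prefixes entry (a documentation prefix in eu-west-1) is invented purely for illustration.

```shell
# Stand-in data: two IPv4 entries from the sample, one invented IPv6 entry.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "prefixes": [
    { "ip_prefix": "18.208.0.0/13", "region": "us-east-1", "service": "AMAZON" },
    { "ip_prefix": "52.95.245.0/24", "region": "us-east-1", "service": "AMAZON" }
  ],
  "ipv6_prefixes": [
    { "ipv6_prefix": "2001:db8::/32", "region": "eu-west-1", "service": "AMAZON" }
  ]
}
EOF

# Step 1: iterate the array and pull out a key.
# The result is a stream of separate values, one per line, not an array.
stream=$(jq -c '.prefixes[].region' "$sample")
echo "$stream"

# Step 2: the comma runs both filters on the same input
# and emits the first stream followed by the second.
both=$(jq -c '.prefixes[].region, .ipv6_prefixes[].region' "$sample")
echo "$both"

# Step 3: wrapping the expression in [] collects the stream into one array.
collected=$(jq -c '[.prefixes[].region, .ipv6_prefixes[].region]' "$sample")
echo "$collected"

rm -f "$sample"
```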
Notice the placement of the single quotes in the statement above: we are not taking the output of jq and then using a Unix pipe to send it to uniq. Rather, jq includes a built-in idea of pipes, and many useful functions. Here we are piping our 2,233-element array (cough, jq '[.prefixes[].region, .ipv6_prefixes[].region] | length' ip-ranges.json) to jq’s unique, which returns a sorted list of regions with duplicates removed.
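To make the difference between the two kinds of pipe concrete, here is a short sketch, assuming jq is installed. The five-element array is made up; the point is only that the | inside the quotes belongs to jq, while the | outside them belongs to the shell.

```shell
# A small made-up array of regions with duplicates.
regions='["us-west-2","us-east-1","us-west-2","eu-west-1","us-east-1"]'

# jq's length counts the elements of the array.
count=$(printf '%s\n' "$regions" | jq 'length')
echo "$count"

# jq's unique sorts the array and drops duplicates.
uniq_regions=$(printf '%s\n' "$regions" | jq -c 'unique')
echo "$uniq_regions"

# Inside the quotes, | is jq's own pipe: unique feeds length.
uniq_count=$(printf '%s\n' "$regions" | jq 'unique | length')
echo "$uniq_count"

# The shell-side alternative: unpack the array into lines, then sort -u.
shell_unique=$(printf '%s\n' "$regions" | jq -r '.[]' | sort -u)
jq_unique=$(printf '%s\n' "$regions" | jq -r 'unique | .[]')
```

Both routes give the same set here, but staying inside jq keeps the result as JSON, so it can be piped into further jq filters.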
Let's do one more: what are all the current IPv4 address ranges for EC2 in us-west-2?
jq '.prefixes[] | select(.service=="EC2" and .region=="us-west-2") | .ip_prefix' ip-ranges.json
I’ll leave interpreting it as an exercise for the reader.
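If you want to check your reading against something small, here is a sketch that runs the same query on three invented entries. It assumes jq is installed; the prefixes are documentation address ranges rather than real AWS data, and -r is added only to drop the quotes from the output.

```shell
# Three invented entries: only the first is both EC2 and us-west-2.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{
  "prefixes": [
    { "ip_prefix": "192.0.2.0/24",    "region": "us-west-2", "service": "EC2" },
    { "ip_prefix": "198.51.100.0/24", "region": "us-east-1", "service": "EC2" },
    { "ip_prefix": "203.0.113.0/24",  "region": "us-west-2", "service": "S3" }
  ]
}
EOF

# select() passes an object through only when its condition is true.
matches=$(jq -r '.prefixes[] | select(.service=="EC2" and .region=="us-west-2") | .ip_prefix' "$sample")
echo "$matches"

rm -f "$sample"
```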
jq is really pretty awesome, but as usual, there is more than one way to do it. If jq isn’t quite your cup of tea, then there is an entire set of related tools:
- fx Run arbitrary JavaScript on JSON input. Standalone binaries available.
- gron Convert JSON to and from flat, greppable lists of “path=value” statements.
- jid Explore JSON interactively with filtering queries like jq.
- jj Query and modify values in JSON or JSON lines with a key path.
- jl Query and manipulate JSON using a tiny functional language.
- jp (jmespath) Query JSON with JMESPath expressions.
- jshon Create and manipulate JSON using getopt-style command-line options.
- json2 Convert JSON to and from flat, greppable lists of “path=value” statements.
- jsonaxe Create and manipulate JSON with a Python-based DSL. Inspired by jq.
- json Run arbitrary JavaScript on JSON input.
- json-table Convert nested JSON into CSV or TSV for processing in the shell.
- json.tool (Python 3 docs) Validate and pretty-print JSON. This module is part of the standard library of Python 2/3 and is likely to be available wherever Python is installed.
- lobar Explore JSON interactively or process it in batch with a wrapper for lodash.chain(). An alternative to jq with a JavaScript syntax.
- ramda-cli Manipulate data with the Ramda functional library, and either LiveScript or JavaScript syntax.
- RecordStream Create, manipulate, and output a stream of records, or JSON objects. Can retrieve records from an SQL database, MongoDB, Atom feeds, XML, and other sources.
- rq Create and manipulate structured data with a DSL inspired by Rust, C and JavaScript. Similar to jq. Supports JSON, YAML and TOML as well as binary formats like Apache Avro and MessagePack.