To bash or not to bash - jq
Sebastian K.
Lead Software Engineer || Telco, Consulting, Finance, Research, Energy Sector
With so many options at a programmer's disposal, one may wonder whether it is still worth using the Bourne Again Shell, or simply bash. After all, one may use Go, Python, Node, or some other fancy tooling. So why bash?
Simple shell programs are extremely easy and natural to write and allow you to effortlessly glue together any command line tools available in your system. Hence, a shell program allows you to rapidly prototype ideas.
This series introduces a couple of bash concepts and shows how to apply them to a few command line tools that are useful for everyday tasks.
Today, we take a look at jq.
jq - sed for JSON data
The first tool to showcase is jq. jq lets you perform all kinds of operations on JSON data, which makes it an excellent choice when all you want is to perform some simple tasks on JSON data without reaching for more sophisticated tools.
Given the simplified service descriptor below:
service.json
{
  "name": "my-fancy-service",
  "group": "smart",
  "version": "1.0.0",
  "type": "java-backend-rest",
  "ports": {
    "service_port": 8080,
    "mgmnt_port": 9080,
    "debug_port": 6080
  }
}
Let's say you want to read the value of the property `name` and replace all occurrences of '-' with '_'. You can do it using the command line:
file="./service.json"                                   # (1)
name=$(jq '.name' <<< "$(cat "$file")")                 # (2)
port=$(jq '.ports.service_port' <<< "$(cat "$file")")   # (3)
echo "${name//-/_}"                                     # (4) prints "my_fancy_service"
What happens here? In (1) we do a simple variable assignment. "Unlike many other programming languages, Bash does not segregate its variables by "type." Essentially, Bash variables are character strings, but, depending on context, Bash permits arithmetic operations and comparisons on variables. The determining factor is whether the value of a variable contains only digits." [1] In this case, we simply assign a character string representing a relative path to our JSON data above.
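The untyped behaviour described in the quote can be seen in a minimal sketch. The variable names below are made up for illustration and are not part of the original example.

```shell
# Every value is stored as a character string...
port="8080"
label="port-$port"          # string context: plain concatenation

# ...but a digits-only value can also be used in arithmetic context.
next_port=$((port + 1))

echo "$label"       # port-8080
echo "$next_port"   # 8081
```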
In (2) we run jq on the content of the file referenced by $file in order to read the value of the property 'name'. The command substitution $(...) captures the output of jq, which is then assigned to the variable name. Note the usage of <<<, which is called a here-string. It has the form COMMAND <<< $WORD, where $WORD is expanded and the result is passed to the standard input (stdin) of COMMAND, in our case jq. Using the dot notation we can easily navigate through the document to filter values, as seen in (3).
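As a small, self-contained illustration of the here-string mechanism, the sketch below feeds an inline JSON string (a made-up stand-in for service.json) to jq. It also uses jq's -r flag, which the example above does not use, to print a string value without the surrounding JSON quotes.

```shell
# Hypothetical inline document standing in for service.json.
json='{"name": "my-fancy-service", "ports": {"service_port": 8080}}'

# The here-string expands "$json" and passes the result to jq's stdin.
quoted_name=$(jq '.name' <<< "$json")        # JSON-encoded string, quotes included
raw_name=$(jq -r '.name' <<< "$json")        # -r prints the raw string
port=$(jq '.ports.service_port' <<< "$json") # dot notation into a nested object

echo "$quoted_name"   # "my-fancy-service"
echo "$raw_name"      # my-fancy-service
echo "$port"          # 8080
```

Whether the quotes are wanted depends on what consumes the value: they are harmless when the result is embedded into another JSON document, but usually unwanted in file names or log messages.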
Finally, in (4) we use the built-in command echo to write its argument to standard output (stdout). Bash also supports a number of variable expansions and substring replacement mechanisms, such as ${var//pattern/replacement}. Upon expansion, all occurrences of '-' are replaced with '_'. You can read more about parameter substitution in [2].
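To make the difference between the replacement forms concrete, here is a short sketch using the service name from the example. The variable names are chosen for illustration only.

```shell
name="my-fancy-service"

first_only="${name/-/_}"    # single slash: replace the first match only
all="${name//-/_}"          # double slash: replace every match
no_prefix="${name#my-}"     # '#' strips the shortest matching prefix

echo "$first_only"   # my_fancy-service
echo "$all"          # my_fancy_service
echo "$no_prefix"    # fancy-service
```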
What if we want to do something more complicated, like ensuring that a default value is present for a certain property, e.g. { "platform": "fancyworld" }, or computing a default value based on other values in the document?
Again, jq offers some powerful concepts to model such queries:
resolver=$(cat <<EOF
.
| if .platform then . else . += {"platform": "fancyworld"} end
| if .url then . else . += {"url": "https://example.com/repo/\(.group)/\(.name)" } end
EOF
)
resolved_json=$(jq "$resolver" "$file")
echo "$resolved_json"
This snippet requires an explanation. First, I use a special-purpose code block known as a here document [3] to define a multi-line string that is fed to the command cat using a form of I/O redirection. The end of the multi-line string is marked by the delimiter EOF. The result is assigned to the variable resolver.
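One detail worth knowing when a jq program is stored in a here document is that Bash expands variables inside the body unless the delimiter is quoted. The following sketch, with made-up variable names, contrasts the two forms; it is an illustration of the mechanism rather than part of the original example.

```shell
group="smart"

# Unquoted delimiter: $group is expanded by Bash before cat sees the text.
expanded=$(cat <<EOF
group is $group
EOF
)

# Quoted delimiter: the body is passed through literally.
literal=$(cat <<'EOF'
group is $group
EOF
)

echo "$expanded"   # group is smart
echo "$literal"    # group is $group
```

A jq program that uses jq variables such as $platform would therefore typically call for the quoted form, so that Bash leaves the dollar signs alone.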
The multi-line string starts with the jq expression '.', which takes the input and produces it unchanged as output. The output is fed to two filters sequentially. The first filter checks whether the property .platform exists (more precisely, whether its value is neither null nor false). If it does not, we add the property with a default value to the document. The second filter is a bit more interesting, as it shows how to reference other values from the same document. Here, we reference the values of the properties .group and .name using the string interpolation syntax \(.property) to build the value of the property url in case it is missing.
Finally, we apply our jq expression to the JSON data previously defined to compute the following result:
service.json (resolved output)
{
  "name": "my-fancy-service",
  "group": "smart",
  "version": "1.0.0",
  "type": "java-backend-rest",
  "ports": {
    "service_port": 8080,
    "mgmnt_port": 9080,
    "debug_port": 6080
  },
  "platform": "fancyworld",
  "url": "https://example.com/repo/smart/my-fancy-service"
}
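The same defaults can probably be expressed more compactly. The sketch below is an alternative to the if/then/else filters, not the approach used above: it assumes jq's alternative-assignment operator //= and the --arg option for passing a shell value into the jq program. The inline document and the variable default_platform are made up for illustration.

```shell
# Hypothetical inline document standing in for service.json.
json='{"name": "my-fancy-service", "group": "smart"}'
default_platform="fancyworld"   # assumed default, taken from the example above

# --arg binds the shell value to the jq variable $platform.
# "a //= b" keeps a if it is neither null nor false, otherwise assigns b.
resolved=$(jq -c --arg platform "$default_platform" '
  .platform //= $platform
  | .url //= "https://example.com/repo/\(.group)/\(.name)"
' <<< "$json")

echo "$resolved"
```

Because the program is wrapped in single quotes, Bash does not touch $platform or the \(...) interpolation. Like the if-based filters, //= treats a property whose value is null or false as missing.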
As you can see, Bash allows you to glue together different commands such as jq or cat to run simple experiments in a matter of seconds or minutes.
If you want to learn more about Bash, I also recommend reading about the Bash architecture in [4].
References
[1] https://tldp.org/LDP/abs/html/untyped.html
[2] https://www.tldp.org/LDP/abs/html/parameter-substitution.html
[3] https://tldp.org/LDP/abs/html/here-docs.html
[4] https://www.aosabook.org/en/bash.html