Retrieve the schema of a nested JSON object

Let's assume this is the original JSON file:

{    
    "name": "tom",
    "age": 12,
    "visits": {
        "2017-01-25": 3,
        "2016-07-26": 4,
        "2016-01-24": 1
    }
}

      

I want to receive:

[
  "age",
  "name",
  "visits.2017-01-25",
  "visits.2016-07-26",
  "visits.2016-01-24"
]

      

I can extract the keys using:

jq '. | keys' file.json

but that skips the nested fields. How do I include them?


3 answers


With your input, the invocation:

jq 'leaf_paths | join(".")'

      

produces:

"name"
"age"
"visits.2017-01-25"
"visits.2016-07-26"
"visits.2016-01-24"

      

If you want to include "visits", use paths. If you want the result as a JSON array, enclose the filter in square brackets: [...]
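
As a minimal sketch of the two variants just described, assuming jq 1.5 or later and a POSIX shell: the sample document from the question is held in a shell variable (the variable names are my own), paths(scalars) lists only the leaves, and plain paths also lists "visits". Wrapping each filter in [...] yields a single array, and -c prints it on one line.

```shell
# The sample document from the question, kept in a variable for convenience.
sample='{"name":"tom","age":12,"visits":{"2017-01-25":3,"2016-07-26":4,"2016-01-24":1}}'

# Leaf paths only, collected into one JSON array.
leaf_array=$(printf '%s' "$sample" | jq -c '[paths(scalars) | join(".")]')

# Every path, including the intermediate "visits" object.
all_array=$(printf '%s' "$sample" | jq -c '[paths | join(".")]')

echo "$leaf_array"
echo "$all_array"
```

The first array should hold the five leaf paths; the second should hold the same five plus "visits". Append | sort inside the brackets if you need a stable alphabetical order.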

If your input can include arrays, then unless you are using jq 1.6 or later, you need to convert the integer indices to strings explicitly. Also, since leaf_paths is deprecated, you may prefer to use its definition, paths(scalars). The result:

jq 'paths(scalars) | map(tostring) | join(".")'
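
To see what map(tostring) buys you, here is a small sketch on a hypothetical input that contains an array (the "tags" field is my own invention, not part of the question). The array indices 0 and 1 appear in the paths as numbers, and map(tostring) turns them into strings before join is applied.

```shell
# Hypothetical input with an array-valued field.
input='{"name":"tom","tags":["a","b"]}'

# -r prints raw strings, one path per line.
array_paths=$(printf '%s' "$input" | jq -r 'paths(scalars) | map(tostring) | join(".")')

echo "$array_paths"
# name
# tags.0
# tags.1
```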

      

allpaths

To include paths to null, you can use allpaths, defined like this:

def allpaths:
  def conditional_recurse(f):  def r: ., (select(.!=null) | f | r); r;
  path(conditional_recurse(.[]?)) | select(length > 0);

      

Example:



{"a": null, "b": false} | allpaths | join(".")

      

produces:

"a"
"b"

      

all_leaf_paths

Assuming jq version 1.5 or higher, we can get all_leaf_paths by following the strategy used in jq's builtin.jq, that is, by adding these definitions:

def allpaths(f):
  . as $in | allpaths | select(. as $p|$in|getpath($p)|f);

def isscalar:
  . == null or . == true or . == false or type == "number" or type == "string";

def all_leaf_paths: allpaths(isscalar);

      

Example:



{"a": null, "b": false, "object":{"x":0} } | all_leaf_paths | join(".")

      

produces:

"a"
"b"
"object.x"

      
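To make the difference concrete, here is a sketch that runs the built-in paths(scalars) and the all_leaf_paths definitions above on the same input. It assumes jq 1.5 or later; the shell variable names are my own. As far as I can tell, paths(scalars) drops the null and false leaves because select treats those values as falsy, whereas isscalar returns a proper boolean.

```shell
# The definitions from this answer, kept in a shell variable.
defs='
def allpaths:
  def conditional_recurse(f): def r: ., (select(.!=null) | f | r); r;
  path(conditional_recurse(.[]?)) | select(length > 0);
def allpaths(f):
  . as $in | allpaths | select(. as $p|$in|getpath($p)|f);
def isscalar:
  . == null or . == true or . == false or type == "number" or type == "string";
def all_leaf_paths: allpaths(isscalar);
'
input='{"a": null, "b": false, "object": {"x": 0}}'

# Built-in filter: only leaves whose value is neither null nor false.
builtin_leaves=$(printf '%s' "$input" | jq -r 'paths(scalars) | join(".")')

# Custom filter: every scalar leaf, including null and false.
custom_leaves=$(printf '%s' "$input" | jq -r "$defs"' all_leaf_paths | join(".")')

echo "paths(scalars): $builtin_leaves"
echo "all_leaf_paths:"
echo "$custom_leaves"
```

With this input, the first command should report only object.x, while the second reports a, b and object.x.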


This does what you want. It doesn't return the data in an array, but that should be an easy modification:

https://github.com/ilyash/show-struct



You can also check this page: https://ilya-sher.org/2016/05/11/most-jq-you-will-ever-need/


Some time ago I wrote a structural schema inference engine that produces simple structural schemas mirroring the JSON documents under consideration. For example, for the sample JSON given here, the inferred schema is:

{
  "name": "string",
  "age": "number",
  "visits": {
    "2017-01-25": "number",
    "2016-07-26": "number",
    "2016-01-24": "number"
  }
}

      

This is not exactly the format requested in the original post, but for large collections of objects, it provides a useful overview.
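
For a single document, a schema of this shape can be approximated in plain jq. The following is only a rough sketch, not the schema module from this answer: typeschema is a hypothetical helper name, it assumes jq 1.5 or later (for map_values), and it simply reports arrays as "array" instead of describing their elements.

```shell
sample='{"name":"tom","age":12,"visits":{"2017-01-25":3,"2016-07-26":4,"2016-01-24":1}}'

# typeschema (hypothetical helper): objects are walked key by key,
# every other value is replaced by its jq type name.
# -S sorts the keys and -c prints the result on one line.
inferred=$(printf '%s' "$sample" | jq -cS '
  def typeschema:
    if type == "object" then map_values(typeschema) else type end;
  typeschema')

echo "$inferred"
```

Unlike a real inference engine, this does not merge the schemas of several documents; it only mirrors the one document it is given.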

More importantly, there is now a complementary validator for checking whether a collection of JSON documents matches a structural schema. The validator checks documents against schemas written in JESS (JSON Extended Structural Schemas), a superset of the simple structural schemas (SSS) produced by the schema inference engine.

(The idea is that you can use SSS as a starting point for adding more complex constraints, including recursive constraints, intra-document referential constraints, etc.)

For reference, this is how the SSS for your sample JSON (saved here as source.json) would be generated using the "schema" module:

jq 'include "schema"; schema' source.json > source.schema.json

      

And to validate source.json against an SSS or ESS:

JESS --schema  source.schema.json  source.json

      
