This documentation describes an old release, version 3.3.5. Documentation for the latest release, 3.6.2, can be found here.

JSON¶

Zuar Runner’s JSON plugin pipes data from any JSON file into a relational database.

Example use case¶

Let’s take a simple JSON file with nested data and convert the data into tables in a relational database.

Example JSON file¶

Here’s what the source JSON file (people_pets.json) looks like:

[
  {
    "id": 1,
    "name": "Justin",
    "pets": [
      {
        "id": 1,
        "name": "Bear",
        "pet": {
          "type": "dog",
          "breed": "Goldendoodle"
        }
      },
      {
        "id": 2,
        "name": "Birdie",
        "pet": {
          "type": "dog",
          "breed": "Goldendoodle"
        }
      }
    ]
  },
  {
    "id": 2,
    "name": "Matt",
    "pets": []
  },
  {
    "id": 3,
    "name": "Ben",
    "pets": [
      {
        "id": 3,
        "name": "Zuca",
        "pet": {
          "type": "dog",
          "breed": "Cavapoo"
        }
      }
    ]
  }
]

Expected Relational Database Results¶

Because the JSON file has nested data in the pets array ([ ]), we will need to create two tables in the database.

Table 1 will include top level key/value pairs: id, name. The resulting table should have 3 rows.
Table 2 will include data in the pets array (id, name, and pet object’s data). Table 2 will also include data from the parent object (id) so we can join the two database tables. The resulting table should have 3 rows.

Table 1 expected result:

id	name
1	Justin
2	Matt
3	Ben

Table 2 expected result:

id	name	pet__type	pet__breed	person_id
1	Bear	dog	Goldendoodle	1
2	Birdie	dog	Goldendoodle	1
3	Zuca	dog	Cavapoo	3

Create a Zuar Runner JSON Job¶

The JSON file(s) must exist in Zuar Runner’s file system. Use the file manager to manually add a JSON file to Zuar Runner.

The JSON input requires creating a job by hand, so create a job using the Generic plugin. Set the job’s type to io.

Example Zuar Runner Job Configurations¶

Table 1 Zuar Runner job configuration:

{
    "input": {
        "use": "flatfile.iov2#JsonInput",
        "source": "/var/mitto/data/people_pets.json"
    },
    "output": {
        "tablename": "people",
        "use": "call:mitto.iov2.db#todb",
        "schema": "json",
        "dbo": "postgresql://db/analytics"
    },
    "steps": [
        {
            "use": "mitto.iov2.steps#Input",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#ExtraColumnsTransform"
                },
                {
                    "use": "mitto.iov2.transform#ColumnsTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CreateTable"
        },
        {
            "use": "mitto.iov2.steps#Output",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#FlattenTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CollectMeta"
        }
    ]
}

This example job take the source JSON file (people_pets.json) and creates a database table (json.people) in Zuar Runner’s internal PostgreSQL database (localhost). It creates columns from all the “top level” keys excluding the arrays ([ ]).

Table 2 Zuar Runner job configuration:

{
    "input": {
        "source": "/var/mitto/data/people_pets.json",
        "use": "flatfile.iov2#JsonInput"
    },
    "output": {
        "dbo": "postgresql://db/analytics",
        "schema": "json",
        "tablename": "people__pets",
        "use": "call:mitto.iov2.db#todb"
    },
    "steps": [
        {
            "use": "mitto.iov2.steps#Input",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#PluckV2Transform",
                    "jpath": "$.pets[*]",
                    "members": [
                        {
                            "name": "person_id",
                            "value": "$.id"
                        }
                    ]
                },
                {
                    "use": "mitto.iov2.transform#ExtraColumnsTransform"
                },
                {
                    "use": "mitto.iov2.transform#ColumnsTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CreateTable"
        },
        {
            "use": "mitto.iov2.steps#Output",
            "transforms": [
                {
                    "use": "mitto.iov2.transform#FlattenTransform"
                }
            ]
        },
        {
            "use": "mitto.iov2.steps#CollectMeta"
        }
    ]
}

This example job take the source JSON file (people_pets.json) and creates a database table (json.people__pets) in Zuar Runner’s internal PostgreSQL database (localhost). Based on the jpath and members parameters in the steps, it creates columns from all the keys inside the pets array and the “top level” id key.

Jpath¶

The jpath parameter let’s you define exactly what section of the source JSON file to pull from. This is necessary for Table 2 because of the nested data in the pets array. Learn more about JSONPath expressions below.

Members¶

The members parameter let’s you add additional columns from the source JSON file to the resulting database table. In our case, we need to add the id of each person to the people__pets table in order for us to join the two tables together. members also uses jpath in it’s value key, and you can name the resulting column by adjusting the name key.

JSONPath Expressions¶

To understand how to use JSONPath to pick specific sections of data out of a JSON object you can visit: https://goessner.net/articles/JsonPath/

To interactively learn how to use JSONPath syntax you can visit: https://jsonpath.com/