# JSON Input

Use the **JSON Input** step to read data from JSON structures, files, or incoming fields.

The step uses a [JSONPath](https://github.com/json-path/JsonPath) expression to extract data and output rows. JSONPath expressions can use either dot notation or square bracket notation.

### Step name

* **Step name**: Specify the unique name of the JSON Input step on the canvas. You can customize the name or leave the default.

You can use **Preview rows** to display the rows generated by this step. The JSON Input step determines output rows based on the options you set. Preview helps you confirm your configuration returns the rows you expect.

### Options

The JSON Input step includes the following tabs:

* **File**
* **Content**
* **Fields**
* **Additional output fields**

#### File tab

Use the **File** tab to specify the JSON source.

| Option                             | Description                                                            |
| ---------------------------------- | ---------------------------------------------------------------------- |
| **Source is from a previous step** | Select to retrieve the source from a field created by a previous step. |

When selected, the following fields are available:

* **Select field**
* **Use field as file names**
* **Read source as URL**
* **Do not pass field downstream**

When cleared, the following fields are available:

* **File or directory**
* **Regular expression**
* **Exclude regular expression**
* **Selected files** | | **Select field** | Field name to use as a source from a previous step. | | **Use field as file names** | Select to treat the source field value as a file name. | | **Read source as URL** | Select to access the source as a URL. | | **Do not pass field downstream** | Select to remove the source field from the output stream.

This improves performance and memory usage for large JSON fields. | | **File or directory** | Source location when the source is not defined in a field.

Select **Browse** to select a file or directory, and then select **Add** to add it to **Selected files**. | | **Regular expression** | Regular expression that matches file names within the specified directory. | | **Exclude regular expression** | Regular expression that excludes file names within the specified directory. |

**Selected files table**

The **Selected files** table lists files or directories used as sources.

You populate the table by selecting **Add** after you specify **File or directory**. The step attempts to connect to the specified location when you select **Add**.

Columns:

* **File/Directory**: Source location.
* **Wildcard (RegExp)**: Wildcards from **Regular expression**.
* **Exclude wildcard**: Wildcards from **Exclude regular expression**.
* **Required**: Whether the location is required.
* **Include subfolders**: Whether subfolders are included.

Actions:

* **Delete**: Removes a source.
* **Edit**: Removes a source and returns it to **File or directory**.
* **Show filename(s)**: Displays file names successfully connected to the step.

#### Content tab

Use the **Content** tab to configure which data to retrieve.

| Option                                | Description                                                                                      |
| ------------------------------------- | ------------------------------------------------------------------------------------------------ |
| **Ignore empty file**                 | Select to skip empty files. If cleared, empty files cause the transformation to fail.            |
| **Do not raise an error if no files** | Select to continue when no files are available to process.                                       |
| **Ignore missing path**               | Select to continue processing when (1) no fields match the JSON path or (2) all values are null. |

If cleared, the step stops processing rows when an error occurs. | | **Default path leaf to null** | Select to return null values for missing paths. | | **Limit** | Limit on the number of records generated by the step. Set to `0` for no limit. | | **Include filename in output** | Select to add a string field containing the file name in the output. | | **Rownum in output** | Select to add an integer field containing the row number in the output. | | **Add filenames to result** | Select to add processed files to the result file list. |

#### Fields tab

Use the **Fields** tab to define which values to extract from the JSON structure.

| Column   | Description                                                  |
| -------- | ------------------------------------------------------------ |
| **Name** | Output field name that maps to the corresponding JSON value. |
| **Path** | JSONPath for the value.                                      |

You can retrieve all records by adding an asterisk (`*`) in the path. Example: `$.mydata.*` | | **Type** | Output field data type. | | **Format** | Optional mask for converting the original value format. See [Common Formats](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/common-formats). | | **Length** | Field length. | | **Precision** | Number of digits after the decimal point for number fields. | | **Currency** | Currency symbol (for example, `$` or `€`). | | **Decimal** | Decimal symbol (period `.` or comma `,`). | | **Group** | Grouping symbol (comma `,` or period `.`). | | **Trim type** | Trim method to apply to a string. | | **Repeat** | Repeats the corresponding value from the last row when a row is empty. |

Select **Select Fields** to populate the table with fields derived from the source.

For general guidance on field metadata, see [Understanding PDI data types and field metadata](https://docs.pentaho.com/pdia-data-integration/understanding-pdi-data-types-and-field-metadata).

**Select fields**

Select **Select Fields** to open the Select Fields window.

Select the checkbox next to each field in your source file you want to include in output. Selected fields are added to the table.

You can search for a field name using the **Search** box.

![Select fields](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-e8623993fa55b01dba98c089473f2bdb1daf026b%2FPDI_JSON_Input_SelectFields_dlg.png?alt=media)

#### Additional output fields tab

Use the **Additional output fields** tab to output additional file metadata.

| Option                      | Description                                                                     |
| --------------------------- | ------------------------------------------------------------------------------- |
| **Short filename field**    | Field that contains the file name without the path but including the extension. |
| **Extension field**         | Field that contains the file extension.                                         |
| **Path field**              | Field that contains the path in operating system format.                        |
| **Size field**              | Field that contains the file size.                                              |
| **Is hidden field**         | Field that indicates whether the file is hidden (Boolean).                      |
| **Last modification field** | Field that contains the last modified date.                                     |
| **Uri field**               | Field that contains the URI.                                                    |
| **Root uri field**          | Field that contains the root part of the URI only.                              |

### Examples

Sample transformations in `design-tools/data-integration/samples/transformations`:

* `JsonInput - read a dynamic file.ktr`
* `JsonInput - read a file.ktr`
* `JsonInput - read incoming stream.ktr`

Sample JSON file used for the Select Fields dialog:

`data-integration/samples/transformations/files/JSON - read nested fields.js`


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/json-input.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
