# Read metadata from Copybook

The Read metadata from Copybook step reads a binary fixed-length copybook definition file and outputs file and column descriptor information as fields in the PDI stream.

You can use the output rows with [ETL metadata injection](/pdia-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection.md) to populate the [Copybook Input](/pdia-data-integration/pdi-transformation-steps-reference-overview/copybook-input-pdi-step.md) step. You can also use this step to create a metadata template for multiple data files or to create a data model for a relational database.

For more information, see [Copybook steps in PDI](/pdia-data-integration/extracting-data-into-pdi/copybook-steps-in-pdi-cp.md).

This step is required to use metadata injection with the Copybook Input step.

![Read metadata from Copybook step](/files/DMAkAfuLqd1wuKiV9U7X)

### General

* **Step name**: Specify the unique name of the step on the canvas. You can customize the name or leave it as the default.

### Schema

These options define the location of the copybook definition file and include mapping options for the binary data files.

| Option | Description |
| ------ | ----------- |
| **COBOL copybook file path** | File path to the copybook definition file. You can enter any VFS or SFTP file path, or select **Browse** to open the system file browser. After selecting a file, select **Validate** to verify that the definition file can be accessed and parsed. |
| **COBOL copybook line structure** | Line structure of the definition file:<ul><li>**Standard columns (6 to 72)**: Use when the file contains line numbers. The first 6 columns are ignored, and anything beyond column 72 is ignored.</li><li>**Full line**: Use when the file does not contain line numbers.</li></ul> |
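
The two line structures can be illustrated with a minimal Python sketch. It is not the step's actual parser: the `code_area` helper and the `ACCOUNT-RECORD` line are hypothetical, and the slice simply mirrors the description above (columns 1 to 6 and everything after column 72 are dropped).

```python
def code_area(line: str, standard_columns: bool = True) -> str:
    """Return the part of a copybook line that would be parsed (illustrative only)."""
    line = line.rstrip("\r\n")
    if not standard_columns:
        # Full line: nothing is trimmed.
        return line
    # Columns are 1-based: drop columns 1-6, keep 7-72, drop 73 onward.
    return line[6:72]


# Hypothetical 80-column line: sequence number, code, identification area.
numbered = "000100" + " 01  ACCOUNT-RECORD.".ljust(66) + "ACCT0001"

print(repr(code_area(numbered).strip()))          # '01  ACCOUNT-RECORD.'
print(repr(code_area("01  ACCOUNT-RECORD.", standard_columns=False)))
```

If a file without line numbers is read with the standard-columns setting, the first six characters of each definition are lost, which is why the setting must match the file.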

### Binary format

Use these options to describe the binary format of the source data files.

#### Source architecture

Select the machine architecture of the binary data source files:

* **Big endian (mainframe)**: Most significant byte first.
* **Little endian**: Least significant byte first.
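
As a rough illustration of why this setting matters, the following Python sketch interprets the same four bytes under both byte orders. The byte values are made up for the example and are not taken from the sample files.

```python
# Four bytes as they might appear in a binary (COMP) field. Hypothetical data.
raw = bytes([0x00, 0x00, 0x01, 0x2C])

# Big endian: the most significant byte comes first.
big = int.from_bytes(raw, byteorder="big")

# Little endian: the least significant byte comes first.
little = int.from_bytes(raw, byteorder="little")

print(big)          # 300
print(hex(little))  # 0x2c010000
```

Choosing the wrong architecture does not raise an error for fields like this one. It silently produces a different number.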

#### Source charset name

Select the character encoding of the binary data file. Mainframe data is typically EBCDIC, encoded with the `IBM037` or `cp1047` character set.

For more information about encodings and aliases, see [Supported Encodings](https://docs.oracle.com/javase/8/docs/technotes/guides/intl/encoding.doc.html).
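
The effect of the charset choice can be sketched in Python, whose `cp037` codec corresponds to the `IBM037` character set. The sample bytes are hypothetical. PDI itself uses the Java charset names described in the link above.

```python
# Five EBCDIC bytes from a hypothetical PIC X(5) field.
ebcdic = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

# Decoding with the matching EBCDIC code page yields readable text.
text = ebcdic.decode("cp037")
print(text)  # HELLO

# Decoding the same bytes as Latin-1 succeeds but yields unrelated characters.
wrong = ebcdic.decode("latin-1")
print(wrong)
```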

#### Packed decimal (COMP-3) convention

Select how COMP-3 packed decimals are parsed when the [Copybook Input](/pdia-data-integration/pdi-transformation-steps-reference-overview/copybook-input-pdi-step.md) step runs:

* **Strict** (default): Validates according to IBM S370FPD. All nibbles (half-bytes), except the sign nibble, must be digits (`0–9`).
  * Signed packed decimals: sign nibble must be `C` (positive) or `D` (negative).
  * Unsigned packed decimals: sign nibble must be `F`.
* **Lenient**: Validates that all nibbles are digits and that the sign nibble is a hex value `A–F`. The value is treated as negative only when the sign nibble is `D`.
* **Lenient - unchecked**: No validation. The sign nibble may be any hex value `0–F`. The value is treated as negative only when the sign nibble is `D`.
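
The difference between the **Strict** and **Lenient** conventions can be sketched as follows. This is an illustrative Python reading of the rules above, not the step's implementation: the `decode_comp3` function and its parameters are hypothetical, and the implied decimal scale from the `PIC` clause is ignored. **Lenient - unchecked** is left out because this page does not say how non-digit nibbles are interpreted.

```python
def decode_comp3(data: bytes, signed: bool = True, mode: str = "strict") -> int:
    """Decode a COMP-3 packed decimal to an integer (illustrative sketch)."""
    if not data:
        raise ValueError("empty packed decimal")

    # Split every byte into two nibbles (half-bytes), high nibble first.
    nibbles = []
    for b in data:
        nibbles.append(b >> 4)
        nibbles.append(b & 0x0F)

    # The last nibble is the sign. All others must be decimal digits.
    *digits, sign = nibbles
    if any(d > 9 for d in digits):
        raise ValueError("non-digit nibble in packed decimal")

    if mode == "strict":
        # Signed: C (positive) or D (negative). Unsigned: F only.
        allowed = {0xC, 0xD} if signed else {0xF}
    elif mode == "lenient":
        # Any sign nibble from A to F is accepted.
        allowed = set(range(0xA, 0x10))
    else:
        raise ValueError(f"unsupported mode: {mode}")

    if sign not in allowed:
        raise ValueError(f"invalid sign nibble: {sign:X}")

    value = 0
    for d in digits:
        value = value * 10 + d

    # Only a D sign nibble marks the value as negative.
    return -value if sign == 0xD else value


print(decode_comp3(bytes([0x12, 0x34, 0x5C])))                  # 12345
print(decode_comp3(bytes([0x12, 0x34, 0x5D])))                  # -12345
print(decode_comp3(bytes([0x12, 0x34, 0x5F]), mode="lenient"))  # 12345
```

In this sketch, the last input (sign nibble `F` on a signed field) is rejected in strict mode but accepted in lenient mode, which reflects the trade-off between catching corrupt data and tolerating nonstandard sign nibbles.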

### Output

* **Extract parent groups?**: Select to include metadata for parent groups in the copybook definition. Clear to exclude parent group metadata.

### Example

This example uses the `accounts.cbl` sample copybook definition file, located at `design-tools/data-integration/samples/transformations/copybook/redefines_example/accounts.cbl`.

![Sample copybook definition file](/files/EBP9FegAMgGrvoTfCv2c)

The **Standard columns (6 to 72)** option is selected to match the file format. The **Extract parent groups?** option is selected to include group information.

The following image shows how the data appears in the PDI stream after running the transformation.

![Step output to PDI stream](/files/k2tEvRSsDB0yyYAmVhPV)

The `field_kettle_type` column shows the PDI data type that is generated for each field in the stream.

### Metadata injection support

All fields of this step support metadata injection. You can use this step with [ETL metadata injection](/pdia-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection.md) to pass metadata to your transformation at runtime.

