Read metadata from Copybook

The Read metadata from Copybook step reads a binary fixed-length copybook definition file and outputs file and column descriptor information as fields in the PDI stream.

You can use the output rows with ETL metadata injection to populate the Copybook Input step. You can also use this step to create a metadata template for multiple data files or to create a data model for a relational database.

For more information, see Copybook steps in PDI.

This step is required to use metadata injection with the Copybook Input step.

Read metadata from Copybook step

General

  • Step name: Specify the unique name of the step on the canvas. You can customize the name or leave it as the default.

Schema

These options define the location of the copybook definition file and include mapping options for the binary data files.

COBOL copybook file path

File path to the copybook definition file. You can enter any VFS or SFTP file path, or select Browse to open the system file browser. After selecting a file, select Validate to verify that the definition file can be accessed and parsed.

COBOL copybook line structure

Line structure of the definition file:

  • Standard columns (6 to 72): Use when the file contains line numbers. The first 6 columns and anything beyond column 72 are ignored.

  • Full line: Use when the file does not contain line numbers.
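
The difference between the two line structures can be sketched in Python. This is only an illustration of the column rule described above, not the step's actual parser; the function names and the sample line are hypothetical.

```python
def strip_standard_columns(line: str) -> str:
    """Standard columns: drop the first 6 columns and anything beyond column 72."""
    # Slice indexes are 0-based, so [6:72] keeps columns 7 to 72 (1-based).
    return line.rstrip("\r\n")[6:72]


def keep_full_line(line: str) -> str:
    """Full line: the file has no line numbers, so the whole line is kept."""
    return line.rstrip("\r\n")


# Hypothetical copybook line: 6-digit line number, statement text padded
# to column 72, then trailing text that falls beyond column 72.
sample = "000100" + " 01  ACCOUNT-RECORD.".ljust(66) + "IGNORED1"

print(repr(strip_standard_columns(sample).rstrip()))
print(repr(keep_full_line(sample)))
```

If a file with line numbers is read as Full line, the numbers end up in the statement text, which is why the option has to match the file format.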

Binary format

Use these options to describe the binary format of the source data files.

Source architecture

Select the machine architecture of the binary data source files:

  • Big endian (mainframe): Most significant byte first.

  • Little endian: Least significant byte first.
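
As a small illustration of why this setting matters, the same two bytes decode to different integers depending on byte order. The sketch below uses Python's built-in `int.from_bytes`; the byte values are made up for the example and do not come from any sample file.

```python
# Two raw bytes as they might appear in a binary integer field.
raw = bytes([0x01, 0x02])

# Big endian (mainframe): most significant byte first, so 0x0102.
big = int.from_bytes(raw, byteorder="big", signed=False)

# Little endian: least significant byte first, so 0x0201.
little = int.from_bytes(raw, byteorder="little", signed=False)

print(big, little)
```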

Source charset name

Select the character encoding for the binary data file. Mainframe EBCDIC is typically encoded using the IBM037 or cp1047 character sets.

For more information about encodings and aliases, see Supported Encodings.
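
To show what the charset selection affects, the following Python sketch decodes a few EBCDIC bytes. It uses Python's `cp037` codec, which corresponds to IBM037; the codec names available in Python are an assumption of this example and may differ from the names offered in the step.

```python
# The word "HELLO" encoded in EBCDIC (IBM037).
ebcdic_bytes = bytes([0xC8, 0xC5, 0xD3, 0xD3, 0xD6])

# Decoding with the matching EBCDIC charset recovers the text.
text = ebcdic_bytes.decode("cp037")

# Decoding the same bytes with an ASCII-based charset gives unrelated characters.
wrong = ebcdic_bytes.decode("latin-1")

print(text)
print(repr(wrong))
```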

Packed decimal (COMP-3) convention

Select how the Copybook Input step parses COMP-3 packed decimals at runtime:

  • Strict (default): Validates according to IBM S370FPD. All nibbles (half-bytes), except the sign nibble, must be digits (0–9).

    • Signed packed decimals: sign nibble must be C (positive) or D (negative).

    • Unsigned packed decimals: sign nibble must be F.

  • Lenient: Validates that all nibbles are digits and the sign nibble is a hex value A–F. The value is treated as negative only when the sign nibble is D.

  • Lenient - unchecked: No validation. The sign nibble may be any hex value 0–F. The value is treated as negative only when the sign nibble is D.
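
The three conventions can be sketched as a small Python decoder. This is a minimal illustration of the rules listed above, not the Copybook Input step's implementation: the function name, its parameters, and the mode strings are hypothetical. The way non-digit nibbles contribute to the value in unchecked mode is also an assumption, since the description above only says that no validation is performed.

```python
from decimal import Decimal


def unpack_comp3(data: bytes, scale: int = 0, signed: bool = True,
                 mode: str = "strict") -> Decimal:
    """Decode a packed decimal; mode is 'strict', 'lenient', or 'unchecked'."""
    if mode not in ("strict", "lenient", "unchecked"):
        raise ValueError(f"unknown mode: {mode}")
    if not data:
        raise ValueError("empty packed decimal")

    # Split every byte into its high and low nibble (half-byte).
    nibbles = []
    for byte in data:
        nibbles.append(byte >> 4)
        nibbles.append(byte & 0x0F)
    *digits, sign = nibbles  # the last nibble carries the sign

    # Strict and lenient both require digit nibbles to be 0-9.
    if mode != "unchecked" and any(d > 9 for d in digits):
        raise ValueError("non-digit nibble in packed decimal")

    if mode == "strict":
        # Signed: C (positive) or D (negative). Unsigned: F.
        allowed = (0xC, 0xD) if signed else (0xF,)
        if sign not in allowed:
            raise ValueError(f"invalid sign nibble: {sign:X}")
    elif mode == "lenient":
        if not 0xA <= sign <= 0xF:
            raise ValueError(f"invalid sign nibble: {sign:X}")

    # Assumption: nibbles are accumulated as base-10 digits even when
    # unchecked mode lets a value above 9 through.
    value = 0
    for d in digits:
        value = value * 10 + d

    # In every mode, only D marks a negative number.
    if sign == 0xD:
        value = -value
    return Decimal(value).scaleb(-scale)


print(unpack_comp3(bytes([0x12, 0x34, 0x5C])))
print(unpack_comp3(bytes([0x12, 0x34, 0x5D]), scale=2))
print(unpack_comp3(bytes([0x12, 0x34, 0x5F]), mode="lenient"))
```

In this sketch, a signed field whose sign nibble is F is rejected in strict mode but accepted as a positive number in lenient mode, which is the practical difference between the two settings.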

Output

  • Extract parent groups?: Select to include metadata for parent groups in the copybook definition. Clear to exclude parent group metadata.

Example

This example uses the accounts.cbl sample copybook definition file, which is available at design-tools/data-integration/samples/transformations/copybook/redefines_example/accounts.cbl.

Sample copybook definition file

The Standard columns (6 to 72) option was selected to match the file format. The Extract parent groups option was selected to include group information.

The following image shows how the data appears in the PDI stream after running the transformation.

Step output to PDI stream

The field_kettle_type column shows the PDI data type that is generated for each field in the stream.

Metadata injection support

All fields of this step support metadata injection. You can use this step with ETL metadata injection to pass metadata to your transformation at runtime.
