Input tab

Figure: Input tab of the PDI Discover metadata from a text file step

Use the following options in the Input tab to specify details for the input text file:

Option

Description

File name

Select the delimited file you want to evaluate. The file can reside in any location supported by a VFS connection. See Connecting to Virtual File Systems.

Trim fields

Select this option to identify the field as an integer. Clear the option to indicate that the field is a string.

Header column name detection strategy

Select the strategy you want to use to determine the column names in the file. Once a header row has been determined, any rows above it are ignored and the rows that follow are counted as data. The following strategies are available:

  • First possible line containing only strings

Selects the first line that contains only string values as the header row. For example, if the data has 5 fields per row, and you set the Maximum number of header rows field to 6, the step searches the first 6 rows in the file for a row containing 5 string fields. The first row found with 5 string fields is selected as the field name header. Any rows after the selected header row are treated as data, even if they fall within the 6 rows specified.

  • First possible line containing any data type

Selects the first line that contains a consistent number of fields as the header row. For example, if the file rows contain 5 fields, the first line containing 5 fields, within the number of rows set in the Maximum number of header rows field, is selected as the header row, regardless of the data types in the rows.

  • Last possible line containing only strings

Selects the last line that contains only string values as the header row. For example, if the data has 5 fields per row, and you set the Maximum number of header rows field to 6, the step searches the first 6 rows in the file for a row containing 5 string fields. The last row found with 5 string fields is selected as the field name header.

  • Last possible line containing any data type

Selects the last line that contains a consistent number of fields as the header row. For example, if the file rows contain 5 fields, the last line containing 5 fields, within the number of rows set in the Maximum number of header rows field, is selected as the header row, regardless of the data types in the rows.
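The four strategies differ only in two choices: whether the first or the last candidate row is taken, and whether a candidate must contain only strings. The following Python sketch illustrates that selection logic. It is not the step's implementation; the names `find_header` and `is_string`, and the rule that a value counts as a string when it is non-empty and does not parse as a number, are assumptions made for illustration.

```python
from typing import List, Optional


def is_string(value: str) -> bool:
    """Assumed rule: a value is a string if it is non-blank and not numeric."""
    try:
        float(value)
        return False
    except ValueError:
        return value.strip() != ""


def find_header(
    rows: List[List[str]],
    field_count: int,
    max_header_rows: int,
    last: bool = False,
    strings_only: bool = True,
) -> Optional[int]:
    """Return the index of the selected header row, or None if none qualifies."""
    candidates = []
    # Only the first max_header_rows rows are searched.
    for index, row in enumerate(rows[:max_header_rows]):
        if len(row) != field_count:
            continue  # wrong number of fields, cannot be the header
        if strings_only and not all(is_string(v) for v in row):
            continue  # "only strings" strategies reject numeric values
        candidates.append(index)
    if not candidates:
        return None
    return candidates[-1] if last else candidates[0]


rows = [
    ["Sales report"],                # title line, wrong field count
    ["region", "product", "units"],  # all strings
    ["id", "name", "qty"],           # all strings
    ["1", "widget", "10"],           # mixed data types
    ["2", "gadget", "7"],
]

first_strings = find_header(rows, 3, 4)
last_strings = find_header(rows, 3, 4, last=True)
first_any = find_header(rows, 3, 4, strings_only=False)
last_any = find_header(rows, 3, 4, last=True, strings_only=False)
```

With a maximum of 4 header rows and 3 fields per row, the two "only strings" strategies pick rows 1 and 2 (zero-based), while "last possible line containing any data type" picks row 3, because that row also has 3 fields.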

Maximum number of header rows

Enter the maximum number of rows at the top of the file to search for a header. If the file does not have a header row, set this to 0. Only one row is selected as the header.

Maximum number of footer rows

Enter the maximum number of rows that can be a footer. If the file does not have a footer row, set this to 0. Note: The number of footer rows can only be determined if the entire file is scanned.

Fallback charset

Select the character set of the file. If the step is unable to determine a character set for the file, it defaults to the ISO-8859-1 character set. If an x86 platform is used, the file uses an ASCII character set.
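The fallback behavior can be pictured as trying one or more candidate encodings and then decoding with ISO-8859-1 if none of them fits. The sketch below is an illustration only: the source does not describe how the step detects a character set, so the candidate list and the function name `decode_with_fallback` are assumptions. ISO-8859-1 works as a last resort because every byte value maps to a character in that encoding.

```python
from typing import Sequence, Tuple


def decode_with_fallback(
    raw: bytes,
    candidates: Sequence[str] = ("utf-8",),
    fallback: str = "iso-8859-1",
) -> Tuple[str, str]:
    """Return (decoded text, encoding used); the candidate list is an assumption."""
    for encoding in candidates:
        try:
            return raw.decode(encoding), encoding
        except UnicodeDecodeError:
            continue  # try the next candidate
    # ISO-8859-1 maps every byte to a character, so this decode cannot fail.
    return raw.decode(fallback), fallback


utf8_result = decode_with_fallback("café".encode("utf-8"))
latin1_result = decode_with_fallback(b"caf\xe9")
```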

Limit scanned rows

Enter the number of rows to scan in the file before determining the valid set of delimiters and enclosures used in the file. To scan the entire file, enter 0 (zero).
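As a rough illustration of limiting the scan, the sketch below reads at most a given number of lines and then guesses the delimiter and enclosure from that sample, with 0 meaning the whole input is read. It uses Python's standard `csv.Sniffer`, which is a stand-in and not the detection method used by the step; the function name `sniff_dialect` and the set of candidate delimiters are assumptions.

```python
import csv
import io
from itertools import islice
from typing import Iterable, Tuple


def sniff_dialect(
    lines: Iterable[str],
    limit_rows: int = 0,
    delimiters: str = ",;\t|",
) -> Tuple[str, str]:
    """Guess (delimiter, enclosure) from at most limit_rows lines; 0 scans all."""
    scanned = lines if limit_rows == 0 else islice(lines, limit_rows)
    sample = "".join(scanned)
    dialect = csv.Sniffer().sniff(sample, delimiters=delimiters)
    return dialect.delimiter, dialect.quotechar


sample = 'id;name;qty\n1;"widget";10\n2;"gadget";7\n'

whole_file = sniff_dialect(io.StringIO(sample))           # limit 0: scan everything
first_two = sniff_dialect(io.StringIO(sample), limit_rows=2)
```

A smaller limit is faster on large files, but the guess is only as good as the rows that were scanned.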
