Content tab

In the Content tab, you can specify the format of the text files that are being read.

Option
Description

Filetype

Select either CSV or Fixed length. Based on this selection, the PDI client launches a different helper GUI when you click Get Fields in the Fields tab.

Separator

One or more characters that separate the fields in a single line of text. Typically, this is a semicolon ( ; ) or tab.

Enclosure

Some fields can be enclosed by a pair of strings to allow separator characters in fields. The enclosure string is optional.

Allow breaks in enclosed fields

Not implemented.

Escape

Specify an escape character (or characters) if you have these types of characters in your data. If you have a backslash ( \ ) as an escape character, the text 'Not the nine o\'clock news' (with a single quote ( ' ) as the enclosure) is parsed as Not the nine o'clock news.
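The interaction of separator, enclosure, and escape can be illustrated outside of PDI. The following is a minimal sketch that uses Python's standard csv module, not the PDI parser, so edge-case behavior may differ. The sample line is hypothetical.

```python
import csv
import io

# Hypothetical sample line: semicolon separator, single-quote enclosure,
# backslash escape. This imitates the settings described above; it is not
# the parser that PDI uses.
raw = "1;'Not the nine o\\'clock news';'a;b'\n"

reader = csv.reader(
    io.StringIO(raw),
    delimiter=";",      # Separator
    quotechar="'",      # Enclosure
    escapechar="\\",    # Escape
    doublequote=False,  # a doubled enclosure is not treated as an escape
)
fields = next(reader)
print(fields)
```

The escaped single quote stays inside the second field, and the semicolon inside the third field is not treated as a separator because the field is enclosed.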

Header & Number of header lines

Select if your text file has one or more header rows (the first lines in the file). You can specify the number of header lines.

Footer & Number of footer lines

Select if your text file has one or more footer rows (the last lines in the file). You can specify the number of footer lines.
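As a rough illustration of what the header and footer counts mean, the sketch below drops a given number of lines from the start and end of a file's lines. The function name and the sample data are hypothetical, and the sketch does not reproduce how PDI reads files.

```python
def data_lines(lines, header_lines=1, footer_lines=1):
    """Return only the data lines, skipping header and footer lines.

    Hypothetical helper for illustration; not part of PDI.
    """
    end = len(lines) - footer_lines
    return lines[header_lines:end]


sample = [
    "id;name",         # header line
    "1;Ann",
    "2;Bob",
    "total rows: 2",   # footer line
]
rows = data_lines(sample, header_lines=1, footer_lines=1)
print(rows)
```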

Wrapped lines & Number of times wrapped

Select if you work with data lines that have wrapped beyond a specific page limit. Headers and footers are never considered wrapped.

Paged layout (printout), Number of lines per page, & Document header lines

Use these options as a last resort when working with texts meant for printing on a line printer. Use the number of document header lines to skip introductory texts and the number of lines per page to position the data lines.

Compression

Use this field if your text file is in a ZIP or GZIP archive. Only the first file in the archive is read.
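The "first file only" behavior can be pictured with the sketch below, which builds small archives in memory and reads them with Python's standard zipfile and gzip modules. The helper names and file names are hypothetical, and this is not how PDI itself is implemented.

```python
import gzip
import io
import zipfile


def read_first_zip_member(data: bytes) -> str:
    """Return the text of the first entry in a ZIP archive (illustrative)."""
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        first = archive.namelist()[0]  # entries are listed in archive order
        return archive.read(first).decode("utf-8")


def read_gzip(data: bytes) -> str:
    """Return the text of a GZIP-compressed file (illustrative)."""
    return gzip.decompress(data).decode("utf-8")


# Build a hypothetical ZIP archive with two entries.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("first.csv", "id;name\n1;Ann\n")
    archive.writestr("second.csv", "id;name\n2;Bob\n")

zip_text = read_first_zip_member(buffer.getvalue())
gzip_text = read_gzip(gzip.compress(b"id;name\n3;Cy\n"))
print(zip_text)
print(gzip_text)
```

In this sketch the contents of second.csv are never read, which mirrors the limitation described above.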

No empty rows

Select if you do not want to send empty rows to the next steps.

Include filename in output?

Select if you want the file name to be part of the output.

Filename fieldname

Enter the name of the field that contains the file name.

Rownum in output?

Select if you want the row number to be part of the output.

Rownum fieldname & Rownum by file?

Enter the name of the field that contains the row number. Select Rownum by file? if you want the row numbering to restart for each file.

Format

Can be either DOS, UNIX, or mixed. UNIX files have lines that are terminated by line feeds. DOS files have lines separated by carriage returns and line feeds. If you specify mixed, no verification is done.
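If you are unsure which format a file uses, you can inspect its line terminators before configuring the step. The following is a minimal sketch with a hypothetical helper name. It only counts terminators in raw bytes and is not part of PDI.

```python
def detect_line_format(data: bytes) -> str:
    """Classify line terminators as 'DOS', 'UNIX', or 'mixed' (illustrative)."""
    crlf = data.count(b"\r\n")            # carriage return + line feed
    lf_only = data.count(b"\n") - crlf    # line feeds without a carriage return
    if crlf and lf_only:
        return "mixed"
    if crlf:
        return "DOS"
    # Data without any line terminator also ends up here.
    return "UNIX"


print(detect_line_format(b"id;name\r\n1;Ann\r\n"))
print(detect_line_format(b"id;name\n1;Ann\n"))
print(detect_line_format(b"id;name\r\n1;Ann\n"))
```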

Encoding & Limit

Specify the text file encoding to use. Leave blank to use the default encoding on your system. To use Unicode, specify UTF-8 or UTF-16. On first use, the PDI client searches your system for available encodings.
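The effect of choosing the wrong encoding can be seen in the sketch below, which encodes a hypothetical line of text and decodes it again in plain Python. It illustrates the general problem only and does not involve PDI.

```python
# Hypothetical line of text that contains a non-ASCII character.
text = "Février;2016\n"

utf8_bytes = text.encode("utf-8")
utf16_bytes = text.encode("utf-16")

# Decoding with the encoding that was used for writing restores the text.
decoded_ok = utf8_bytes.decode("utf-8")

# Decoding UTF-8 bytes as Latin-1 does not fail, but garbles the accented letter.
decoded_wrong = utf8_bytes.decode("latin-1")

print(decoded_ok)
print(decoded_wrong)
```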

Be lenient when parsing dates?

Clear this check box if you want strict parsing of date fields. If selected, dates like Jan 32nd become Feb 1st.
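The difference between lenient and strict parsing can be sketched as follows. The function names are hypothetical, and the code only imitates the rollover behavior described above. It is not PDI's date parser.

```python
from datetime import date, timedelta


def lenient_date(year: int, month: int, day: int) -> date:
    """Roll out-of-range months and days over into the next period."""
    year += (month - 1) // 12
    month = (month - 1) % 12 + 1
    # Start at the first of the month and add the remaining days.
    return date(year, month, 1) + timedelta(days=day - 1)


def strict_date(year: int, month: int, day: int) -> date:
    """Reject out-of-range values; date() raises ValueError for them."""
    return date(year, month, day)


rolled = lenient_date(2016, 1, 32)
print(rolled.isoformat())

try:
    strict_date(2016, 1, 32)
    strict_failed = False
except ValueError:
    strict_failed = True
print(strict_failed)
```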

The date format Locale

This locale is used to parse dates that have been written in full such as February 2nd, 2016. Parsing this date on a system running in the French (fr_FR) locale would not work because February is called Février in that locale.

Add filenames to result

Adds filenames to generate a filenames list.
