Examples
Your Pentaho distribution includes several sample transformations and datasets in the design-tools/data-integration/samples/transformations/discover-metadata-from-textfile directory.
The following lines are an excerpt from the Sample1.txt file found in that directory:
policyID,county,eq_site_limit,eq_site_deductible,point_longitude
710400,CLAY COUNTY,0,0,-81.71624
703001,CLAY COUNTY,0,0,-81.706865
352792,CLAY COUNTY,0,0,-81.718452
717603,CLAY COUNTY,0,0,-81.718452
937659,SUWANNEE COUNTY,0,0,-82.926659
294022,SUWANNEE COUNTY,0,0,-82.926659
410500,SUWANNEE COUNTY,0,0,-82.926659
524433,SUWANNEE COUNTY,218475,0,-82.926155
972562,SUWANNEE COUNTY,0,0,-82.933777
When the step runs, it scans the file for a delimiter that yields a consistent number of fields, trying the tab character first, then the semicolon, then the comma (the default). When the Header column name detection strategy is set to First possible line containing only strings, the step identifies the first row as the header row. The following table shows the detected column names and data types.
Column name           Data type
policyID              Integer
county                String
eq_site_limit         BigNumber
eq_site_deductible    Integer
point_longitude       BigNumber
If any field in the first row were a number or a date, the step would treat that row as data and conclude that the file has no header row. In this example, every field in the first row is a string, so the row is used as the header.
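The delimiter scan and the header check described above can be approximated outside Pentaho with a short Python sketch. This is not the step's actual implementation. The function names, the naive line splitting (which ignores quoted fields), and the single ISO date format are all assumptions made for illustration.

```python
from datetime import datetime

# Candidate delimiters in the order the text describes: tab, semicolon, comma.
CANDIDATE_DELIMITERS = ["\t", ";", ","]


def detect_delimiter(lines):
    """Return the first candidate that splits every line into the same
    number of fields (more than one). Hypothetical helper, not a Pentaho API."""
    for delim in CANDIDATE_DELIMITERS:
        counts = {len(line.split(delim)) for line in lines}
        if len(counts) == 1 and counts.pop() > 1:
            return delim
    return ","  # fall back to the comma, the default named in the text


def looks_like_number_or_date(value):
    """Rough check: True if the value parses as a float or as a YYYY-MM-DD date.
    The date format is an assumption; the real step may recognize others."""
    try:
        float(value)
        return True
    except ValueError:
        pass
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def first_row_is_header(lines, delim):
    """Mimic the 'first possible line containing only strings' idea:
    the first row is a header only if none of its fields is a number or date."""
    fields = [field.strip() for field in lines[0].split(delim)]
    return not any(looks_like_number_or_date(field) for field in fields)


sample = [
    "policyID,county,eq_site_limit,eq_site_deductible,point_longitude",
    "710400,CLAY COUNTY,0,0,-81.71624",
    "937659,SUWANNEE COUNTY,0,0,-82.926659",
]

delim = detect_delimiter(sample)
print(repr(delim))                             # ','
print(first_row_is_header(sample, delim))      # True
print(first_row_is_header(sample[1:], delim))  # False
```

Dropping the first line of the sample makes the check return False, because 710400 parses as a number. That matches the rule that a first row containing numbers or dates is treated as data.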