Calculator

The Calculator step provides predefined functions that you can run on input field values. Use Calculator as a quick alternative to custom JavaScript for common calculations.

To use Calculator, specify the input fields and the calculation type, and then write results to new fields. You can also remove temporary fields from the output after all values are calculated.

Step name

Step name: Specify the unique name of the Calculator step on the canvas. You can customize the name or leave it as the default.

Options

Fill in the following fields in the table. The values in Field A, Field B, and Field C are used by the selected calculator function in the Calculation column.

Column

Description

New field

Specify the name of the field to create.

Calculation

When you click or press in this field, the Select the calculation type dialog box appears. Select the calculator function to use in the transformation. For details, see Calculator functions list. Use the Filter field to search for a specific function.

Field A, Field B, and Field C

Enter the field(s) that the calculator function uses.

Value type

Select the field’s data type from the list, or enter it manually.

Length

Specify the length of the field, based on the Value type (for example: Number = total number of significant figures; String = total string length; Date = length of printed output, such as yyyy for a year).

Precision

Specify the number of floating-point digits for Number fields.

Remove

Remove this field from the output after all values are calculated. This option is useful for removing temporary values. Select Y or N.

Conversion mask

Specify a format for the field, such as a date format. For common date formats you can use in this step, see Common Formats.

Decimal symbol

Specify the decimal symbol to use (dot . or comma ,). For example: 5,000.00 or 5.000,00.

Grouping symbol

Specify the thousands separator to use (dot . or comma ,). For example: 5,000 or 5.000.

Currency symbol

Specify the currency symbol to use, such as $ or €.

The Calculator step also provides the following option:

Throw an error on non existing files: Select this option to fail the transformation when a calculation references a file that does not exist (for example, checksum functions).

Calculator functions list

Use the following list to learn about each calculator function available in the Calculator step.

Calculator functions list

Function

Description

Required fields

Set field to constant value A

Creates a field with a constant value.

Create a copy of field A

Creates a copy of a field with the given field value.

A + B

A plus B.

A and B

A - B

A minus B.

A and B

A * B

A multiplied by B.

A and B

A / B

A divided by B.

A and B

A * A

The square of A.

SQRT( A )

The square root of A.

100 * A / B

The percentage of A in B.

A and B

A - ( A * B / 100 )

Subtracts B% of A.

A and B

A + ( A * B / 100 )

Adds B% to A.

A and B

A + B *C

Adds A and B times C.

A, B, and C

SQRT ( A*A + B*B )

Calculates (A2 + B2).

A and B

ROUND ( A )

Returns the closest integer to the argument. The result is rounded to an integer by adding 1/2, taking the floor of the result, and casting the result to type int: floor(a + 0.5). If you need the rounding method “Round half to even”, use ROUND( A, B ) with no decimals (B=0).

ROUND ( A, B )

Rounds A to the nearest positive infinity number (“Round half to ceiling”). For the pre-6.0 rounding behavior, see Rounding method for the Round (A, B) function.

A and B

STDROUND( A )

Rounds A to the nearest integer using “Round half away from zero” (standard/common rounding).

STDROUND( A, B )

Uses the same rounding method as STDROUND(A), but with B decimals.

A and B

CEIL( A )

Maps a number to the smallest following integer.

FLOOR( A )

Maps a number to the largest previous integer.

NVL( A, B )

If A is not null, returns A; otherwise returns B. (Sometimes the value is an empty string instead of null.)

A and B

Date A + B Days

Adds B days to Date field A. Only integer values for B are supported. For non-integer calculations, add a second calculation with hours.

A and B

Year of date A

Calculates the year of date A.

Month of date A

Calculates the month of date A.

Day of year of date A

Calculates the day of year (1–365).

Day of month of date A

Calculates the day of month (1–31).

Day of week of date A

Calculates the day of the week (1–7).

Week of year of date A

Calculates the week of year (1–54).

ISO8601 Week of year of date A

Calculates the week of the year in ISO 8601 style.

ISO8601 Year of date A

Calculates the year in ISO 8601 style.

Byte to hex encode of string A

Encodes bytes in a string to a hexadecimal representation.

Hex to byte decode of string A

Decodes bytes in a string from its hexadecimal representation (add a leading 0 when A is of odd length).

Char to hex encode of string A

Encodes characters in a string to a hexadecimal representation.

Hex to char decode of string A

Decodes a string from its hexadecimal representation (add a leading 0 when A is of odd length).

Checksum of a file A using CRC-32

Calculates the checksum of a file using CRC-32.

Checksum of a file A using Adler-32

Calculates the checksum of a file using Adler-32.

Checksum of a file A using MD5

Calculates the checksum of a file using MD5.

Checksum of a file A using SHA-1

Calculates the checksum of a file using SHA-1.

Levenshtein Distance (source A and target B)

Calculates the Levenshtein Distance.

A and B

Metaphone of A (phonetics)

Calculates the Metaphone of A.

Double metaphone of A (phonetics)

Calculates the Double Metaphone of A.

Absolute value ABS ( A )

Calculates the absolute value of A.

Remove time from a date A

Removes the time value of A. If your region changes Daylight Saving Time at midnight, there can be one date per year where this function fails with IllegalArgumentException: HOUR_OF_DAY: 0 -> 1.

Date A - Date B (in days)

Calculates the difference, in days, between A and B date fields.

A and B

A + B + C

A plus B plus C.

A, B, and C

First letter of each word of a string A in capital

Capitalizes the first letter of each word in a string.

UpperCase of a string A

Converts a string to uppercase.

LowerCase of a string A

Converts a string to lowercase.

Mask XML content from string A

Escapes XML content; replaces characters with & values.

Protect (CDATA) XML content from string A

Wraps the string in a CDATA section.

Remove CR from a string A

Removes carriage returns from a string.

Remove LF from a string A

Removes line feeds from a string.

Remove CRLF from a string A

Removes carriage returns and line feeds from a string.

Remove TAB from a string A

Removes tab characters from a string.

Return only digits from string A

Returns only digits (0–9) from a string.

Remove digits from string A

Removes all digits (0–9) from a string.

Return the length of a string A

Returns the length of a string.

Load file content in binary

Loads the content of the file in field A to a binary data type (for example, images).

Add time B to date A

Adds time B to date A and returns a date/time value.

A and B

Quarter of date A

Returns the quarter (1–4) of the date.

variable substitution in string A

Substitutes variables within a string.

Unescape XML content

Unescapes XML content from a string.

Escape HTML content

Escapes HTML within a string.

Unescape HTML content

Unescapes HTML within a string.

Escape SQL content

Escapes characters in a string so it can be used in an SQL query.

Date A - Date B (working days)

Calculates the difference between Date field A and Date field B (working days only, Monday–Friday).

A and B

Date A + B Months

Adds B months to Date field A. Only integer values for B are supported. For non-integer calculations, add a second calculation with days.

A and B

Check if an XML file A is well formed

Validates an XML file.

Check if an XML string A is well formed

Validates an XML string.

Get encoding of file A

Guesses the best encoding (UTF-8) for the given file.

DamerauLevenshtein distance between String A and String B

Calculates the Damerau-Levenshtein distance between strings.

A and B

NeedlemanWunsch distance between String A and String B

Calculates the Needleman-Wunsch distance between strings.

A and B

Jaro similitude between String A and String B

Calculates the Jaro similarity.

A and B

JaroWinkler similitude between String A and String B

Calculates the Jaro-Winkler distance.

A and B

SoundEx of String A

Encodes a string into a Soundex value.

RefinedSoundEx of String A

Retrieves the Refined Soundex code for a string.

Date A + B Hours

Adds B hours to a date field. Only integer values for B are supported. For non-integer calculations, add a second calculation with minutes.

A and B

Date A + B Minutes

Adds B minutes to a date field. Only integer values for B are supported. For non-integer calculations, add a second calculation with seconds.

A and B

Date A - Date B (milliseconds)

Subtracts B milliseconds from Date field A.

A and B

Date A - Date B (seconds)

Subtracts B seconds from Date field A. Only integer values for B are supported. For non-integer calculations, add a second calculation with milliseconds.

A and B

Date A - Date B (minutes)

Subtracts B minutes from Date field A. Only integer values for B are supported. For non-integer calculations, add a second calculation with seconds.

A and B

Date A - Date B (hours)

Subtracts B hours from Date field A. Only integer values for B are supported. For non-integer calculations, add a second calculation with minutes.

A and B

Hour of Day of Date A

Extracts the hour part of a date.

Minute of Hour of Date A

Extracts the minute part of a date.

Second of Minute of Date A

Extracts the second part of a date.

ROUND_CUSTOM( A , B )

Rounds A using a rounding mode indicated by B. If B is a decimal value, it is converted to its floor value (for example, 5.7 becomes 5). Rounding modes: 0 UP (away from zero); 1 DOWN (toward zero); 2 CEILING (toward positive infinity); 3 FLOOR (toward negative infinity); 4 HALF_UP; 5 HALF_DOWN; 6 HALF_EVEN; 7 UNNECESSARY.

A and B

ROUND_CUSTOM( A , B , C )

Rounds A to B decimal places using rounding mode C. Example: if A is 1667.2342, B is 3, and C is 2 (CEILING), the result is 1667.235.

A, B, and C

Date A + B Seconds

Adds the number of milliseconds offset from the Epoch of A to the same of B, where A and B are dates.

A and B

Remainder of A / B

Returns the remainder of dividing B into A. For example: A=100, B=56 → 44; A=2.5, B=2.3 → 0.2.

A and B

Troubleshooting

The following are frequently asked questions about the Calculator step.

Length and precision

Question

I made a transformation using the A / B function and it rounded incorrectly. I entered integers in Field A and Field B, but my result type was a number, so I would expect the integers to be converted to numbers before executing the division.

For example, when I execute 28/222, the result is 0.0 instead of 0.1261. It seems the result type is ignored. If I change the values in Field A and Field B to numbers (6, 4), my result is 0.12612612612612611, which still ignores the result type (4 places after the decimal point).

Suggested solution

Length and precision are metadata properties. PDI converts to the required metadata type when results are written to a target (such as a file or table), not during the transformation execution.

If you want to round to a specific precision, do the rounding in another step.

Also, rounding floating-point numbers can be misleading because floating-point values are stored as approximations. For example, 0.1261 might be stored as 0.126099999999 or 0.1261000000001.

This behavior does not apply to the BigNumber data type.

For example, when numbers are stored in an output table, the calculation is rounded using BigDecimals. This behavior is also true for the Text File Output step. If you specify Integer as the result type, the internal number format is retained.

See Text File Output for details.

Data types

Question

How do the data types work internally?

Suggested solution

If you multiply an integer and a number, the result is always rounded. The Calculator step uses the data type of the value on the left side of the calculation (the value in Field A) as the driver for the calculation.

If you want more precision, put the number value in Field A. Alternatively, change Field A to the Number data type.

Rounding method for the Round (A, B) function

Starting in Pentaho 6.0, the Round (A, B) function rounds to the nearest positive infinity number (“Round half to ceiling”). Before version 6.0, Round (A, B) used the “Round half to even” method (also called unbiased rounding, bankers’ rounding, or Gaussian rounding).

“Round half to even” is the default rounding mode in IEEE 754 computing functions and operators.

Use the following steps to override the default “Round half to ceiling” method and use the “Round half to even” method.

Stop the Pentaho Server

Stop the Pentaho Server.

Open `kettle.properties`

Open the kettle.properties file in a text editor.

By default, kettle.properties is stored in your home directory or in the .pentaho directory.

Set the rounding properties

Add the following lines:

ROUND_2_MODE=ROUND_HALF_EVEN
ROUND_2_MODE_BACKWARD_COMPATIBILITY_VALUE=ROUND_HALF_EVEN

Save the file

Save the file and close the editor.

Start the Pentaho Server

Start the Pentaho Server.

PreviousAvro Output NextCatalog Input

Last updated 2 months ago

Was this helpful?

hashtagStep name

hashtagOptions

hashtagCalculator functions list

hashtagTroubleshooting

hashtagLength and precision

hashtagData types

hashtagRounding method for the Round (A, B) function

hashtagStop the Pentaho Server

hashtagOpen kettle.properties

hashtagSet the rounding properties

hashtagSave the file

hashtagStart the Pentaho Server

Step name

Options

Calculator functions list

Troubleshooting

Length and precision

Data types

Rounding method for the Round (A, B) function

Stop the Pentaho Server

Open `kettle.properties`

Set the rounding properties

Save the file

Start the Pentaho Server