# Using HBase Row Decoder with Pentaho MapReduce

The HBase Row Decoder step is designed specifically for use in MapReduce transformations to decode the key and value data that is output by the TableInputFormat. The key output is the row key from HBase. The value is an HBase result object containing all the column values for the row.

The following use case shows you how to configure Pentaho MapReduce to use the TableInputFormat for reading data from HBase. It also shows you how to configure a MapReduce transformation to process that data using the HBase Row Decoder step.

**Note:** To process HBase data using incoming key and value data to produce a specified mapping, you will need to configure Hadoop to access HBase.

First, create a Pentaho MapReduce job entry that includes a transformation which uses a MapReduce Input step and an HBase row decoder step, as shown below:

![MapReduce transformation example](/files/IuQRWNKyB2RnTaHvSXIr)

In the transformation, open the MapReduce Input step. Configure the **Key field** and **Value field** to produce a serialized result by selecting **Serializable** in the **Type** field:

![MapReduce input step example](/files/8kvFe18Uh02x91gTIz5r)

Next, open the HBase row decoder step and set the **Key field** to use the `key` and the **HBase result field** to use the `value` produced by the MapReduce Input step.

![HBase row decoder step example](/files/3ASVGQj6AQQJ2CFaC8PN)

Then, define or load a mapping in the **Create/Edit mappings** tab. Note that once defined (or loaded), this mapping is captured in the transformation metadata.

![HBase row decoder step, example 2](/files/OsyuomanAr9jg8MLJpe9)

Next, configure Pentaho MapReduce job entry to ensure that input splits are created using the TableInputFormat. Define the **Input Path** and **Input format** fields in the **Job Setup** tab, as shown below.

![Pentaho MapReduce entry example](/files/SpGOLjPmQhNSGK71vSvU)

Finally, in the **User Defined** tab, assign a **Name** and **Value** for each property shown in the table below to configure the scan performed by the TableInputFormat:

| Name                                   | Value                                                                                                                                                   |
| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `hbase.mapred.inputtable`              | The name of the HBase table to read from. (Required)                                                                                                    |
| `hbase.mapred.tablecolumns`            | The space delimited list of columns in ColFam:ColName format. Note that if you want to read all the columns from a family, omit the ColName. (Required) |
| `hbase.mapreduce.scan.cachedrows`      | (Optional) The number of rows for caching that will be passed to scanners.                                                                              |
| `hbase.mapreduce.scan.timestamp`       | (Optional) Time stamp used to filter columns with a specific time stamp.                                                                                |
| `hbase.mapreduce.scan.timerange.start` | (Optional) Starting time stamp to filter columns within a given starting range.                                                                         |
| `hbase.mapreduce.scan.timerange.end`   | (Optional) End time stamp to filter columns within a given ending range.                                                                                |

When you execute the job, the output is the row key from HBase and the value is an HBase result object containing all the column values for that row.


---

# Agent Instructions: Querying This Documentation

If you need additional information that is not directly available in this page, you can query the documentation dynamically by asking a question.

Perform an HTTP GET request on the current page URL with the `ask` query parameter:

```
GET https://docs.pentaho.com/pdia-data-integration/9.3-data-integration/pdi-transformation-steps-reference-overview/hbase-row-decoder-pdi/using-hbase-row-decoder-with-pentaho-mapreduce.md?ask=<question>
```

The question should be specific, self-contained, and written in natural language.
The response will contain a direct answer to the question and relevant excerpts and sources from the documentation.

Use this mechanism when the answer is not explicitly present in the current page, you need clarification or additional context, or you want to retrieve related documentation sections.
