# Memory Group By

The **Memory Group By** step groups rows from a source step in memory, based on a specified field or collection of fields. It generates one output row for each group.

Unlike the [Group By](/pdia-data-integration/pdi-transformation-steps-reference-overview/group-by-landing-page-article.md) step, which expects sorted input, this step processes all rows in memory and handles unsorted input.

If the number of rows you want to group is too large to fit into memory, use a combination of [Sort rows](/pdia-data-integration/pdi-transformation-steps-reference-overview/sort-rows-transformation-step.md) and Group By.
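
The difference between the two approaches can be illustrated with a minimal Python sketch. It is not the step's implementation: the sample rows and the function names `memory_group_sum` and `sorted_group_sum` are hypothetical, and it only contrasts grouping that keeps its state in memory with grouping that relies on sorted input.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# Hypothetical sample rows, deliberately not sorted by "region".
rows = [
    {"region": "east", "amount": 10},
    {"region": "west", "amount": 5},
    {"region": "east", "amount": 7},
    {"region": "west", "amount": 1},
]


def memory_group_sum(rows, key, field):
    """In-memory grouping: state for every group is held at once,
    so the order of the incoming rows does not matter."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row[field]
    return dict(totals)


def sorted_group_sum(rows, key, field):
    """Streaming grouping: only adjacent rows with the same key are
    combined, so the input must already be sorted by that key."""
    return {
        k: sum(r[field] for r in group)
        for k, group in groupby(rows, key=itemgetter(key))
    }


# In-memory grouping works directly on the unsorted rows.
unsorted_result = memory_group_sum(rows, "region", "amount")

# Streaming grouping needs a sort first (the Sort rows + Group By pattern).
sorted_result = sorted_group_sum(
    sorted(rows, key=itemgetter("region")), "region", "amount"
)
```

In this sketch both calls produce the same totals, but only the second one can process groups one at a time once the rows are sorted.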

### General

| Option                            | Definition                                                                                                                 |
| --------------------------------- | -------------------------------------------------------------------------------------------------------------------------- |
| **Step name**                     | Specify the unique name of the Memory Group By step on the canvas. You can customize the name or leave it as the default.  |
| **Always give back a result row** | Select to return a result row even when there is no input row. If no input rows exist, this option returns a count of `0`. |

![Memory Group By dialog](/files/NJXyph3NYjvt3uI8w9Ro)
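
As a rough illustration of **Always give back a result row**, the following Python sketch counts rows over an entire dataset. The function `count_rows`, its parameter name, and the output field `row_count` are hypothetical and only mirror the behavior described in the table above.

```python
def count_rows(rows, always_give_back_result_row=False):
    """Return the result rows of a whole-dataset row count.

    Assumption: without the option, an empty input produces no output rows.
    """
    rows = list(rows)
    if not rows and not always_give_back_result_row:
        return []  # no input rows, so no result row
    return [{"row_count": len(rows)}]


without_option = count_rows([])
with_option = count_rows([], always_give_back_result_row=True)
```

With the option enabled, downstream steps in this sketch always receive one row, even for empty input.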

### Group fields

Use this table to specify the fields you want to group by.

* Select **Get fields** to add all fields from the incoming stream.
* Leave this table blank to calculate aggregate functions over the entire dataset.
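
The effect of the group fields can be sketched in Python by building a composite key from the listed fields. This is an illustration under assumed names (`group_sum`, the sample rows, and their fields are hypothetical), not the step's code. With no group fields, every row falls under the same empty key, so the aggregate covers the entire dataset.

```python
from collections import defaultdict

# Hypothetical sample rows.
rows = [
    {"region": "east", "year": 2023, "amount": 10},
    {"region": "west", "year": 2023, "amount": 5},
    {"region": "east", "year": 2024, "amount": 7},
]


def group_sum(rows, group_fields, subject):
    """Sum `subject` for each distinct combination of `group_fields`."""
    totals = defaultdict(int)
    for row in rows:
        # The key is an empty tuple when no group fields are given.
        key = tuple(row[field] for field in group_fields)
        totals[key] += row[subject]
    return dict(totals)


by_region = group_sum(rows, ["region"], "amount")
by_region_and_year = group_sum(rows, ["region", "year"], "amount")
whole_dataset = group_sum(rows, [], "amount")
```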

### Aggregates

Use this table to specify the aggregation method and the name of the resulting field.

| Column      | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Name**    | The name of the aggregate field.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                             |
| **Subject** | The input field to which the aggregation method is applied. |
| **Type**    | <p>The aggregation method. Available methods include:</p><ul><li>Sum</li><li>Average (Mean)</li><li>Median</li><li>Percentile</li><li>Minimum</li><li>Maximum</li><li>Number of values (N)</li><li>Concatenate strings separated by , (comma)</li><li>First non-null value</li><li>Last non-null value</li><li>First value (including null)</li><li>Last value (including null)</li><li>Standard deviation</li><li>Concatenate strings separated by the character specified in the <strong>Value</strong> column</li><li>Number of distinct values</li><li>Number of rows (without field argument)</li></ul> |
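| **Value**   | An additional value that some aggregation methods require, such as the separator character for the concatenate-strings method. |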
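
To show how the four columns relate, here is a minimal Python sketch that applies a list of aggregate definitions to the rows of one group. The type keys (`sum`, `average`, `concat`, and so on), the `aggregate` function, and the sample data are hypothetical. Only a subset of the listed methods is covered, and skipping null values is an assumption of this sketch.

```python
from statistics import mean, median

# Hypothetical short keys for a subset of the aggregation methods.
# Each function receives the subject values and the optional Value column.
AGGREGATORS = {
    "sum": lambda values, value: sum(values),
    "average": lambda values, value: mean(values),
    "median": lambda values, value: median(values),
    "min": lambda values, value: min(values),
    "max": lambda values, value: max(values),
    "count": lambda values, value: len(values),
    "count_distinct": lambda values, value: len(set(values)),
    # The Value column supplies the separator; a comma is the fallback here.
    "concat": lambda values, value: (value or ",").join(str(v) for v in values),
}


def aggregate(group_rows, specs):
    """Apply (name, subject, type, value) definitions to one group of rows."""
    result = {}
    for name, subject, agg_type, value in specs:
        # Assumption of this sketch: null subject values are ignored.
        values = [row[subject] for row in group_rows if row[subject] is not None]
        result[name] = AGGREGATORS[agg_type](values, value)
    return result


group = [
    {"item": "a", "amount": 10},
    {"item": "b", "amount": 7},
    {"item": "a", "amount": 4},
]
specs = [
    ("total", "amount", "sum", None),
    ("typical", "amount", "median", None),
    ("items", "item", "concat", "|"),
    ("distinct_items", "item", "count_distinct", None),
]
result = aggregate(group, specs)
```

Each tuple in `specs` corresponds to one row of the Aggregates table: the output field name, the subject field, the method, and the optional value.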

### Metadata injection support

This step supports metadata injection. You can use it with [ETL metadata injection](/pdia-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection.md) to pass metadata to your transformation at runtime.

