# MongoDB Output

The **MongoDB Output** step writes data to a MongoDB collection.

### Step name

**Step name** specifies the unique name of the step on the canvas. You can change it.

### Configure the step (tabs)

#### Configure connection tab

You can configure the connection using either a connection string or individual connection fields.

**Connection string**

Select **Connection String**, then enter a connection string URI. Verify the connection by selecting **Test Connection**.

![MongoDB output - Connection String option](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-6ce98e9eebd473e790619a43e99a7134f5a808f4%2FPDI%20MongoDB%20output%20connection%20tab%20string.png?alt=media)

For connection string formats and options, see the [MongoDB documentation](https://www.mongodb.com/docs/manual/reference/connection-string/).

Common examples:

| Connection type                                     | Connection string format                                                                                                                |
| --------------------------------------------------- | --------------------------------------------------------------------------------------------------------------------------------------- |
| SSL                                                 | `mongodb://<hostname>:<port>/?tls=true`                                                                                                 |
| SSL and LDAP                                        | `mongodb://<username>:<password>@<hostname>:<port>/?tls=true&authSource=$external&authMechanism=PLAIN`                                  |
| LDAP                                                | `mongodb://<username>:<password>@<hostname>:<port>/?authSource=$external&authMechanism=PLAIN`                                           |
| SSL and LDAP cluster servers with `replicaSet`      | `mongodb://<username>:<password>@<hostname>:<port>/?tls=true&authSource=$external&authMechanism=PLAIN&replicaSet=rs0`                   |
| SSL and LDAP with `replicaSet` and `readPreference` | `mongodb://<username>:<password>@<hostname>/?tls=true&authSource=$external&authMechanism=PLAIN&replicaSet=rs0&readPreference=secondary` |
| Kerberos                                            | `mongodb://<service-principal>@<hostname>:<port>/?authSource=$external&authMechanism=GSSAPI`                                            |
| Kerberos and SSL                                    | `mongodb://<service-principal>@<hostname>:<port>/?authSource=$external&authMechanism=GSSAPI&tls=true`                                   |
| Atlas Cloud/SaaS                                    | `mongodb+srv://<username>:<password>@mycluster.qj8y0.mongodb.net/myFirstDatabase?retryWrites=true&w=majority`                           |
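
Credentials that contain reserved characters such as `@`, `:`, or `/` must be percent-encoded before they go into the URI. The following is a minimal Python sketch of assembling a string like the ones in the table; the function name `build_mongodb_uri`, the host `mongo.example.com`, and the credentials are hypothetical and are not part of PDI.

```python
from urllib.parse import quote, urlencode


def build_mongodb_uri(host, port=27017, username=None, password=None, options=None):
    """Assemble a mongodb:// URI (hypothetical helper, not a PDI API)."""
    credentials = ""
    if username is not None:
        # Percent-encode every reserved character in the credentials.
        credentials = quote(username, safe="")
        if password is not None:
            credentials += ":" + quote(password, safe="")
        credentials += "@"
    # Keep "$" literal so values such as $external stay readable.
    query = urlencode(options or {}, safe="$")
    uri = f"mongodb://{credentials}{host}:{port}/"
    if query:
        uri += "?" + query
    return uri


# Hypothetical values for the "SSL and LDAP" row of the table.
uri = build_mongodb_uri(
    "mongo.example.com",
    username="etl@EXAMPLE.COM",
    password="p@ss/word",
    options={"tls": "true", "authSource": "$external", "authMechanism": "PLAIN"},
)
print(uri)
```

Paste the resulting string into **Connection String** and confirm it with **Test Connection**.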

**Configure fields**

If you select **Configure Fields**, specify these values:

![MongoDB output - Configure Fields option](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-03273a88261732b4231e6f52e17f6a133b095e47%2FPDI%20MongoDB%20Output%20configure%20tab%20fields.png?alt=media)

| Field                                  | Description                                                                                                                                                                                                        |
| -------------------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Host name(s) or IP address(es)**     | Host names or IP addresses for MongoDB instances. You can specify ports by using `host:port`. Separate multiple hosts with commas.                                                                                 |
| **Port**                               | Default port (used if ports are not specified with hosts). Default: `27017`.                                                                                                                                       |
| **Enable SSL connection**              | Connects to a MongoDB server configured for SSL.                                                                                                                                                                   |
| **Use all replica set members/mongos** | Uses all replica sets when multiple hosts are specified. If a replica set has more than one host, the Java driver discovers hosts automatically and tries the next replica set if the selected set is unavailable. |
| **Authentication database**            | Database used to authenticate the user.                                                                                                                                                                            |
| **Username**                           | Username required to access the database. For Kerberos, enter the Kerberos principal.                                                                                                                              |
| **Password**                           | Password for the username (not required for Kerberos).                                                                                                                                                             |
| **Authenticate mechanism**             | Authentication method. Values include `SCRAM-SHA-1`, `MONGODB-CR`, and `PLAIN`.                                                                                                                                    |
| **Authenticate using Kerberos**        | Enables Kerberos authentication.                                                                                                                                                                                   |
| **Connection timeout**                 | Connection timeout in milliseconds. Leave blank for no timeout.                                                                                                                                                    |
| **Socket timeout**                     | Write operation timeout in milliseconds. Leave blank for no timeout.                                                                                                                                               |
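
The interaction between **Host name(s) or IP address(es)** and **Port** can be pictured with a short Python sketch. It only illustrates the rule described in the table (a port given as `host:port` wins, otherwise the default port applies); the function name `resolve_hosts` and the host names are hypothetical, and IPv6 literals are not handled.

```python
def resolve_hosts(hosts_field, default_port=27017):
    """Split a comma-separated host list into (host, port) pairs.

    Illustrative only: this mirrors the documented rule, not PDI's code.
    """
    resolved = []
    for entry in hosts_field.split(","):
        entry = entry.strip()
        if not entry:
            continue  # ignore empty items such as a trailing comma
        host, _, port = entry.partition(":")
        # Fall back to the default port when the entry has no explicit port.
        resolved.append((host, int(port) if port else default_port))
    return resolved


print(resolve_hosts("mongo1.example.com:27018, mongo2.example.com"))
```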

#### Output options tab

Use the **Output options** tab to control how PDI writes to MongoDB.

If the specified collection does not exist, PDI creates it before inserting documents.

![MongoDB Output - Output options tab](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-4c89c2ef5c48b95a45d69a363678a289f1522cba%2FPDI_TransStep_MongoDB_Output_Output_Options_Tab.png?alt=media)

| Option                                        | Description                                                                                                                                                                                                                                                                                                                                                                                                |
| --------------------------------------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Database**                                  | Target database. Select **Get DBs** to retrieve database names.                                                                                                                                                                                                                                                                                                                                            |
| **Collection**                                | Target collection. Select **Get Collections** to retrieve collection names. If the collection does not exist, PDI creates it.                                                                                                                                                                                                                                                                              |
| **Batch insert size**                         | Batch size for bulk insert operations. Default: `100` rows.                                                                                                                                                                                                                                                                                                                                                |
| **Truncate collection**                       | Deletes existing data in the target collection before inserting new data.                                                                                                                                                                                                                                                                                                                                  |
| **Update**                                    | Sets the update write method. **Upsert** and **Modifier update** require an update field configuration (see **Mongo document fields**).                                                                                                                                                                                                                                                                    |
| **Upsert**                                    | Changes the write method from insert to upsert. A matched document is replaced based on incoming fields defined in **Mongo document fields**. If no match exists, PDI inserts a new document.                                                                                                                                                                                                              |
| **Multi-update**                              | Updates all matching documents for each update or upsert operation.                                                                                                                                                                                                                                                                                                                                        |
| **Modifier update**                           | Enables modifier operators (for example, `$set`) to update individual fields within matching documents. With **Modifier update** and **Upsert** selected, only the first matching document is updated. Selecting **Modifier update**, **Upsert**, and **Multi-update** applies updates to all matching documents (instead of just the first). |
| **Number of retries for write operations**    | Number of retry attempts for write operations.                                                                                                                                                                                                                                                                                                                                                             |
| **Delay, in seconds, between retry attempts** | Seconds to wait between retries.                                                                                                                                                                                                                                                                                                                                                                           |
| **Write concern (w option)**                  | Minimum number of servers that must confirm the write. Values include: `-1` (disable acknowledgement), `0` (disable basic acknowledgement but return socket/network errors), `1` (acknowledge on primary), and values greater than `1` (wait for the specified number of members, including the primary). Select **Get custom write concerns** to retrieve custom write concerns stored in the repository. |
| **w Timeout**                                 | Time (ms) to wait for write acknowledgement. Leave blank for no timeout.                                                                                                                                                                                                                                                                                                                                   |
| **Journaled writes**                          | Waits until `mongod` acknowledges the write operation and commits data to the journal.                                                                                                                                                                                                                                                                                                                     |
| **Read preference**                           | Node selection preference: `Primary`, `Primary preferred`, `Secondary`, `Secondary preferred`, or `Nearest`. Default is `Primary`. This option is available when **Modifier update** is selected.                                                                                                                                                                                                          |
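
The difference between the write methods is easier to see on a small in-memory model. The sketch below is a simplified Python simulation under stated assumptions, not PDI or MongoDB driver code: `write_row` is a hypothetical helper, documents are plain dictionaries in a list, every modifier is treated like `$set`, and an update without **Upsert** is assumed to write nothing when no document matches.

```python
def write_row(collection, row, match_keys, upsert=False, modifier=False, multi=False):
    """Apply one incoming row to a list of dict documents (simulation only)."""
    query = {key: row[key] for key in match_keys}
    hits = [i for i, doc in enumerate(collection)
            if all(doc.get(k) == v for k, v in query.items())]
    if not hits:
        if upsert:
            collection.append(dict(row))  # no match: upsert inserts a new document
        return
    if modifier:
        # $set-like behavior: change the listed fields, keep everything else.
        for i in (hits if multi else hits[:1]):
            collection[i].update(row)
    else:
        collection[hits[0]] = dict(row)  # replace the first match as a whole


people = [
    {"last": "Jones", "first": "Bob", "age": 34},
    {"last": "Jones", "first": "Ann", "age": 29},
]

# Modifier update only: the first matching document changes.
write_row(people, {"last": "Jones", "age": 40}, ["last"], modifier=True)
print([p["age"] for p in people])

# Modifier update plus Multi-update: every matching document changes.
write_row(people, {"last": "Jones", "age": 41}, ["last"], modifier=True, multi=True)
print([p["age"] for p in people])

# Upsert with no match: a new document is inserted.
write_row(people, {"last": "Puppet", "first": "Noddy", "age": 5}, ["last"], upsert=True)
print(len(people))
```

In this model a plain upsert replaces the whole matched document, so fields that are absent from the incoming row are lost, whereas a modifier update leaves them in place.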

#### Mongo document fields tab

Use the **Mongo document fields** tab to define how incoming fields are written to a MongoDB document.

![MongoDB Output - Mongo document fields tab](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-e1e8a8cb686fff3b6ba08f20d4d5b2f0ee12cde0%2FPDI_TransStep_MongoDB_Output_Mongo_Document_Fields_Tab.png?alt=media)

The **Modifier policy** column controls when a modifier operation affects a field. This is useful when data for a single MongoDB document is split across multiple incoming rows or when you cannot apply different modifier operations to the same field at the same time.

| Column                         | Description                                                                                                                                                                                                            |
| ------------------------------ | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Name**                       | Incoming field name.                                                                                                                                                                                                   |
| **Mongo document path**        | Hierarchical path (dot notation) to the field in the MongoDB document.                                                                                                                                                 |
| **Use field name**             | Whether to use the incoming field name as the final entry in the path. Values: `Y` (use incoming field name) or `N` (do not). When `Y`, the step appends the field name to the path with an implied `.` separator.    |
| **NULL values**                | Whether to insert null values. Values: **Insert NULL** or **Ignore**.                                                                                                                                                  |
| **JSON**                       | Indicates the incoming value is a JSON document.                                                                                                                                                                       |
| **Match field for update**     | When performing an upsert, PDI matches documents using fields marked `Y` in this column. The first matching document is updated/replaced (depending on configuration). If no match exists, PDI inserts a new document. |
| **Modifier operation**         | In-place modifications for existing fields. Supported modifiers include **N/A**, **$set**, **$inc**, **$push**, and **$** (positional operator for arrays).                                                            |
| **Modifier policy**            | When a modifier applies: **Insert\&Update** (default), **Insert** (only on insert), or **Update** (only on update).                                                                                                    |
| **Get fields**                 | Populates the **Name** column from incoming fields.                                                                                                                                                                    |
| **Preview document structure** | Shows the JSON structure that will be written to MongoDB.                                                                                                                                                              |

<details>

<summary>Example: document structure and field definitions</summary>

Input data:

```
first,last,address,age
Bob,Jones,"13 Bob Street",34
Fred,Flintstone,"10 Rock Street",50
Zaphod,Beeblebrox,"Beetlejuice 1",356
Noddy,Puppet,"Noddy Land",5
```

Document field definitions:

| Name    | Mongo document path | Use field name | JSON | Match field for update | Modifier operation | Modifier policy |
| ------- | ------------------- | -------------- | ---- | ---------------------- | ------------------ | --------------- |
| first   | top1                | Y              | N    | N                      | N/A                | Insert\&Update  |
| last    | array\[0]           | Y              | N    | N                      | N/A                | Insert\&Update  |
| address | array\[0]           | Y              | N    | N                      | N/A                | Insert\&Update  |
| age     | array\[0]           | Y              | N    | N                      | N/A                | Insert\&Update  |

Resulting structure:

```javascript
{
  "top1" : {
    "first" : "<string val>"
  },
  "array" : [ { "last" : "<string val>", "address" : "<string val>", "age" : "<integer val>" } ]
}
```

</details>
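
To make the path rules concrete, here is a small Python sketch that builds a nested document from a **Mongo document path** and the **Use field name** flag. It is an illustration under assumptions, not the step's implementation: `parse_path`, `put`, and `build_document` are hypothetical names, and only dot notation and `[n]` array indexes are handled.

```python
import re

# A path token is either a key name or an array index such as [0].
TOKEN = re.compile(r"([^.\[\]]+)|\[(\d+)\]")


def parse_path(path):
    """Turn 'array[0].last' into ['array', 0, 'last']."""
    return [key if key else int(index) for key, index in TOKEN.findall(path)]


def put(root, steps, value):
    """Store value at the given path, creating dicts and lists on the way."""
    node = root
    for step, nxt in zip(steps, steps[1:] + [None]):
        last = nxt is None
        child = value if last else ([] if isinstance(nxt, int) else {})
        if isinstance(step, int):
            while len(node) <= step:
                node.append(None)  # pad the array up to the requested index
            if last or node[step] is None:
                node[step] = child
        elif last or step not in node:
            node[step] = child
        node = node[step]


def build_document(row, definitions):
    """definitions: (name, mongo_document_path, use_field_name) tuples."""
    doc = {}
    for name, path, use_field_name in definitions:
        # With "Use field name" set, the field name becomes the last path entry.
        full_path = f"{path}.{name}" if use_field_name else path
        put(doc, parse_path(full_path), row[name])
    return doc


row = {"first": "Bob", "last": "Jones", "address": "13 Bob Street"}
definitions = [
    ("first", "top1", True),
    ("last", "array[0]", True),
    ("address", "array[0]", True),
]
print(build_document(row, definitions))
```

Fields that share the path `array[0]` land in the same array element, which is why `last` and `address` appear side by side in one embedded document.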

#### Create/drop indexes tab

Use the **Create/drop indexes** tab to create or drop MongoDB indexes on one or more fields.

Unless unique indexes are used, MongoDB allows duplicate records to be inserted. Indexing is performed after all rows are processed by the step.

![MongoDB Output - Create/drop indexes tab](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-250815400218c9dc43c3c0ca3a0d108468c8de4f%2FPDI_TransStep_MongoDB_Output_Create-Drop_Indexes_Tab.png?alt=media)

| Field            | Description                                                                                                                                                                                     |
| ---------------- | ----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| **Index fields** | Single-field or compound index. For compound indexes, use a comma-separated list of paths. Use dot notation for nested fields. You can specify direction: `1` (ascending) or `-1` (descending). |
| **Index op**     | Whether to create or drop an index.                                                                                                                                                             |
| **Unique**       | Creates a unique index, which rejects documents that duplicate an existing value for the indexed fields.                                                                                        |
| **Sparse**       | Index only documents that include the indexed field.                                                                                                                                            |
| **Show indexes** | Displays existing indexes.                                                                                                                                                                      |

<details>

<summary>Example: compound index</summary>

The following example shows a compound index for the `first` and `age` fields in ascending order:

![MongoDB Output - Create/drop indexes example](https://773338310-files.gitbook.io/~/files/v0/b/gitbook-x-prod.appspot.com/o/spaces%2FYwnJ6Fexn4LZwKRHghPK%2Fuploads%2Fgit-blob-abba8c492fc645671a288d003cdd0cbae7246777%2FPDI_TransStep_MongoDB_Output_Index_example.png?alt=media)

</details>
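
An **Index fields** entry corresponds to an ordered list of (path, direction) pairs, which is the shape MongoDB uses for index keys. The Python sketch below shows that mapping. Note the assumptions: this page does not spell out how a direction is attached to a path, so the `path:direction` syntax and the ascending default used here are assumptions, and `parse_index_fields` is a hypothetical helper rather than PDI code.

```python
def parse_index_fields(spec):
    """Parse 'first,age' or 'top1.first:1,age:-1' into (path, direction) pairs.

    Assumption: direction is written as 'path:1' or 'path:-1', and a path
    without a direction is treated as ascending.
    """
    keys = []
    for part in spec.split(","):
        part = part.strip()
        if not part:
            continue
        path, sep, direction = part.partition(":")
        value = int(direction) if sep else 1
        if value not in (1, -1):
            raise ValueError(f"direction must be 1 or -1, got {value}")
        keys.append((path.strip(), value))
    return keys


# Compound index on first and age, both ascending.
print(parse_index_fields("first,age"))

# Nested field ascending, age descending.
print(parse_index_fields("top1.first:1, age:-1"))
```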

### Metadata injection support

All fields of this step support metadata injection. You can use it with [ETL metadata injection](https://docs.pentaho.com/pdia-data-integration/pdi-transformation-steps-reference-overview/etl-metadata-injection) to pass metadata to your transformation at runtime.
