# JDBC tuning options

Spark is a massively parallel computation system that can run on many nodes and process hundreds of partitions at a time. When working with SQL databases, however, you may want to customize processing to reduce the risk of failure. You can specify the number of concurrent JDBC connections, the name of the numeric column to partition by, the minimum value to read, and the maximum value to read. Spark then partitions the data by the specified numeric column and, when the options are applied correctly, reads those partitions from the JDBC source with parallel queries. If you have a cluster installed with Hive, the JDBC tuning options can improve transformation performance.

The **read-jdbc** parameter constructs a `DataFrame` representing the database table, identified by its table name, that is accessible through a JDBC URL. Partitions of the table are retrieved in parallel based on the option values you set. See the [Spark API documentation](https://spark.apache.org/docs/2.3.0/api/java/org/apache/spark/sql/DataFrameReader.html#jdbc-java.lang.String-java.lang.String-java.lang.String-long-long-int-java.util.Properties-) for more information.

| Option                      | Description                                                                                                                                                                                                                                                         | Value type | Example value |
| --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | ---------- | ------------- |
| **read.jdbc.columnName**    | The name of a column of integral type that will be used for partitioning.                                                                                                                                                                                           | String     | column1       |
| **read.jdbc.lowerBound**    | The minimum value of `columnName` used to decide partition stride. This option works with **read.jdbc.columnName**.                                                                                                                                                 | Any value  |               |
| **read.jdbc.upperBound**    | The maximum value of `columnName` used to decide partition stride. This option works with **read.jdbc.columnName**.                                                                                                                                                 | Any value  |               |
| **read.jdbc.numPartitions** | The number of partitions. This value, together with `lowerBound` (inclusive) and `upperBound` (exclusive), forms the partition strides for the generated `WHERE` clause expressions that split the column `columnName` evenly. When the input is less than 1, the number is set to `1`. | Integer    | 5             |

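To make the stride arithmetic concrete, the following is a minimal Python sketch of how the four options could translate into one `WHERE` predicate per partition. It is a simplified illustration modeled on Spark's documented behavior, not Spark's or Pentaho's actual code. The function name `partition_predicates` and the sample bounds `0` and `100` are hypothetical, and the sketch assumes non-negative integer bounds.

```python
def partition_predicates(column_name, lower_bound, upper_bound, num_partitions):
    """Return one WHERE predicate per partition (None means "no filter").

    Hypothetical helper for illustration only; assumes non-negative
    integer bounds.
    """
    if lower_bound > upper_bound:
        raise ValueError("lower_bound must not exceed upper_bound")

    # Inputs below 1 fall back to a single partition.
    num_partitions = max(1, num_partitions)

    # Never create more partitions than there are integer values in range.
    span = upper_bound - lower_bound
    num_partitions = min(num_partitions, span) if span > 0 else 1
    if num_partitions == 1:
        return [None]

    stride = upper_bound // num_partitions - lower_bound // num_partitions
    predicates = []
    current = lower_bound
    for i in range(num_partitions):
        # The first partition has no lower limit and the last has no upper
        # limit, so rows outside the bounds are still read.
        lower = f"{column_name} >= {current}" if i != 0 else None
        current += stride
        upper = f"{column_name} < {current}" if i != num_partitions - 1 else None

        if upper is None:
            predicates.append(lower)
        elif lower is None:
            # NULL values are collected by the first partition.
            predicates.append(f"{upper} OR {column_name} IS NULL")
        else:
            predicates.append(f"{lower} AND {upper}")
    return predicates


if __name__ == "__main__":
    for predicate in partition_predicates("column1", 0, 100, 5):
        print(predicate)
```

With these sample values the stride is 20, so the five queries cover `column1 < 20` (plus `NULL` values), three middle ranges of width 20, and `column1 >= 80`. Because the first and last ranges are open-ended, the bounds only shape the stride and do not filter rows out of the result. Bounds that fit the actual data range poorly therefore tend to produce unevenly sized partitions rather than missing data.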
