# CQL SELECT query

Cassandra is a sparse, column-oriented database similar to HBase. Rows can contain varying numbers of columns, and those columns might or might not be defined in the metadata for the table (column family). The Cassandra Input step can emit columns that are not defined in the table metadata if they are explicitly named in the `SELECT` clause. Cassandra Input takes its type information from the table metadata, which at a minimum includes a default type (column validator) for the table. If explicit metadata is available for an individual column, it is used for that column's type; otherwise, the default validator is used.

**Important:** Cassandra Input does not support the CQL range notation, for instance `name1..nameN`, for specifying columns in a `SELECT` query.

Enter your CQL `SELECT` statement for querying the table in the large text box at the bottom of the dialog box. The step accepts only a single `SELECT` query. The statement has the following general format:

```sql
SELECT [FIRST N] [REVERSED] <SELECT EXPR> FROM <TABLE> [USING <CONSISTENCY>] [WHERE <CLAUSE>] [LIMIT N];
```

`SELECT` queries may name columns explicitly (in a comma-separated list) or use the `*` wildcard. If you use the `*` wildcard, only the columns defined in the metadata for the table are returned. If you select columns explicitly, you must enclose the name of each column in single quotation marks.
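As a sketch of the two column-selection styles described above, the statements below show one query that names columns explicitly and one that uses the wildcard. The table name `users` and the column names are hypothetical, and the `--` comment lines are annotations for the reader rather than part of the query. Because the step accepts a single query, you would enter only one of these statements.

```sql
-- Explicit column list: each column name is enclosed in single quotation marks.
-- Columns named here can be emitted even if the table metadata does not define them.
SELECT 'first_name', 'last_name', 'email' FROM users;

-- Wildcard: only the columns defined in the table metadata are returned.
SELECT * FROM users;
```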

The following table describes the elements of the CQL `SELECT` statement:

| Element        | Description                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                                  |
| -------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `FIRST N`      | Returns the first `N` column values from each row, where "first" is determined by the column sorting strategy used for the table. If the table is sparse, a different set of `N` (or fewer) column values may appear from one row to the next. Because PDI passes a constant number of fields between steps in a transformation, Cassandra rows that do not contain particular columns are output as rows with null field values for the missing columns. If `FIRST` is omitted from the query, Cassandra defaults to 10,000 columns. If a query is expected to return more than 10,000 columns, you must add an explicit `FIRST` to the query. |
| `REVERSED`     | Reverses the sort order of the columns returned by Cassandra for each row. It may affect which values result from a `FIRST N` option, but does not affect the order of the columns output by Cassandra Input.                                                                                                                                                                                                                                                                                                                                                                                                                                                                                |
| `WHERE` clause | <p>Filters the rows that appear in results. The clause can filter on any of the following factors:</p><ul><li>A key name</li><li>Range of keys</li><li>Column values in the case of indexed columns</li></ul><p>Key filters are specified using the <code>KEY</code> keyword, a relational operator (one of =, >, >=, <, and <=), and a term value.</p>                                                                                                                                                                                                                                                                                                                                      |
| `LIMIT`        | Limits the number of rows returned. If the query is expected to return more than 10,000 rows, an explicit `LIMIT` clause must be added to the query. If omitted, Cassandra assumes a default limit of 10,000 rows to be returned by the query.                                                                                                                                                                                                                                                                                                                                                                                                                                               |

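The elements in the table can be combined as the general format allows. The statements below are illustrative sketches only: the table `users`, its columns, and the key values are hypothetical, joining two key filters with `AND` is an assumption based on standard CQL, and whether a key range returns the rows you expect depends on how the cluster is configured. The `--` comment lines are annotations for the reader. Enter only one statement in the step.

```sql
-- First 5 columns of one row, taken from the end of the column sort order.
SELECT FIRST 5 REVERSED * FROM users WHERE KEY = 'jsmith';

-- Range of keys, with an explicit LIMIT above the 10,000-row default.
SELECT 'first_name', 'email' FROM users WHERE KEY >= 'a' AND KEY <= 'm' LIMIT 20000;
```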
