# Hadoop Copy Files

The **Hadoop Copy Files** job entry copies files in a Hadoop cluster from one location to another.

### General

* **Entry name**: Specify the unique name of the Hadoop Copy Files entry on the canvas. You can customize the name or leave it as the default.

### Options

The Hadoop Copy Files job entry includes two tabs: **Files/Folders** and **Settings**.

#### Files/Folders tab

![Files/Folders tab, Hadoop Copy Files](/files/yIzhLqCOEdlgYiyhLwW2)

| Option                      | Description                                                                                                                                                                                                                                                                                      |
| --------------------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------ |
| **Source Environment**      | Specify the type of file system containing the files you want to copy.                                                                                                                                                                                                                           |
| **Source File/Folder**      | Specify the file or directory you want to copy. Click **Browse** to navigate to the source file or folder through the VFS browser. See [VFS browser](/pdia-data-integration/archived-merged-pages/connecting-to-virtual-file-systems-archive/vfs-browser-connecting-to-virtual-file-systems.md). |
| **Wildcard (RegExp)**       | Specify the files to copy by using a regular expression instead of static file names. For example, `.*\.txt` selects all files with a `.txt` extension.                                                                                                                                          |
| **Destination Environment** | Specify the file system where you want to put your copied files.                                                                                                                                                                                                                                 |
| **Destination File/Folder** | Specify the file or directory where you want to place your copied file. Click **Browse** and select **Hadoop** to enter your Hadoop cluster connection details.                                                                                                                                  |

{% hint style="info" %}
The source environment and destination environment must be the same.
{% endhint %}
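
As a rough illustration of the **Wildcard (RegExp)** option, the following Python sketch applies the `.*\.txt` pattern from the table to a list of file names. It assumes the pattern must match the whole file name and is case-sensitive. The `select_files` helper and the file names are hypothetical, and PDI evaluates the expression with Java's regular expression engine rather than Python's, so treat this only as a way to reason about which names a pattern selects.

```python
import re

# Pattern taken from the Wildcard (RegExp) example in the table above.
WILDCARD = r".*\.txt"


def select_files(names, wildcard=WILDCARD):
    """Return the names that the wildcard matches in full (hypothetical helper)."""
    pattern = re.compile(wildcard)
    return [name for name in names if pattern.fullmatch(name)]


# Hypothetical file names, for illustration only.
candidates = ["sales.txt", "sales.txt.bak", "notes.TXT", "readme.md", ".txt"]
selected = select_files(candidates)
print(selected)
```

Under these assumptions, `sales.txt.bak` is skipped because the name does not end in `.txt`, and `notes.TXT` is skipped because the match is case-sensitive.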

#### Settings tab

![Settings tab, Hadoop Copy Files](/files/KDNsw23fVeU9zFxnVJmM)

| Option                                 | Description                                                                                            |
| -------------------------------------- | ------------------------------------------------------------------------------------------------------ |
| **Include subfolders**                 | Select to copy all subdirectories in the chosen directory.                                             |
| **Destination is a file**              | Select if the destination is a single file rather than a folder.                                       |
| **Copy empty folders**                 | Select to copy empty directories. **Include subfolders** must be selected for this option to be valid. |
| **Create destination folder**          | Select to create the destination directory if it does not exist.                                       |
| **Replace existing files**             | Select to overwrite files that already exist in the destination directory.                             |
| **Remove source files**                | Select to remove the source files after copying them. This is equivalent to a move operation.          |
| **Copy previous results to arguments** | Select to use previous step results as your sources and destinations.                                  |
| **Add files to result files name**     | Select to create a list of the files copied by this entry.                                             |

If you are not using Kerberos security, this job entry sends the username of the signed-in user when copying files, regardless of the username entered in the connection field.

To use a different username, set the `HADOOP_USER_NAME` environment variable to the username you want.

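Example, passing the value as a `-D` Java system property appended to the `OPT` variable of the PDI client startup script (for example, `spoon.sh`):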

```
OPT="$OPT .... -DHADOOP_USER_NAME=HadoopNameToSpoof"
```
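
The behavior described above can be modeled with a short Python sketch. It is a simplified model, not Hadoop or PDI code: the `resolve_hadoop_user` function, its parameters, and the user names are hypothetical. The lookup order (environment variable first, then the `-D` system property, then the signed-in user) is an assumption about how Hadoop's non-Kerberos authentication resolves the user.

```python
def resolve_hadoop_user(env, system_properties, os_user, kerberos_user=None):
    """Hypothetical model of user resolution; not Hadoop or PDI code."""
    # Assumption: with Kerberos, the authenticated principal is used
    # and any HADOOP_USER_NAME override is ignored.
    if kerberos_user is not None:
        return kerberos_user
    # Assumed order: environment variable first, then the -D system property.
    override = env.get("HADOOP_USER_NAME")
    if override is None:
        override = system_properties.get("HADOOP_USER_NAME")
    # No override: fall back to the signed-in operating system user.
    return override if override is not None else os_user


# Hypothetical values, for illustration only.
print(resolve_hadoop_user({}, {}, "alice"))
print(resolve_hadoop_user({}, {"HADOOP_USER_NAME": "etl_user"}, "alice"))
```

In this model, the username entered in the connection field never takes part in the lookup, which matches the behavior the page describes for clusters without Kerberos.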

