Connecting to Virtual File Systems

You can connect to most Virtual File Systems (VFS) through VFS connections in PDI. A VFS connection is a stored set of VFS properties that you can use to connect to a specific file system. In PDI, you can add a VFS connection and then reference that connection whenever you want to access files or folders on your Virtual File System. For example, you can use the VFS connection for Hitachi Content Platform (HCP) in any of the HCP transformation steps without the need to repeatedly enter your credentials for data access.

A VFS connection lets you define your VFS properties once and reuse them wherever they are needed. VFS connections support the following file systems:

  • Amazon S3/MinIO/HCP

  • Azure Data Lake Gen 1

    Accesses data objects on Microsoft Azure Gen 1 storage services. You must create an Azure account and configure Azure Data Lake Storage Gen 1. See Access to Microsoft Azure for more information.

    Note: Support for Azure Data Lake Gen 1 is discontinued and limited to users with existing Gen 1 accounts. As a best practice, use Azure Data Lake Storage Gen 2. See Azure for details.

  • Azure Data Lake Gen 2/Blob

    Accesses data objects on Microsoft Azure Gen 2 or Blob storage services. You must create an Azure account and configure Azure Data Lake Storage Gen 2 and Blob Storage. See Access to Microsoft Azure for more information.

  • Google Cloud Storage

    Accesses data in the Google Cloud Storage file system. See Google Cloud Storage for more information on this protocol.

  • HCP REST

    Accesses data in the Hitachi Content Platform. You must configure HCP and PDI before accessing the platform. See Access to HCP REST for more information.

  • Local

    Accesses data in your local physical file system.

  • SMB/UNC Provider

    Accesses data on a Windows platform that uses the Server Message Block (SMB) protocol, with a Universal Naming Convention (UNC) string specifying the resource location path.

  • Snowflake Staging

    Accesses a staging area used by Snowflake to load files. See Snowflake staging area for more information on this protocol.
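For the SMB/UNC Provider, it can help to see how a UNC string breaks down into the pieces that identify a resource. The following is a minimal sketch in Python, not part of PDI: the `parse_unc` function and the `UncPath` type are hypothetical names introduced here for illustration, and the example path is made up. It assumes the common `\\server\share\path` form of a UNC string.

```python
from typing import NamedTuple


class UncPath(NamedTuple):
    """Hypothetical container for the parts of a UNC string."""
    server: str
    share: str
    path: str


def parse_unc(unc: str) -> UncPath:
    """Split a UNC string of the form \\\\server\\share\\path into its parts."""
    if not unc.startswith("\\\\"):
        raise ValueError("A UNC string must start with two backslashes")

    # Drop the leading "\\" and split on the remaining backslashes.
    parts = unc[2:].split("\\")
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("A UNC string needs both a server and a share name")

    # Everything after the share is the path inside the share.
    relative = "/".join(p for p in parts[2:] if p)
    return UncPath(server=parts[0], share=parts[1], path=relative)


# Illustrative, made-up location
location = parse_unc(r"\\fileserver\reports\2024\sales.csv")
print(location.server, location.share, location.path)
```

The server and share names are the values an SMB connection typically needs, while the remaining path identifies the file or folder within the share.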

After you create a VFS connection, you can use it with PDI steps and entries that support the use of VFS connections. If you are connected to a repository, the VFS connection is saved in the repository. If you are not connected to a repository, the connection is saved locally on the machine where it was created.
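The idea of defining a connection once and referencing it by name can be sketched as follows. This is an illustrative Python model only, not PDI's API: `VfsConnection`, `ConnectionStore`, and `connection_path` are hypothetical names, the property values are placeholders, and the `pvfs://` prefix is an assumption about how a named connection is addressed. Check the path format that your PDI version displays.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class VfsConnection:
    """Hypothetical model of a stored set of VFS properties."""
    name: str
    kind: str
    properties: dict = field(default_factory=dict)


class ConnectionStore:
    """Hypothetical stand-in for wherever connections are saved
    (a repository, or the local machine)."""

    def __init__(self) -> None:
        self._connections = {}

    def add(self, connection: VfsConnection) -> None:
        self._connections[connection.name] = connection

    def get(self, name: str) -> VfsConnection:
        if name not in self._connections:
            raise KeyError(f"No VFS connection named {name!r}")
        return self._connections[name]


def connection_path(store: ConnectionStore, name: str, relative_path: str) -> str:
    """Build a path that refers to a file through a named connection.

    The pvfs:// scheme is assumed here for illustration.
    """
    connection = store.get(name)
    return f"pvfs://{connection.name}/{relative_path.lstrip('/')}"


# Define the connection once (placeholder values) ...
store = ConnectionStore()
store.add(VfsConnection(name="my-hcp", kind="hcp", properties={"tenant": "example"}))

# ... then reference it by name wherever a file path is needed.
print(connection_path(store, "my-hcp", "/reports/sales.csv"))
print(connection_path(store, "my-hcp", "archive/2024.csv"))
```

Credentials and other properties live only in the stored connection, so the paths used by individual steps carry the connection name rather than the credentials themselves.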

If a VFS connection in PDI is not available for your Virtual File System, you may be able to access it with the VFS browser. See VFS browser for further details.
