FileConnectors
File connectors provide the FileManager implementations used by the FileManagerSteps object. Use a file connector when a pipeline or step needs to work on a single file without a DataFrame. The most common operations are expected to be the Copy step and the Create FileManager step.
Parameters
The following parameters are available to all file connectors:
- name - The name of the connector
- credentialName - The optional credential name to use to authenticate
- credential - The optional credential to use to authenticate
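The shape of these shared parameters can be pictured with a small, self-contained sketch. Everything in it is a hypothetical stand-in rather than the library's own definition: `Credential`, `ConnectorParams`, and `ExampleConnector` are names invented here, and the types are assumptions. It only illustrates that `name` is required while both credential fields are optional.

```scala
// Hypothetical stand-ins for illustration only; these are NOT the library's types.
trait Credential { def name: String }

trait ConnectorParams {
  def name: String                    // required connector name
  def credentialName: Option[String]  // optional: name of a credential to look up
  def credential: Option[Credential]  // optional: credential supplied directly
}

final case class ExampleConnector(
    name: String,
    credentialName: Option[String] = None,
    credential: Option[Credential] = None
) extends ConnectorParams

object ConnectorParamsSketch {
  // No credentials at all, as with a connector that relies on cluster permissions.
  val anonymous: ExampleConnector = ExampleConnector("my-connector")
  // Credential referenced by name only.
  val byName: ExampleConnector =
    ExampleConnector("my-connector", Some("my-credential-name"))
}
```

In this sketch, leaving both credential fields as `None` mirrors the `None, None` arguments used in the Scala examples on this page.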
HDFSFileConnector
This connector provides access to the HDFS file system. The credentialName and credential parameters are not used in this implementation, instead relying on the permissions of the cluster. Below is an example setup:
Scala
val connector = HDFSFileConnector("my-hdfs-connector", None, None)
Globals JSON
{
  "myHdfsConnector": {
    "className": "com.acxiom.pipeline.connectors.HDFSFileConnector",
    "object": {
      "name": "my-connector"
    }
  }
}
SFTPFileConnector
This connector provides access to an SFTP server. In addition to the standard parameters, the following parameters are available:
- hostName - The host name of the SFTP resource
- port - The optional SFTP port
- knownHosts - The optional file path to the known_hosts file
- bulkRequests - The optional number of requests that may be sent at one time
- config - Optional configuration options
- timeout - Optional connection timeout
Below is an example setup:
Scala
val connector = SFTPFileConnector("sftp.myhost.com", "my-sftp-connector", None, None)
Globals JSON
{
  "sftpConnector": {
    "className": "com.acxiom.pipeline.connectors.SFTPFileConnector",
    "object": {
      "name": "my-connector",
      "hostName": "sftp.myhost.com"
    }
  }
}
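The examples above set only the required values. As a rough illustration of how the optional SFTP parameters fit together, the sketch below models them with a hypothetical `SftpSettings` case class. It is not the connector's real signature: the field types, the `effectivePort` helper, the fallback to port 22 (the standard SSH/SFTP port), and the sample `config` entry are all assumptions made for this example.

```scala
// Hypothetical model of the SFTP parameters listed above.
// Field types and defaults are assumptions, not the connector's actual signature.
final case class SftpSettings(
    hostName: String,
    port: Option[Int] = None,                   // optional SFTP port
    knownHosts: Option[String] = None,          // optional path to a known_hosts file
    bulkRequests: Option[Int] = None,           // optional number of requests sent at one time
    config: Option[Map[String, String]] = None, // optional extra configuration options
    timeout: Option[Int] = None                 // optional connection timeout (unit not stated in the source)
) {
  // Assumed behavior: fall back to 22, the standard SSH/SFTP port, when none is given.
  def effectivePort: Int = port.getOrElse(22)
}

object SftpSettingsSketch {
  // Only the required host name is supplied.
  val defaults: SftpSettings = SftpSettings("sftp.myhost.com")

  // A few optional values supplied by name; the rest stay unset.
  val custom: SftpSettings = SftpSettings(
    hostName = "sftp.myhost.com",
    port = Some(2222),
    knownHosts = Some("/home/user/.ssh/known_hosts"),
    config = Some(Map("StrictHostKeyChecking" -> "yes")) // illustrative entry only
  )
}
```

Modeling each optional value as an `Option` keeps "not provided" distinct from any particular default, which matches how the list above marks these parameters as optional.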
S3FileConnector
This connector provides access to the S3 file system. In addition to the standard parameters, the following parameters are available:
- region - The AWS region
- bucket - The S3 bucket
Below is an example setup:
Scala
val connector = S3FileConnector("us-east-1", "my-bucket", "my-connector", Some("my-credential-name-for-secrets-manager"), None)
Globals JSON
{
  "connector": {
    "className": "com.acxiom.aws.pipeline.connectors.S3FileConnector",
    "object": {
      "name": "my-connector",
      "region": "us-east-1",
      "bucket": "my-bucket",
      "credentialName": "my-credential-name-for-secrets-manager"
    }
  }
}
GCSFileConnector
This connector provides access to the GCS (Google Cloud Storage) file system. In addition to the standard parameters, the following parameters are available:
- projectId - The project id of the GCS project
- bucket - The name of the GCS bucket
Below is an example setup:
Scala
val connector = GCSFileConnector("my-dev-project", "my-bucket", "my-connector", Some("my-credential-name-for-secrets-manager"), None)
Globals JSON
{
  "connector": {
    "className": "com.acxiom.gcp.pipeline.connectors.GCSFileConnector",
    "object": {
      "name": "my-connector",
      "projectId": "my-dev-project",
      "bucket": "my-bucket",
      "credentialName": "my-credential-name-for-secrets-manager"
    }
  }
}