metalus

This project aims to make writing Spark applications easier by abstracting the effort to assemble the driver into reusable steps and pipelines.


Write DataFrame to HDFS Parquet

This step group pipeline will write a DataFrame to an HDFS location. This step group is designed to work with the LoadToParquet pipeline.

General Information

Id: 189328f0-c2c7-11eb-928b-3dca5c59af1b

Name: WriteDataFrameToHDFS

Required Parameters

Required parameters are indicated with a *:

  • bronzeZonePath * - The HDFS path for the root bronze zone folder.
  • fileId * - The unique id for the file that is processed.
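As a rough sketch of what this step group does with its two required parameters, the snippet below derives a destination path under the bronze zone root from the file id and shows where the Parquet write would happen. The path layout (`bronzeZonePath/fileId`) and the helper name are illustrative assumptions, not the step group's actual internals.

```python
def build_destination_path(bronze_zone_path: str, file_id: str) -> str:
    """Join the bronze zone root and the file id into a single HDFS path.

    Assumed layout: <bronzeZonePath>/<fileId>; the real step group may
    arrange the output differently.
    """
    return f"{bronze_zone_path.rstrip('/')}/{file_id}"


# With a live SparkSession, the write itself would look something like:
#   df.write.mode("overwrite").parquet(
#       build_destination_path(bronze_zone_path, file_id))

print(build_destination_path("hdfs://namenode:8020/bronze/", "file-123"))
# → hdfs://namenode:8020/bronze/file-123
```

In a real run, `bronzeZonePath` and `fileId` would be supplied as pipeline parameters when the step group is invoked.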