metalus

This project aims to make writing Spark applications easier by abstracting the effort to assemble the driver into reusable steps and pipelines.


Metalus Pipeline Examples

This project contains examples that demonstrate the different features of the framework. Each example includes some form of coding exercise as a way of teaching the basic requirements. However, by using the Application framework and the metalus-common project, it may be possible to write a complete application without building a jar or creating new steps.

  • Application Example - This example is built with the Application framework. It is similar to the Execution Plan Example, but does not require building the DriverSetup.
  • Application Example from Metadata in Jar - This example uses the Application framework to point to a pipeline stored in the metalus-common metadata. It uses a step-group to load data from an SFTP site into an HDFS “bronze” datastore (parquet).
  • Basic ETL Example - This example demonstrates reading data from a file, processing the data, and then writing the processed data back to disk.
  • Execution Plan Example - This example explores the execution plan functionality.
  • Kinesis Streaming Example - This example demonstrates how to process data from Kinesis.