metalus

This project aims to make writing Spark applications easier by abstracting the effort to assemble the driver into reusable steps and pipelines.



Driver Utilities

The driver utilities object provides helper functions that are useful when extending the base Metalus functionality.

Create Spark Conf

This function is used to create a new SparkConf and register classes that need to be serialized.
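A minimal sketch of what such a helper might look like, assuming Kryo serialization is the goal; the settings and the example Record class below are illustrative, not the exact Metalus implementation:

```scala
import org.apache.spark.SparkConf

// Illustrative sketch: build a SparkConf that uses Kryo serialization and
// registers the supplied classes.
def buildSparkConf(classesToRegister: Array[Class[_]]): SparkConf = {
  new SparkConf()
    .set("spark.serializer", "org.apache.spark.serializer.KryoSerializer")
    .registerKryoClasses(classesToRegister)
}

// Example: register a case class that will be serialized between steps.
case class Record(id: String, value: Long)
val conf = buildSparkConf(Array(classOf[Record]))
```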

Extract Parameters

This function takes the arguments array from the entry point of the driver class and creates a map of parameters. It looks for parameter names that start with two dashes (--) and captures the value that follows each one. It also validates any required parameters.
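A minimal sketch of the described parsing, not the actual Metalus code; required-parameter checking is sketched separately under Validate Required Parameters:

```scala
// Any token starting with "--" is treated as a parameter name and paired with
// the token that follows it.
def extractParameters(args: Array[String]): Map[String, Any] = {
  args.sliding(2).collect {
    case Array(name, value) if name.startsWith("--") => name.drop(2) -> (value: Any)
  }.toMap
}

// "--url http://localhost --logLevel INFO" becomes
// Map("url" -> "http://localhost", "logLevel" -> "INFO")
val parameters = extractParameters(Array("--url", "http://localhost", "--logLevel", "INFO"))
```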

Get Http Rest Client

This function takes a URL and a map of parameters (such as the one produced by extract parameters) and initializes an HttpRestClient. It also handles parsing the authorization parameters.
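A sketch of how authorization settings might be pulled from the parameter map when building a client. The SimpleRestClient and BasicAuthorization types and the parameter names below are hypothetical stand-ins, not the Metalus HttpRestClient API:

```scala
// Hypothetical types for illustration only.
case class BasicAuthorization(username: String, password: String)
case class SimpleRestClient(url: String, authorization: Option[BasicAuthorization])

// Build the client, attaching authorization only when both assumed parameters are present.
def buildRestClient(url: String, parameters: Map[String, Any]): SimpleRestClient = {
  val authorization = for {
    username <- parameters.get("authorization.username").map(_.toString)
    password <- parameters.get("authorization.password").map(_.toString)
  } yield BasicAuthorization(username, password)
  SimpleRestClient(url, authorization)
}
```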

Validate Required Parameters

Given a parameter map and a list of required parameters, this function validates that every required parameter exists in the map. If any are missing, a RuntimeException is thrown listing all of the missing parameters.
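A minimal sketch of that behavior, assuming the exception message format shown in the comment:

```scala
// Collect every missing parameter name and report them together in a single
// RuntimeException rather than failing one at a time.
def validateRequiredParameters(parameters: Map[String, Any],
                               requiredParameters: Option[List[String]]): Unit = {
  val missing = requiredParameters.getOrElse(List()).filterNot(parameters.contains)
  if (missing.nonEmpty) {
    throw new RuntimeException(s"Missing required parameters: ${missing.mkString(", ")}")
  }
}
```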

Parse Pipeline JSON

Given a JSON string, this function will convert it to a Pipeline object.
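A sketch of how this kind of conversion can be done with json4s. The SimplePipeline case class is a simplified stand-in for the real Pipeline object, and the handling of a single object versus an array is an assumption:

```scala
import org.json4s._
import org.json4s.native.JsonMethods.parse

// Simplified stand-in for the Pipeline object; the real class has more fields.
case class SimplePipeline(id: Option[String], name: Option[String])

// Parse the JSON and extract pipeline instances, accepting either a single
// object or an array of objects.
def parsePipelineJson(json: String): List[SimplePipeline] = {
  implicit val formats: Formats = DefaultFormats
  val trimmed = json.trim
  if (trimmed.startsWith("[")) {
    parse(trimmed).extract[List[SimplePipeline]]
  } else {
    List(parse(trimmed).extract[SimplePipeline])
  }
}
```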

Parse JSON

This function parses a JSON string into an object instance.
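A sketch of one way to do this with json4s, assuming the target class is supplied by name at runtime; this is an illustration, not the Metalus implementation:

```scala
import org.json4s._
import org.json4s.native.JsonMethods.parse

// Convert a JSON string into an instance of a class chosen at runtime by
// building a Manifest from the class name. Error handling is omitted.
def parseJsonToClass(json: String, className: String): Any = {
  implicit val formats: Formats = DefaultFormats
  val manifest = Manifest.classType[Any](Class.forName(className))
  parse(json).extract[Any](formats, manifest)
}
```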

Load JSON From File

This function loads JSON from a file. It attempts to determine the FileManager implementation to use based on the fileLoaderClassName parameter.
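A sketch of the select-a-loader-by-class-name idea. The JsonLoader trait and LocalJsonLoader class below are hypothetical placeholders; the real FileManager API is not reproduced here:

```scala
import scala.io.Source

// Hypothetical loader abstraction for illustration.
trait JsonLoader { def loadJson(path: String): String }

class LocalJsonLoader extends JsonLoader {
  override def loadJson(path: String): String = {
    val source = Source.fromFile(path)
    try source.mkString finally source.close()
  }
}

// Pick the loader implementation from the fileLoaderClassName parameter when
// present, otherwise fall back to a local file read.
def loadJsonFromFile(path: String, parameters: Map[String, Any]): String = {
  val loader = parameters.get("fileLoaderClassName")
    .map(name => Class.forName(name.toString).getDeclaredConstructor().newInstance().asInstanceOf[JsonLoader])
    .getOrElse(new LocalJsonLoader)
  loader.loadJson(path)
}
```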

Add Initial DataFrame to Execution Plan

This function is used by streaming drivers to inject the DataFrame created from the stream into the execution plan.
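A simplified sketch of the injection step. SimpleContext and SimpleExecution are hypothetical stand-ins for the Metalus execution types, and the "initialDataFrame" globals key is an assumption:

```scala
import org.apache.spark.sql.DataFrame

// Hypothetical, simplified stand-ins for executions and their contexts.
case class SimpleContext(globals: Map[String, Any])
case class SimpleExecution(id: String, context: SimpleContext)

// A streaming driver builds a DataFrame from each micro-batch and places it in
// every execution's globals so the first pipeline step can pick it up.
def addInitialDataFrame(executionPlan: List[SimpleExecution],
                        dataFrame: DataFrame): List[SimpleExecution] = {
  executionPlan.map { execution =>
    execution.copy(context = execution.context.copy(
      globals = execution.context.globals + ("initialDataFrame" -> dataFrame)))
  }
}
```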

Handle Execution Results

This function parses the results and throws any exception present in the error field. A result that is not successful but is paused returns true. Any other non-successful execution results in false being returned, along with the execution and pipeline ids.
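A sketch of that decision logic. The SimpleExecutionResult shape and its field names are assumptions made for illustration, not the actual Metalus result type:

```scala
// Hypothetical result shape used for illustration.
case class SimpleExecutionResult(executionId: String,
                                 pipelineId: String,
                                 success: Boolean,
                                 paused: Boolean,
                                 error: Option[Throwable])

// Rethrow any captured error, treat paused executions as acceptable, and flag
// other failures with their execution and pipeline ids.
def handleExecutionResult(result: SimpleExecutionResult): Boolean = {
  result.error.foreach(e => throw e)
  if (result.success || result.paused) {
    true
  } else {
    println(s"Execution ${result.executionId} failed in pipeline ${result.pipelineId}")
    false
  }
}
```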