Tx Scala UDF

Important: These instructions assume you have access to StreamSets Transformer.

For help installing StreamSets Transformer, see StreamSets Transformer Installation.

Here is a link to a short video on using this pipeline template: Video Link

OVERVIEW

This pipeline demonstrates how to create, register, and use a User-Defined Function in Scala using StreamSets Transformer.

The source data for this pipeline is included in the Dev Raw Data Source origin as an example. Typically, you would replace this origin with one that reads your actual source data (JDBC, files, and so on). This template writes data to a file on the local file system, but you would typically replace this with your actual destination.

Disclaimer: This pipeline is meant to serve as a template for creating, registering, and using a User-Defined Function in Scala.

USING THE TEMPLATE

NOTE: Templates are supported in StreamSets Control Hub. If you do not have Control Hub, you can import the template pipeline directly into Transformer, but you will need to do that each time you want to use the template.

PIPELINE

Pipeline Description with links to documentation

Stage	Description
Dev Raw Data Source	Generates records based on user-supplied data
Create UDFs	Creates a small example function and registers it with SparkSQL as a column function
Use UDF	Leverages created UDF as a SparkSQL Expression Function
Write udf	Writes data to a local file system
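
As a rough illustration of what the Create UDFs and Use UDF stages do, the sketch below defines a small Scala function, registers it with Spark SQL, and calls it from a SQL expression. The function `celsiusToFahrenheit`, the registered name `c_to_f`, and the sample columns are hypothetical and are not taken from the template. The sketch also builds its own local `SparkSession` so it can be tried outside Transformer; inside a Transformer pipeline, the session is expected to be supplied by the stage rather than created by hand.

```scala
import org.apache.spark.sql.SparkSession

object UdfSketch {
  // Hypothetical example function; the UDF shipped in the template may differ.
  val celsiusToFahrenheit: Double => Double = c => c * 9.0 / 5.0 + 32.0

  def main(args: Array[String]): Unit = {
    // Local session for experimentation only (assumption: Spark is on the classpath).
    val spark = SparkSession.builder()
      .appName("udf-sketch")
      .master("local[*]")
      .getOrCreate()
    import spark.implicits._

    // Register the function under a name that Spark SQL expressions can call.
    spark.udf.register("c_to_f", celsiusToFahrenheit)

    // Hypothetical rows standing in for the Dev Raw Data Source records.
    val readings = Seq(("sensor-1", 0.0), ("sensor-2", 100.0)).toDF("id", "temp_c")

    // Use the registered UDF inside a Spark SQL expression.
    readings.selectExpr("id", "temp_c", "c_to_f(temp_c) AS temp_f").show()

    spark.stop()
  }
}
```

Registering the function by name is what makes it callable from SQL expression strings such as `c_to_f(temp_c)`, which is the style of use the Use UDF stage description refers to.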

STEP-BY-STEP

Step 1: Download the pipeline

Click Here to download the pipeline and save it to your drive.

Step 2: Import the pipeline

Click the down arrow next to "Create New Pipeline" and select "Import Pipeline From Archive".

Click "Browse", locate the pipeline file you just downloaded, and click "OK". Then click "Import".

Step 3: Configure the parameters

Click the pipeline you just imported to open it. Then click the "Parameters" tab and fill in the appropriate information for your environment.

Important: For this pipeline, you only need to specify the output directory for the file. This directory is on the local file system of the machine where Transformer is installed. Make sure the directory exists and that its permissions allow the Transformer user to create files in it. By default, the directory /data/udf is used. You can change it to any directory you prefer.

The following parameters are set up for this pipeline:

destination_directory

Path to the directory for the output files.

Use the following format:

/<directory>
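
Before running the pipeline, it can help to confirm that the output directory exists and is writable. The following is a minimal sketch using only the Java standard library from Scala; the object and method names are hypothetical, and the check is only meaningful when run as the same operating-system user that runs Transformer.

```scala
import java.nio.file.{Files, Path, Paths}

object OutputDirCheck {
  // Returns a description of the first problem found, or None if the directory looks usable.
  def problemWith(dir: Path): Option[String] =
    if (!Files.exists(dir)) Some(s"$dir does not exist")
    else if (!Files.isDirectory(dir)) Some(s"$dir is not a directory")
    else if (!Files.isWritable(dir)) Some(s"$dir is not writable by the current user")
    else None

  def main(args: Array[String]): Unit = {
    // /data/udf is the template's default; pass another path as the first argument.
    val dir = Paths.get(args.headOption.getOrElse("/data/udf"))
    println(problemWith(dir).getOrElse(s"$dir is ready for output files"))
  }
}
```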

Step 4: Run the pipeline

Click the "START" button to run the pipeline.
