title: Submit Spark jobs: Azure Data Studio
titleSuffix: SQL Server Big Data Clusters
description: Submit Spark jobs on SQL Server big data cluster in Azure Data Studio.
author: jejiang
ms.author: jejiang
ms.reviewer: wiassaf
ms.date: 12/13/2019
ms.service: sql
ms.subservice: big-data-cluster
ms.topic: conceptual

Submit Spark jobs on [!INCLUDEbig-data-clusters-2019] in Azure Data Studio

[!INCLUDESQL Server 2019]

[!INCLUDEbig-data-clusters-banner-retirement]

One of the key scenarios for big data clusters is the ability to submit Spark jobs for SQL Server. The Spark job submission feature lets you submit local JAR or Py files with references to a SQL Server 2019 big data cluster. It also lets you execute JAR or Py files that are already located in the HDFS file system.

Prerequisites

Open Spark job submission dialog

You can open the Spark job submission dialog from the dashboard, from the context menu in Object Explorer, or from the command palette.

  • To open the Spark job submission dialog, click New Spark Job in the dashboard.

    Submit menu by clicking dashboard

  • Or right-click the cluster in Object Explorer and select Submit Spark Job from the context menu.

    Submit menu by right-click cluster

  • To open the Spark job submission dialog with the JAR/Py fields pre-populated, right-click a JAR/Py file in Object Explorer and select Submit Spark Job from the context menu.

    Submit menu by right-click file

  • Open the command palette by pressing Ctrl+Shift+P (Windows) or Cmd+Shift+P (Mac), then run Submit Spark Job.

    Submit menu command palette in Windows

Submit Spark job

The Spark job submission dialog is displayed as follows. Enter the job name, JAR/Py file path, main class, and other fields. The JAR/Py file source can be either Local or HDFS. If the Spark job has reference JARs, Py files, or additional files, click the ADVANCED tab and enter the corresponding file paths. Click Submit to submit the Spark job.

New spark job dialog

Advanced dialog
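The dialog fields resemble the body of an Apache Livy batch request (`POST /batches`), which accepts `name`, `file`, `className`, `args`, `jars`, `pyFiles`, and `files`. The following Python sketch assumes that mapping purely for illustration. It is not taken from Azure Data Studio, and the function name, file paths, and class name are hypothetical.

```python
import json
from typing import List, Optional


def build_batch_request(
    job_name: str,
    main_file: str,
    main_class: Optional[str] = None,
    arguments: Optional[List[str]] = None,
    reference_jars: Optional[List[str]] = None,
    reference_py_files: Optional[List[str]] = None,
    reference_files: Optional[List[str]] = None,
) -> dict:
    """Build a Livy-style batch request body from the dialog fields.

    Assumption: the field-to-key mapping below is illustrative only.
    """
    body = {"name": job_name, "file": main_file}

    # Optional fields are only included when a value was entered.
    optional = {
        "className": main_class,       # main class (JAR jobs)
        "args": arguments,             # command-line arguments
        "jars": reference_jars,        # ADVANCED tab: reference JARs
        "pyFiles": reference_py_files, # ADVANCED tab: reference Py files
        "files": reference_files,      # ADVANCED tab: additional files
    }
    for key, value in optional.items():
        if value:
            body[key] = value
    return body


if __name__ == "__main__":
    # Hypothetical HDFS paths and class name, for illustration only.
    request = build_batch_request(
        job_name="sample-job",
        main_file="/jars/sample-app.jar",
        main_class="com.example.SampleApp",
        reference_jars=["/jars/sample-lib.jar"],
    )
    print(json.dumps(request, indent=2))
```

Fields left empty are omitted from the request body, in the same way that optional fields in the dialog can be left blank.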

Monitor Spark job submission

After the Spark job is submitted, its submission and execution status is displayed in the Task History panel on the left. Progress details and logs are also displayed in the OUTPUT window at the bottom.

  • When the Spark job is in progress, the Task History panel and OUTPUT window refresh with the progress.

    Monitor spark job in progress

  • When the Spark job completes successfully, the Spark UI and YARN UI links appear in the OUTPUT window. Click the links for more information.

    Spark job link in output

Next steps

For more information on SQL Server big data cluster and related scenarios, see [Introducing [!INCLUDEbig-data-clusters-2019]](big-data-cluster-overview.md).