This repository was archived by the owner on Oct 12, 2023. It is now read-only.

Commit 4d3c520

Update AzureDocument.md
1 parent ad5f4ae commit 4d3c520

1 file changed: +16 -16 lines changed

docs/AzureDocument.md

Lines changed: 16 additions & 16 deletions
@@ -1,6 +1,6 @@
1-
# Accelerate real-time big data analytics with Spark connector for Azure SQL Databases and SQL Server
1+
# Accelerate real-time big data analytics with Spark connector for Azure SQL Database and SQL Server
22

3-
The Spark connector for Azure SQL Databases and SQL Server enables SQL databases, including Azure SQL Databases and SQL Server, to act as input data source or output data sink for Spark jobs. It allows you to utilize real time transactional data in big data analytics and persist results for adhoc queries or reporting. Comparing to the built-in JDBC connector, this connector provides the ability to bulk insert data into SQL databases. It can outperform row by row insertion with 10x to 20x faster performance. The Spark connector for Azure SQL Databases and SQL Server also supports AAD authentication. It allows you securely connecting to your Azure SQL databases from Azure Databricks using your AAD account. It provides similar interfaces with the built-in JDBC connector. It is easy to migrate your existing Spark jobs to use this new connector.
3+
The Spark connector for Azure SQL Database and SQL Server enables SQL databases, including Azure SQL Database and SQL Server, to act as input data source or output data sink for Spark jobs. It allows you to utilize real time transactional data in big data analytics and persist results for adhoc queries or reporting. Compared to the built-in JDBC connector, this connector provides the ability to bulk insert data into SQL databases. It can outperform row by row insertion with 10x to 20x faster performance. The Spark connector for Azure SQL Database and SQL Server also supports AAD authentication. It allows you securely connecting to your Azure SQL database from Azure Databricks using your AAD account. It provides similar interfaces with the built-in JDBC connector. It is easy to migrate your existing Spark jobs to use this new connector.
44

55
## Download
66
To get started, download the Spark to SQL DB connector from the [azure-sqldb-spark repository](https://github.com/Azure/azure-sqldb-spark) on GitHub.
@@ -13,14 +13,14 @@ To get started, download the Spark to SQL DB connector from the [azure-sqldb-spa
1313
| Scala |2.10 or later |
1414
| Microsoft JDBC Driver for SQL Server |6.2 or later |
1515
| Microsoft SQL Server |SQL Server 2008 or later |
16-
| Azure SQL Databases |Supported |
16+
| Azure SQL Database |Supported |
1717

18-
The Spark connector for Azure SQL Databases and SQL Server utilizes the Microsoft JDBC Driver for SQL Server to move data between Spark worker nodes and SQL databases:
18+
The Spark connector for Azure SQL Database and SQL Server utilizes the Microsoft JDBC Driver for SQL Server to move data between Spark worker nodes and SQL databases:
1919

2020
The dataflow is as following:
21-
1. The Spark master node connect to SQL Server or Azure SQL Databases and load data from a specific table or using a specific SQL query
21+
1. The Spark master node connect to SQL Server or Azure SQL Database and load data from a specific table or using a specific SQL query
2222
2. Spark master node distribute data to worker nodes for transformation.
23-
3. Worker node connect to SQL Server or Azure SQL Databases and write data to the database. User can choose to use row-by-row insertion or bulk insert.
23+
3. Worker node connect to SQL Server or Azure SQL Database and write data to the database. User can choose to use row-by-row insertion or bulk insert.
2424

2525
### Build the Spark to SQL DB connector
2626
Currently, the connector project uses maven. To build the connector without dependencies, you can run:
@@ -29,9 +29,9 @@ You can also download the latest versions of the JAR from the release folder
2929
Include the SQL DB Spark JAR
3030

3131
## Connect Spark to SQL DB using the connector
32-
You can connect to Azure SQL Databases or SQL Server from Spark jobs, read or write data. You can also run a DML or DDL query in an Azure SQL database or SQL Server database.
32+
You can connect to Azure SQL Database or SQL Server from Spark jobs, read or write data. You can also run a DML or DDL query in an Azure SQL database or SQL Server database.
3333

34-
### Read data from Azure SQL Databases or SQL Server
34+
### Read data from Azure SQL Database or SQL Server
3535

3636
```scala
3737
import com.microsoft.azure.sqldb.spark.config.Config
@@ -50,7 +50,7 @@ val config = Config(Map(
5050
val collection = sqlContext.read.sqlDB(config)
5151
collection.show()
5252
```
53-
### Reading data from Azure SQL Databases or SQL Server with specified SQL query
53+
### Reading data from Azure SQL Database or SQL Server with specified SQL query
5454
```scala
5555
import com.microsoft.azure.sqldb.spark.config.Config
5656
import com.microsoft.azure.sqldb.spark.connect._
@@ -68,7 +68,7 @@ val collection = sqlContext.read.sqlDb(config)
6868
collection.show()
6969
```
7070

71-
### Write data to Azure SQL Databases or SQL Server
71+
### Write data to Azure SQL Database or SQL Server
7272
```scala
7373
import com.microsoft.azure.sqldb.spark.config.Config
7474
import com.microsoft.azure.sqldb.spark.connect._
@@ -87,7 +87,7 @@ import org.apache.spark.sql.SaveMode
8787
collection.write.mode(SaveMode.Append).sqlDB(config)
8888
```
8989

90-
### Run DML or DDL query in Azure SQL Databases or SQL Server
90+
### Run DML or DDL query in Azure SQL Database or SQL Server
9191
```scala
9292
import com.microsoft.azure.sqldb.spark.config.Config
9393
import com.microsoft.azure.sqldb.spark.query._
@@ -108,8 +108,8 @@ val config = Config(Map(
108108
sqlContext.SqlDBQuery(config)
109109
```
110110

111-
## Connect Spark to Azure SQL Databases using AAD authentication
112-
You can connect to Azure SQL Databases using Azure Active Directory (AAD) authentication. Use AAD authentication to centrally manage identities of database users and as an alternative to SQL Server authentication.
111+
## Connect Spark to Azure SQL Database using AAD authentication
112+
You can connect to Azure SQL Database using Azure Active Directory (AAD) authentication. Use AAD authentication to centrally manage identities of database users and as an alternative to SQL Server authentication.
113113
### Connecting using ActiveDirectoryPassword Authentication Mode
114114
#### Setup Requirement
115115
If you are using the ActiveDirectoryPassword authentication mode you will need to download [azure-activedirectory-library-for-java](https://github.com/AzureAD/azure-activedirectory-library-for-java) and its dependencies, and include them in the Java build path.
@@ -153,8 +153,8 @@ val collection = sqlContext.read.SqlDB(config)
153153
collection.show()
154154
```
155155

156-
## Write data to Azure SQL databases or SQL Server using Bulk Insert
157-
The traditional jdbc connector writes data into Azure SQL databases or SQL Server using row-by-row insertion. You can use Spark to SQL DB connector to write data to SQL database using bulk insert. It will significantly improve the write performance when loading large data sets or loading data into tables where column store index is used.
156+
## Write data to Azure SQL database or SQL Server using Bulk Insert
157+
The traditional jdbc connector writes data into Azure SQL database or SQL Server using row-by-row insertion. You can use Spark to SQL DB connector to write data to SQL database using bulk insert. It will significantly improve the write performance when loading large data sets or loading data into tables where column store index is used.
158158

159159
```scala
160160
import com.microsoft.azure.sqldb.spark.bulkcopy.BulkCopyMetadata
@@ -188,7 +188,7 @@ df.bulkCopyToSqlDB(bulkCopyConfig, bulkCopyMetadata)
188188
```
189189

190190
## Next steps
191-
If you haven't already, download the Spark connector for Azure SQL Databases and SQL Server from [azure-sqldb-spark GitHub repository](https://github.com/Azure/azure-sqldb-spark) and explore the additional resources in the repo:
191+
If you haven't already, download the Spark connector for Azure SQL Database and SQL Server from [azure-sqldb-spark GitHub repository](https://github.com/Azure/azure-sqldb-spark) and explore the additional resources in the repo:
192192

193193
- [Sample Azure Databricks notebooks](https://github.com/Azure/azure-sqldb-spark/tree/master/samples/notebooks)
194194
- [Sample scripts (Scala)](https://github.com/Azure/azure-sqldb-spark/tree/master/samples/scripts)
