This repository was archived by the owner on Oct 12, 2023. It is now read-only.

Commit 4e9b601
Author: Zeqi Cui

Updated README.md to most recent version of documentation

1 parent f3a538f

File tree: 1 file changed (+42, −29 lines)

README.md — 42 additions & 29 deletions
@@ -1,11 +1,8 @@
-# Azure SQL Database Spark Connector
+# Spark connector for Azure SQL Databases and SQL Server
 
-The official connector for Spark and Azure SQL Database.
-
-This project provides a client library that allows your Azure SQL Database to be an input source or output sink for Spark jobs.
-
-## Requirements
+The Spark connector for [Azure SQL Database](https://azure.microsoft.com/en-us/services/sql-database/) and [SQL Server](https://www.microsoft.com/en-us/sql-server/default.aspx) enables SQL databases, including Azure SQL Database and SQL Server, to act as an input data source or output data sink for Spark jobs. It lets you use real-time transactional data in big data analytics and persist results for ad hoc queries or reporting.
 
+Compared to the built-in Spark JDBC connector, this connector can bulk insert data into SQL databases, outperforming row-by-row insertion by 10x to 20x. It also supports AAD authentication, so you can connect securely to your Azure SQL databases from Azure Databricks with your AAD account. Its interfaces are similar to the built-in JDBC connector's, so migrating existing Spark jobs to this connector is straightforward.
 
 ## How to connect to Spark using this library
 This connector uses the Microsoft SQL Server JDBC driver to fetch data from/to the Azure SQL Database.
@@ -16,7 +13,8 @@ All connection properties in
 Microsoft JDBC Driver for SQL Server
 </a> are supported in this connector. Add connection properties as fields in the `com.microsoft.azure.sqldb.spark.config.Config` object.
 
-### Reading from Azure SQL Database using Scala
+
+### Reading from Azure SQL Database or SQL Server
 ```scala
 import com.microsoft.azure.sqldb.spark.config.Config
 import com.microsoft.azure.sqldb.spark.connect._
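As context for the `Config` object used throughout these examples: connection properties are plain string key-value pairs. A minimal sketch in plain Scala follows (no Spark or connector dependency; the plain `Map` stands in for `com.microsoft.azure.sqldb.spark.config.Config`, and the JDBC URL format shown is an assumption about what the connector assembles internally, not its documented behavior):

```scala
// Stand-in for the connector's Config: connection properties are
// just string key-value pairs, named as in the README examples.
val props: Map[String, String] = Map(
  "url"          -> "mysqlserver.database.windows.net",
  "databaseName" -> "MyDatabase",
  "user"         -> "username",
  "password"     -> "*********",
  "queryTimeout" -> "5" // seconds
)

// Illustrative only: a SQL Server JDBC URL assembled from those properties.
val jdbcUrl =
  s"jdbc:sqlserver://${props("url")};databaseName=${props("databaseName")}"

println(jdbcUrl)
```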
@@ -31,12 +29,12 @@ val config = Config(Map(
   "queryTimeout" -> "5" //seconds
 ))
 
-val collection = sqlContext.read.azureSQL(config)
+val collection = sqlContext.read.sqlDB(config)
 collection.show()
 
 ```
 
-### Writing to Azure SQL Database using Scala
+### Writing to Azure SQL Database or SQL Server
 ```scala
 import com.microsoft.azure.sqldb.spark.config.Config
 import com.microsoft.azure.sqldb.spark.connect._
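The `sqlDB` method renamed in the hunk above is not defined on Spark's `DataFrameReader` itself; it is grafted on by the implicits imported from `com.microsoft.azure.sqldb.spark.connect._`. A toy sketch of that extension-method pattern in plain Scala (the `Reader` class, the `connect` object, and the return strings here are hypothetical stand-ins, not the connector's real types):

```scala
// Toy stand-in for Spark's DataFrameReader.
class Reader {
  def load(source: String): String = s"loaded from $source"
}

// Importing an object's members brings the implicit class into scope,
// which adds a sqlDB method to every Reader — the same pattern the
// connector uses via `import ...spark.connect._`.
object connect {
  implicit class SqlDBReader(reader: Reader) {
    def sqlDB(config: Map[String, String]): String =
      reader.load(config("databaseName"))
  }
}

import connect._
val collection = new Reader().sqlDB(Map("databaseName" -> "MyDatabase"))
println(collection) // "loaded from MyDatabase"
```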
@@ -52,10 +50,10 @@ val config = Config(Map(
 ))
 
 import org.apache.spark.sql.SaveMode
-collection.write.mode(SaveMode.Append).azureSQL(config)
+collection.write.mode(SaveMode.Append).sqlDB(config)
 
 ```
-### Pushdown query to Azure SQL Database using Scala
+### Pushdown query to Azure SQL Database or SQL Server
 For SELECT queries with expected return results, please use
 [Reading from Azure SQL Database using Scala](#reading-from-azure-sql-database-using-scala)
 ```scala
@@ -77,7 +75,7 @@ val config = Config(Map(
 
 sqlContext.azurePushdownQuery(config)
 ```
-### Bulk Copy to Azure SQL Database / SQL Server using Scala
+### Bulk Copy to Azure SQL Database or SQL Server
 ```scala
 import com.microsoft.azure.sqldb.spark.bulkcopy.BulkCopyMetadata
 import com.microsoft.azure.sqldb.spark.config.Config
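The bulk copy config in the next hunk includes `bulkCopyBatchSize`, which caps how many rows are sent to the server per batch. A plain-Scala sketch of what a batch size of 2500 means for a set of rows (the row data is hypothetical, and the connector's actual batching happens inside the JDBC driver's bulk copy machinery, not in user code):

```scala
val bulkCopyBatchSize = 2500
val rows = (1 to 6000).map(i => s"row-$i") // hypothetical rows to insert

// grouped() splits the rows the way a batch size would:
// full batches of 2500, plus one final partial batch.
val batches = rows.grouped(bulkCopyBatchSize).toList
println(batches.map(_.size)) // List(2500, 2500, 1000)
```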
@@ -98,7 +96,7 @@ val bulkCopyConfig = Config(Map(
   "databaseName" -> "MyDatabase",
   "user" -> "username",
   "password" -> "*********",
-  "databaseName" -> "zeqisql",
+  "databaseName" -> "MyDatabase",
   "dbTable" -> "dbo.Clients",
   "bulkCopyBatchSize" -> "2500",
   "bulkCopyTableLock" -> "true",
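A side note on the hunk above: even after this fix, the example's `Map` lists `databaseName` twice. In a Scala `Map` literal the last occurrence of a key silently wins, which is exactly why a leftover value like `"zeqisql"` would have overridden the intended database name:

```scala
// A repeated key in a Scala Map literal keeps only the last value:
val cfg = Map(
  "databaseName" -> "MyDatabase",
  "user"         -> "username",
  "databaseName" -> "zeqisql" // leftover duplicate, as in the pre-fix README
)

println(cfg("databaseName")) // "zeqisql" -- the later entry overrode the first
println(cfg.size)            // 2 distinct keys remain
```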
@@ -108,26 +106,41 @@ val bulkCopyConfig = Config(Map(
 df.bulkCopyToSqlDB(bulkCopyConfig, bulkCopyMetadata)
 //df.bulkCopyToSqlDB(bulkCopyConfig) if no metadata is specified.
 ```
-## Active Directory / AccessToken authentication
-Simply specify your config authentication to connect using ActiveDirectory.
-If not specified, default authentication method is server authentication.
 
-```scala
-val config = Config(Map(
-  "url" -> "mysqlserver.database.windows.net",
-  "databaseName" -> "MyDatabase",
-  "user" -> "[email protected]",
-  "password" -> "*********",
-  "authentication" -> "ActiveDirectoryPassword",
-  "trustServerCertificate" -> "true",
-  "encrypt" -> "true"
-))
-
-```
+## Requirements
+Officially supported versions:
 
+| Component | Versions Supported |
+| --------- | ------------------ |
+| Apache Spark | 2.0.2 or later |
+| Scala | 2.10 or later |
+| Microsoft JDBC Driver for SQL Server | 6.2 or later |
+| Microsoft SQL Server | SQL Server 2008 or later |
+| Azure SQL Databases | Supported |
 
 ## Download
-
 ### Download from Maven
+*TBD*
 
 ### Build this project
+Currently, the connector project uses Maven. To build the connector without dependencies, run:
+```sh
+mvn clean package
+```
+
+## Contributing & Feedback
+
+This project has adopted the [Microsoft Open Source Code of Conduct](https://opensource.microsoft.com/codeofconduct/). For more information, see the [Code of Conduct FAQ](https://opensource.microsoft.com/codeofconduct/faq/) or contact [[email protected]](mailto:[email protected]) with any additional questions or comments.
+
+To give feedback and/or report an issue, open a [GitHub Issue](https://help.github.com/articles/creating-an-issue/).
+
+*Apache®, Apache Spark, and Spark® are either registered trademarks or trademarks of the Apache Software Foundation in the United States and/or other countries.*
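The Requirements table added in this hunk states its floors as "X or later". A small plain-Scala sketch of a numeric dotted-version comparison against such a floor (illustrative only; `versionAtLeast` is a hypothetical helper, not something the connector ships):

```scala
// Compare dotted version strings numerically, e.g. against the
// "Apache Spark 2.0.2 or later" floor from the table above.
def versionAtLeast(version: String, minimum: String): Boolean = {
  val v = version.split('.').map(_.toInt)
  val m = minimum.split('.').map(_.toInt)
  v.zipAll(m, 0, 0)                    // pad shorter version with zeros
    .find { case (a, b) => a != b }    // first differing component
    .forall { case (a, b) => a > b }   // none differ => versions equal => OK
}

println(versionAtLeast("2.3.1", "2.0.2")) // true
println(versionAtLeast("1.6.3", "2.0.2")) // false
```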
