
Commit fa7a513

Bruce Irschick and birschick-bq authored
[AD-1014] Developer Guide. (#451)
* [AD-1014] Developer Guide.
* Commit Code Coverage Badge
* [AD-1014] Updates to use existing GETTING_STARTED.md and added schema-caching.md
* Commit Code Coverage Badge

Co-authored-by: birschick-bq <[email protected]>
1 parent b1ee65c commit fa7a513

4 files changed, +202 -24 lines changed


GETTING_STARTED.md

Lines changed: 79 additions & 23 deletions
@@ -156,8 +156,45 @@ rather than the cluster endpoint since we have set up the SSH tunnel.
156156
~~~
157157
mongo --host 127.0.0.1:27017 --username <master-username> --password <master-password>
158158
~~~
159+
160+
## Database User Account Definitions
161+
162+
The integration tests assume the following two user accounts are created
163+
in the target database server.
164+
165+
### Administrative User
166+
167+
User: `documentdb`
168+
169+
#### Definition
170+
171+
```json
172+
{
173+
"user" : "documentdb",
174+
"roles" : [ {
175+
"db" : "admin",
176+
"role" : "root"
177+
} ]
178+
}
179+
```
180+
181+
### Restricted Access User
182+
183+
User: `docDbRestricted`
184+
185+
#### Definition
186+
187+
```json
188+
{
189+
"user" : "docDbRestricted",
190+
"roles" : [ {
191+
"db" : "admin",
192+
"role" : "readAnyDatabase"
193+
} ]
194+
}
195+
```
159196
160-
##### Connect with TLS
197+
## Connect with TLS
161198
When connecting to a TLS-enabled cluster you can follow the same steps to set up an SSH tunnel but will need to also
162199
download the Amazon DocumentDB Certificate Authority (CA) file before trying to connect.
163200
1. Download the CA file.
@@ -178,8 +215,8 @@ access the cluster from localhost, the server certificate does not match the hos
178215
mongo --host 127.0.0.1:27017 --username <master-username> --password <master-password> --tls --tlsCAFile rds-combined-ca-bundle.pem --tlsAllowInvalidHostnames
179216
~~~
180217
181-
##### Connect Programmatically
182-
###### Without TLS
218+
### Connect Programmatically
219+
#### Without TLS
183220
Connecting without TLS is very straightforward. We essentially follow the same steps as when connecting using the
184221
`mongo` shell.
185222
1. Setup the SSH tunnel. See step 3 in section [Setting Up Environment Variables](#setting-up-environment-variables) for
@@ -201,7 +238,7 @@ Make sure to set the hostname, username, password and target database. The targe
201238
}
202239
~~~
203240

204-
###### With TLS
241+
#### With TLS
205242
Connecting with TLS programmatically is slightly different from how we did it with the `mongo` shell.
206243
1. Create a test or simple main to run.
207244
2. Use either the Driver Manager, Data Source class or Connection class to establish a connection to `localhost:27017`.
@@ -224,36 +261,57 @@ class:
224261
}
225262
~~~
226263

227-
#### Setting Up Environment Variables
228-
1. Create and set the Environment Variables:
264+
## Integration Testing
229265

230-
~~~
231-
DOC_DB_USER_NAME=<secret-username>
232-
DOC_DB_PASSWORD=<secret-password>
233-
DOC_DB_LOCAL_PORT=27019
234-
DOC_DB_USER=<ec2-username>@<public-IPv4-DNS-name>
235-
DOC_DB_HOST=<cluster-endpoint>
236-
DOC_DB_PRIV_KEY_FILE=~/.ssh/<key-pair-name>.pem
237-
~~~
266+
By default, integration testing is disabled for local development. To enable
267+
integration testing, follow the directions below.
268+
269+
### Setting Up Environment Variables
238270

239-
2. Ensure the private key file <key pair name>.pem is in the location set by the environment variable
271+
The following environment variables let you customize the credentials
272+
and DocumentDB cluster settings used for integration testing.
273+
274+
1. Create and set the following environment variables:
275+
276+
| Variable | Description | Example |
277+
|------------------------|--------------------------------------------------------------------------------------------------------------------------|-----------------------------------------------------------------------------|
278+
| `DOC_DB_USER_NAME` | This is the DocumentDB user. | `documentdb` |
279+
| `DOC_DB_PASSWORD` | This is the DocumentDB password. | `aSecret` |
280+
| `DOC_DB_LOCAL_PORT`    | This is the port number used locally via an SSH Tunnel. It is recommended to use a value other than the default 27017.   | `27019`                                                                     |
281+
| `DOC_DB_USER` | This is the user and host of SSH Tunnel EC2 instance. | `[email protected]` |
282+
| `DOC_DB_HOST` | This is the host of the DocumentDB cluster server. | `docdb-jdbc-literal-test.cluster-abcdefghijk.us-east-2.docdb.amazonaws.com` |
283+
| `DOC_DB_PRIV_KEY_FILE` | This is the path to the SSH Tunnel private key-pair file. | `~/.ssh/ec2-literal.pem` |
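
Before starting the tunnel or the tests, it can help to confirm that all six variables from the table are set. The following is a minimal sketch and not part of the project; the function name `check_docdb_env` is hypothetical.

```shell
# Report every required variable that is unset or empty.
# Returns 0 when all six are set, 1 otherwise.
check_docdb_env() {
    missing=0
    for name in DOC_DB_USER_NAME DOC_DB_PASSWORD DOC_DB_LOCAL_PORT \
                DOC_DB_USER DOC_DB_HOST DOC_DB_PRIV_KEY_FILE; do
        # Indirect variable lookup that also works in plain POSIX sh.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "missing: $name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

Calling `check_docdb_env && echo "environment looks complete"` prints the confirmation only when nothing is missing; otherwise the missing names are listed on standard error.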
284+
285+
### SSH Tunnel
286+
287+
1. Ensure the private key file <key pair name>.pem is in the location set by the environment variable
240288
`DOC_DB_PRIV_KEY_FILE`.
241-
3. Start an SSH port-forwarding tunnel:
289+
2. With the environment variables above set, start an SSH tunnel from the command line like this:
242290

291+
~~~shell
292+
ssh [-f] -N -i $DOC_DB_PRIV_KEY_FILE -L $DOC_DB_LOCAL_PORT:$DOC_DB_HOST:27017 $DOC_DB_USER
243293
~~~
244-
ssh [-f] -N -i ~/.ssh/<key-pair-name>.pem -L $DOC_DB_LOCAL_PORT:$DOC_DB_HOST:27017 $DOC_DB_USER
245-
~~~
246-
294+
247295
- The `-L` flag defines the port forwarded to the remote host and remote port. The `-N` flag tells SSH not to
248296
execute a remote command, so you will not get a shell. The `-f` switch instructs SSH to run in the
249297
background.
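
To see how the variables expand before opening a tunnel, the command can be printed rather than executed. This is a sketch only; `print_tunnel_cmd` is a hypothetical helper name, and the optional `-f` switch is left out so the printed command runs in the foreground.

```shell
# Print the tunnel command with the current variable values filled in.
# 27017 is the remote DocumentDB port used in the command above.
print_tunnel_cmd() {
    printf 'ssh -N -i %s -L %s:%s:27017 %s\n' \
        "$DOC_DB_PRIV_KEY_FILE" "$DOC_DB_LOCAL_PORT" "$DOC_DB_HOST" "$DOC_DB_USER"
}
```

An empty field in the printed line points to a variable that has not been set.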
250298

251-
#### Bypass Testing DocumentDB
299+
### Enable Integration Testing of Amazon DocumentDB
300+
301+
To enable integration testing in the IDE, update the Gradle property as instructed below.
302+
252303
1. Modify the */gradle.properties* file in the source code and uncomment the following line:
253-
`runRemoteIntegrationTests=false`
304+
`runRemoteIntegrationTests=true`
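
A quick way to confirm the property is active is to search the file for an uncommented line. The sketch below assumes the file path is passed as the first argument; `remote_tests_enabled` is a hypothetical helper name.

```shell
# Succeeds only if the file contains an uncommented
# runRemoteIntegrationTests=true line (surrounding whitespace is allowed).
remote_tests_enabled() {
    grep -Eq '^[[:space:]]*runRemoteIntegrationTests[[:space:]]*=[[:space:]]*true[[:space:]]*$' "$1"
}
```

For example, `remote_tests_enabled gradle.properties && echo enabled` prints `enabled` only after the line has been uncommented and set to `true`.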
305+
306+
### Project Secrets
307+
308+
For the purposes of automated integration testing in **GitHub**, this project maintains the values of the environment variables above
309+
as project secrets. See the workflow file [gradle.yml](https://github.com/aws/amazon-documentdb-jdbc-driver/blob/1edd9e21fdcccfe62d366580702f2904136298e5/.github/workflows/gradle.yml).
254310

255311
## Troubleshooting
312+
256313
### Issues with JDK
314+
257315
1. Confirm the project SDK is Java version 1.8 via the IntelliJ top menu toolbar under
258316
*File → Project Structure → Platform Settings → SDK*, then reload the JDK home path by browsing to the path and clicking
259317
*Apply* and *OK*. Restart IntelliJ IDEA.
@@ -277,5 +335,3 @@ class:
277335
below. Go to EC2 Dashboard → **Network & Security** Group in the left menu → **Security** Group.
278336

279337
![Security Policy for EC2 Instance](src/markdown/images/getting-started/security-policy-ec2-instance.png)
280-
281-

README.md

Lines changed: 6 additions & 1 deletion
@@ -67,4 +67,9 @@ your issue.
6767

6868
## Security Notice
6969

70-
If you discover a potential security issue in this project, please consult our [security guidance page](SECURITY.md).
70+
If you discover a potential security issue in this project, please consult our [security guidance page](SECURITY.md).
71+
72+
## Contributor's Getting Started Guide
73+
74+
If you're a developer and want to contribute to this project, be sure to read and follow the
75+
[Getting Started as a Developer](GETTING_STARTED.md) guide.

src/markdown/index.md

Lines changed: 6 additions & 0 deletions
@@ -51,6 +51,12 @@ The Amazon DocumentDB JDBC driver can perform automatic schema discovery and gen
5151
DocumentDB schema mapping. See the [schema discovery documentation](schema/schema-discovery.md)
5252
for more details of this process.
5353

54+
## Schema Caching
55+
56+
Once the schema is discovered, it is cached in the database to improve performance for subsequent access.
57+
See the [schema caching documentation](schema/schema-caching.md) to learn
58+
more about schema caching behaviour and access requirements.
59+
5460
## Schema Management
5561

5662
The SQL to DocumentDB schema mapping can be managed in the following ways:

src/markdown/schema/schema-caching.md

Lines changed: 111 additions & 0 deletions
@@ -0,0 +1,111 @@
1+
# Schema Caching
2+
3+
## Schema Caching Behaviour
4+
5+
When a connection is made to an Amazon DocumentDB database, the Amazon DocumentDB JDBC driver
6+
checks for a previously cached version of the mapped schema. If a previous version exists,
7+
the latest version of the cached schema is read and used for all further interaction with the database.
8+
9+
If a previously cached version does not exist, the process of [schema discovery](schema-discovery.md) is automatically
10+
started on all the accessible collections in the database. The discovery process uses the properties
11+
`scanMethod` (default `random`) and `scanLimit` (default `1000`) when sampling documents from the database.
12+
At the end of the discovery process, the resulting schema mapping is written to the cache using the name
13+
associated with the property `schemaName` (default `_default`).
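
The three properties and their defaults can be made explicit in a connection URL. The sketch below assumes the driver accepts a URL of the form `jdbc:documentdb://<host>:<port>/<database>?<option>=<value>&...`; that format is not stated in this document, so check it against the driver's connection string documentation. The function name and the `SCAN_METHOD`, `SCAN_LIMIT`, and `SCHEMA_NAME` variables are hypothetical.

```shell
# Build a connection URL that spells out the discovery properties.
# Arguments: <host:port> <database>
# Unset override variables fall back to the documented defaults.
docdb_jdbc_url() {
    scan_method="${SCAN_METHOD:-random}"
    scan_limit="${SCAN_LIMIT:-1000}"
    schema_name="${SCHEMA_NAME:-_default}"
    printf 'jdbc:documentdb://%s/%s?scanMethod=%s&scanLimit=%s&schemaName=%s\n' \
        "$1" "$2" "$scan_method" "$scan_limit" "$schema_name"
}
```

Connecting with a different `schemaName` would read or write a differently named entry in the cache, so the value should stay the same across connections that are meant to share a cached schema.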
14+
15+
If for some reason the resulting schema cannot be saved to the cache, it will still be used
16+
in-memory for the life of the connection. Without access to a cached version of the
17+
schema, schema discovery has to be performed for each connection, which could have a seriously
18+
negative impact on performance.
19+
20+
## Cache Location
21+
22+
The SQL schema mapping cache is stored in two collections on the same database as
23+
the sampled collections. The collection `_sqlSchemas` stores the names and versions of
24+
all the sampled schemas for the given database. The collection `_sqlTableSchemas` stores the
25+
column to field mappings for all the cached SQL schema mappings. The two cache collections
26+
have a strong parent/child relationship and must be maintained in a consistent way. Always use
27+
the [schema management CLI](manage-schema-cli.md) to ensure consistency in the cache collections.
28+
29+
## User Permissions for Creating and Updating the Schema Cache
30+
31+
To store or update the SQL schema mappings in the cache collections, the connected
32+
Amazon DocumentDB user account must have write permission to create and update the
33+
cache collections. Once the schema is cached, users need only read permission on the
34+
cache collections.
35+
36+
To allow access for an Amazon DocumentDB user, set or add the appropriate roles as
37+
described below.
38+
39+
### Enable Access per Database
40+
41+
To allow read and write access to specific databases on your server, add
42+
the `readWrite` [built-in role](https://www.mongodb.com/docs/manual/reference/built-in-roles/#mongodb-authrole-readWrite)
43+
for each database in which the user should be able to create and update
44+
the cached schema.
45+
46+
```json
47+
roles: [
48+
{role: "readWrite", db: "yourDatabase1"},
49+
{role: "readWrite", db: "yourDatabase2"} ...
50+
]
51+
```
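
When many databases are involved, the roles list can be generated rather than typed by hand. This is an illustrative sketch; `roles_for` is a hypothetical helper, and it prints the same shell-style (non-strict JSON) notation used in the fragment above.

```shell
# Print a roles array that grants one role on each listed database.
# Usage: roles_for <role> <db> [<db> ...]
roles_for() {
    role="$1"
    shift
    out=""
    for db in "$@"; do
        # Prefix a comma separator for every entry after the first.
        out="${out:+$out, }{role: \"$role\", db: \"$db\"}"
    done
    echo "[$out]"
}
```

The output could then be pasted into a user-management command in the `mongo` shell, such as `db.grantRolesToUser`, provided that command is supported by your cluster. The same helper covers the read-only case by passing `read` as the role.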
52+
53+
### Enable Access for Any Database
54+
55+
To allow read and write access to any database on your server, add
56+
the `readWriteAnyDatabase` [built-in role](https://www.mongodb.com/docs/manual/reference/built-in-roles/#mongodb-authrole-readWriteAnyDatabase)
57+
on the `admin` database. The user can then create and update the cached schema in any database.
58+
59+
```json
60+
roles: [
61+
{role: "readWriteAnyDatabase", db: "admin"}
62+
]
63+
```
64+
65+
### Collection-Level Access Control
66+
67+
If [collection-level access control](https://www.mongodb.com/docs/manual/core/collection-level-access-control/)
68+
is implemented, ensure that the `find`, `insert`, and `update` actions are
69+
allowed on the cache collections (`_sqlSchemas` and `_sqlTableSchemas`).
70+
71+
## User Permissions for Reading an Existing Schema Cache
72+
73+
To read the SQL schema mappings from the cache collections, the connected
74+
Amazon DocumentDB user account must have read permission on the
75+
cache collections.
76+
77+
To allow access for an Amazon DocumentDB user, set or add the appropriate roles as
78+
described below.
79+
80+
### Enable Access per Database
81+
82+
To allow read access to specific databases on your server, add
83+
the `read` [built-in role](https://www.mongodb.com/docs/manual/reference/built-in-roles/#mongodb-authrole-read)
84+
for each database in which the user should be able to read
85+
the cached schema.
86+
87+
```json
88+
roles: [
89+
{role: "read", db: "yourDatabase1"},
90+
{role: "read", db: "yourDatabase2"} ...
91+
]
92+
```
93+
94+
### Enable Access for Any Database
95+
96+
To allow read access to any database on your server, add
97+
the `readAnyDatabase` [built-in role](https://www.mongodb.com/docs/manual/reference/built-in-roles/#mongodb-authrole-readAnyDatabase)
98+
on the `admin` database. The user can then read the cached schema in any database.
99+
100+
```json
101+
roles: [
102+
{role: "readAnyDatabase", db: "admin"}
103+
]
104+
```
105+
106+
### Collection-Level Access Control
107+
108+
If [collection-level access control](https://www.mongodb.com/docs/manual/core/collection-level-access-control/)
109+
is implemented, ensure that the `find` action is
110+
allowed on the cache collections (`_sqlSchemas` and `_sqlTableSchemas`).
111+
