Commit 14d6460

Author: Ace Nassri
Add DLP samples (BigQuery, DeID, RiskAnalysis) (GoogleCloudPlatform#474)
* Add BigQuery samples + a few minor tweaks
* Update comments + fix failing test
* Sync w/codegen changes
* Add DeID samples
* Add DeID tests + remove infoTypes from DeID samples
* Remove unused option
* Add risk analysis samples
* Update README
* Add region tags + fix comment
1 parent 606a9d3 commit 14d6460

13 files changed, +908 -34 lines changed

dlp/.gitignore

+1
@@ -0,0 +1 @@
+**/*.result.png

dlp/README.md

+65-2
@@ -13,6 +13,8 @@ The [Data Loss Prevention API](https://cloud.google.com/dlp/docs/) provides prog
 * [Inspect](#inspect)
 * [Redact](#redact)
 * [Metadata](#metadata)
+* [DeID](#deid)
+* [Risk Analysis](#risk-analysis)
 * [Running the tests](#running-the-tests)

 ## Setup
@@ -47,6 +49,7 @@ Commands:
   Prevention API and the promise pattern.
   gcsFileEvent <bucketName> <fileName> Inspects a text file stored on Google Cloud Storage using the Data Loss
   Prevention API and the event-handler pattern.
+  bigquery <datasetName> <tableName> Inspects a BigQuery table using the Data Loss Prevention API.
   datastore <kind> Inspect a Datastore instance using the Data Loss Prevention API.

 Options:
@@ -56,14 +59,15 @@ Options:
   [default: "LIKELIHOOD_UNSPECIFIED"]
   -f, --maxFindings [number] [default: 0]
   -q, --includeQuote [boolean] [default: true]
-  -l, --languageCode [string] [default: "en-US"]
-  -t, --infoTypes [array] [default: []]
+  -t, --infoTypes [array] [default: ["PHONE_NUMBER","EMAIL_ADDRESS","CREDIT_CARD_NUMBER"]]

 Examples:
   node inspect.js string "My phone number is (123) 456-7890 and my email address is [email protected]"
   node inspect.js file resources/test.txt
   node inspect.js gcsFilePromise my-bucket my-file.txt
   node inspect.js gcsFileEvent my-bucket my-file.txt
+  node inspect.js bigquery my-dataset my-table
+  node inspect.js datastore my-datastore-kind

 For more information, see https://cloud.google.com/dlp/docs. Optional flags are explained at
 https://cloud.google.com/dlp/docs/reference/rest/v2beta1/content/inspect#InspectConfig
@@ -81,6 +85,7 @@ __Usage:__ `node redact.js --help`
 ```
 Commands:
   string <string> <replaceString> Redact sensitive data from a string using the Data Loss Prevention API.
+  image <filepath> <outputPath> Redact sensitive data from an image using the Data Loss Prevention API.

 Options:
   --help Show help [boolean]
@@ -91,6 +96,7 @@ Options:

 Examples:
   node redact.js string "My name is Gary" "REDACTED" -t US_MALE_NAME
+  node redact.js image resources/test.png redaction_result.png -t US_MALE_NAME

 For more information, see https://cloud.google.com/dlp/docs. Optional flags are explained at
 https://cloud.google.com/dlp/docs/reference/rest/v2beta1/content/inspect#InspectConfig
@@ -124,6 +130,63 @@ For more information, see https://cloud.google.com/dlp/docs
 [metadata_2_docs]: https://cloud.google.com/dlp/docs
 [metadata_2_code]: metadata.js

+### DeID
+
+View the [documentation][deid_3_docs] or the [source code][deid_3_code].
+
+__Usage:__ `node deid.js --help`
+
+```
+Commands:
+  mask <string> Deidentify sensitive data by masking it with a character.
+  fpe <string> <wrappedKey> <keyName> Deidentify sensitive data using Format Preserving Encryption (FPE).
+
+Options:
+  --help Show help [boolean]
+
+Examples:
+  node deid.js mask "My SSN is 372819127"
+  node deid.js fpe "My SSN is 372819127" <YOUR_ENCRYPTED_AES_256_KEY> <YOUR_KEY_NAME>
+
+For more information, see https://cloud.google.com/dlp/docs.
+```
+
+[deid_3_docs]: https://cloud.google.com/dlp/docs
+[deid_3_code]: deid.js
+
+### Risk Analysis
+
+View the [documentation][risk_4_docs] or the [source code][risk_4_code].
+
+__Usage:__ `node risk.js --help`
+
+```
+Commands:
+  numerical <datasetId> <tableId> <columnName> Computes risk metrics of a column of numbers in a Google
+  BigQuery table.
+  categorical <datasetId> <tableId> <columnName> Computes risk metrics of a column of data in a Google
+  BigQuery table.
+  kAnonymity <datasetId> <tableId> [quasiIdColumnNames..] Computes the k-anonymity of a column set in a Google
+  BigQuery table.
+  lDiversity <datasetId> <tableId> <sensitiveAttribute> Computes the l-diversity of a column set in a Google
+  [quasiIdColumnNames..] BigQuery table.
+
+Options:
+  --help Show help [boolean]
+  -p, --projectId [string] [default: "nodejs-docs-samples"]
+
+Examples:
+  node risk.js numerical nhtsa_traffic_fatalities accident_2015 state_number -p bigquery-public-data
+  node risk.js categorical nhtsa_traffic_fatalities accident_2015 state_name -p bigquery-public-data
+  node risk.js kAnonymity nhtsa_traffic_fatalities accident_2015 state_number county -p bigquery-public-data
+  node risk.js lDiversity nhtsa_traffic_fatalities accident_2015 city state_number county -p bigquery-public-data
+
+For more information, see https://cloud.google.com/dlp/docs.
+```
+
+[risk_4_docs]: https://cloud.google.com/dlp/docs
+[risk_4_code]: risk.js
+
 ## Running the tests

 1. Set the **GCLOUD_PROJECT** and **GOOGLE_APPLICATION_CREDENTIALS** environment variables.
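
The `kAnonymity` command added to the README reports the k-anonymity of a set of quasi-identifier columns in a BigQuery table. As a rough local illustration of that metric (not the DLP API's implementation, and not part of this commit), the sketch below groups in-memory rows by their quasi-identifier values and takes the size of the smallest group as k. The helper name `kAnonymity` and the sample rows are hypothetical.

```javascript
'use strict';

// Hypothetical helper (an assumption for illustration, not from the commit):
// computes k-anonymity for in-memory rows.
// rows: array of plain objects; quasiIds: column names to group by.
function kAnonymity (rows, quasiIds) {
  const counts = new Map();
  for (const row of rows) {
    // Serialize the quasi-identifier values into a grouping key
    const key = JSON.stringify(quasiIds.map((name) => row[name]));
    counts.set(key, (counts.get(key) || 0) + 1);
  }
  // k is the size of the smallest equivalence class (0 when there are no rows)
  return counts.size === 0 ? 0 : Math.min(...counts.values());
}

// Made-up sample data, for illustration only
const rows = [
  { state_number: 1, county: 10 },
  { state_number: 1, county: 10 },
  { state_number: 2, county: 20 }
];
console.log(kAnonymity(rows, ['state_number', 'county']));
```

Here the third row is the only one with its combination of values, so the smallest group has size 1 and that row is uniquely identifiable by those two columns.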

dlp/deid.js

+163
@@ -0,0 +1,163 @@
+/**
+ * Copyright 2017, Google, Inc.
+ * Licensed under the Apache License, Version 2.0 (the "License");
+ * you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ *    http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS,
+ * WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
+ * See the License for the specific language governing permissions and
+ * limitations under the License.
+ */
+
+'use strict';
+
+function deidentifyWithMask (string, maskingCharacter, numberToMask) {
+  // [START deidentify_masking]
+  // Imports the Google Cloud Data Loss Prevention library
+  const DLP = require('@google-cloud/dlp');
+
+  // Instantiates a client
+  const dlp = new DLP.DlpServiceClient();
+
+  // The string to deidentify
+  // const string = 'My SSN is 372819127';
+
+  // (Optional) The maximum number of sensitive characters to mask in a match
+  // If omitted from the request or set to 0, the API will mask any matching characters
+  // const numberToMask = 5;
+
+  // (Optional) The character to mask matching sensitive data with
+  // const maskingCharacter = 'x';
+
+  // Construct deidentification request
+  const items = [{ type: 'text/plain', value: string }];
+  const request = {
+    deidentifyConfig: {
+      infoTypeTransformations: {
+        transformations: [{
+          primitiveTransformation: {
+            characterMaskConfig: {
+              maskingCharacter: maskingCharacter,
+              numberToMask: numberToMask
+            }
+          }
+        }]
+      }
+    },
+    items: items
+  };
+
+  // Run deidentification request
+  dlp.deidentifyContent(request)
+    .then((response) => {
+      const deidentifiedItems = response[0].items;
+      console.log(deidentifiedItems[0].value);
+    })
+    .catch((err) => {
+      console.log(`Error in deidentifyWithMask: ${err.message || err}`);
+    });
+  // [END deidentify_masking]
+}
+
+function deidentifyWithFpe (string, alphabet, keyName, wrappedKey) {
+  // [START deidentify_fpe]
+  // Imports the Google Cloud Data Loss Prevention library
+  const DLP = require('@google-cloud/dlp');
+
+  // Instantiates a client
+  const dlp = new DLP.DlpServiceClient();
+
+  // The string to deidentify
+  // const string = 'My SSN is 372819127';
+
+  // The set of characters to replace sensitive ones with
+  // For more information, see https://cloud.google.com/dlp/docs/reference/rest/v2beta1/content/deidentify#FfxCommonNativeAlphabet
+  // const alphabet = 'ALPHA_NUMERIC';
+
+  // The name of the Cloud KMS key used to encrypt ('wrap') the AES-256 key
+  // const keyName = 'projects/YOUR_GCLOUD_PROJECT/locations/YOUR_LOCATION/keyRings/YOUR_KEYRING_NAME/cryptoKeys/YOUR_KEY_NAME';
+
+  // The encrypted ('wrapped') AES-256 key to use
+  // This key should be encrypted using the Cloud KMS key specified above
+  // const wrappedKey = 'YOUR_ENCRYPTED_AES_256_KEY'
+
+  // Construct deidentification request
+  const items = [{ type: 'text/plain', value: string }];
+  const request = {
+    deidentifyConfig: {
+      infoTypeTransformations: {
+        transformations: [{
+          primitiveTransformation: {
+            cryptoReplaceFfxFpeConfig: {
+              cryptoKey: {
+                kmsWrapped: {
+                  wrappedKey: wrappedKey,
+                  cryptoKeyName: keyName
+                }
+              },
+              commonAlphabet: alphabet
+            }
+          }
+        }]
+      }
+    },
+    items: items
+  };
+
+  // Run deidentification request
+  dlp.deidentifyContent(request)
+    .then((response) => {
+      const deidentifiedItems = response[0].items;
+      console.log(deidentifiedItems[0].value);
+    })
+    .catch((err) => {
+      console.log(`Error in deidentifyWithFpe: ${err.message || err}`);
+    });
+  // [END deidentify_fpe]
+}
+
+const cli = require(`yargs`)
+  .demand(1)
+  .command(
+    `mask <string>`,
+    `Deidentify sensitive data by masking it with a character.`,
+    {
+      maskingCharacter: {
+        type: 'string',
+        alias: 'c',
+        default: ''
+      },
+      numberToMask: {
+        type: 'number',
+        alias: 'n',
+        default: 0
+      }
+    },
+    (opts) => deidentifyWithMask(opts.string, opts.maskingCharacter, opts.numberToMask)
+  )
+  .command(
+    `fpe <string> <wrappedKey> <keyName>`,
+    `Deidentify sensitive data using Format Preserving Encryption (FPE).`,
+    {
+      alphabet: {
+        type: 'string',
+        alias: 'a',
+        default: 'ALPHA_NUMERIC',
+        choices: ['NUMERIC', 'HEXADECIMAL', 'UPPER_CASE_ALPHA_NUMERIC', 'ALPHA_NUMERIC']
+      }
+    },
+    (opts) => deidentifyWithFpe(opts.string, opts.alphabet, opts.keyName, opts.wrappedKey)
+  )
+  .example(`node $0 mask "My SSN is 372819127"`)
+  .example(`node $0 fpe "My SSN is 372819127" <YOUR_ENCRYPTED_AES_256_KEY> <YOUR_KEY_NAME>`)
+  .wrap(120)
+  .recommendCommands()
+  .epilogue(`For more information, see https://cloud.google.com/dlp/docs.`);
+
+if (module === require.main) {
+  cli.help().strict().argv; // eslint-disable-line
+}
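
The comments in `deidentifyWithMask` describe two knobs: a masking character and a maximum number of characters to mask, where 0 means no limit. The sketch below is a purely local approximation of that behavior for a single value that has already been identified as sensitive. It is not part of the commit and does not call the DLP API; the helper name `maskValue`, the left-to-right masking order, and the fallback to `'*'` when no character is given are all assumptions made for illustration.

```javascript
'use strict';

// Hypothetical local helper (not part of @google-cloud/dlp).
// Masks up to numberToMask leading characters of an already-detected value.
function maskValue (value, maskingCharacter, numberToMask) {
  // Assumption: fall back to '*' when no masking character is supplied
  const maskChar = maskingCharacter || '*';
  // Treat 0 (or a missing value) as "mask every character"
  const count = numberToMask > 0 ? Math.min(numberToMask, value.length) : value.length;
  return maskChar.repeat(count) + value.slice(count);
}

console.log(maskValue('372819127', 'x', 5)); // prints "xxxxx9127"
console.log(maskValue('372819127', 'x', 0)); // prints "xxxxxxxxx"
```

Unlike this helper, the API first detects which substrings are sensitive and only then applies the transformation to each match.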

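
The `fpe` command expects two inputs that the sample leaves as placeholders: a Cloud KMS key name in the `projects/.../locations/.../keyRings/.../cryptoKeys/...` form shown in the code comment, and an AES-256 key wrapped with that KMS key. The sketch below shows one way to assemble the key name and to generate a raw 256-bit key locally using Node's built-in `crypto` module. The helper names `buildKeyName` and `generateRawAesKey` and the sample arguments are hypothetical, and the wrapping step itself (encrypting the raw key with Cloud KMS) is deliberately not shown.

```javascript
'use strict';

const crypto = require('crypto');

// Hypothetical helper: assembles a Cloud KMS key resource name in the
// format given in the deidentifyWithFpe comments.
function buildKeyName (projectId, location, keyRing, key) {
  return `projects/${projectId}/locations/${location}/keyRings/${keyRing}/cryptoKeys/${key}`;
}

// Hypothetical helper: generates a random 32-byte (256-bit) key, base64-encoded.
// This is the UNWRAPPED key; it still has to be encrypted with the KMS key
// before it can be passed to the sample as <wrappedKey>.
function generateRawAesKey () {
  return crypto.randomBytes(32).toString('base64');
}

console.log(buildKeyName('my-project', 'global', 'my-keyring', 'my-key'));
console.log(Buffer.from(generateRawAesKey(), 'base64').length); // prints 32
```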