Commit 0f217e9

#5 Add warning (#6)
1 parent 306db32 commit 0f217e9

2 files changed

+77
-34
lines changed


.gitignore

Lines changed: 1 addition & 0 deletions
@@ -127,3 +127,4 @@ dmypy.json
127127

128128
# Pyre type checker
129129
.pyre/
130+
.history

README.md

Lines changed: 76 additions & 34 deletions
@@ -1,27 +1,34 @@
11
# code-snippets-v4
22

3+
## :warning: Warning
4+
5+
This repository is specifically for Senzing SDK V4.
6+
It is not designed to work with Senzing API V3.
7+
8+
To find the Senzing API V3 version of this repository, visit [code-snippets-v3].
9+
310
## Overview
411

512
Succinct examples of how you might use the Senzing APIs for operational tasks.
613

714
## Contents
815

9-
1. [Legend](#legend)
10-
1. [Warning](#warning)
11-
1. [Senzing Engine Configuration](#senzing-engine-configuration)
12-
1. [Senzing APIs Bare Metal Usage](#senzing-apis-bare-metal-usage)
13-
1. [Configuration](#configuration)
14-
2. [Usage](#usage)
15-
1. [Docker Usage](#docker-usage)
16-
1. [Configuration](#configuration-1)
17-
2. [Usage](#usage-1)
18-
1. [Items of Note](#items-of-note)
19-
1. [With Info](#with-info)
20-
2. [Parallel Processing](#parallel-processing)
21-
3. [Scalability](#scalability)
22-
4. [Randomize Input Files](#randomize-input-files)
23-
5. [Purging Senzing Repository Between Examples](#purging-senzing-repository-between-examples)
24-
6. [Input Load File Sizes](#input-load-file-sizes)
16+
1. [Legend]
17+
1. [Warning]
18+
1. [Senzing Engine Configuration]
19+
1. [Senzing APIs Bare Metal Usage]
20+
1. [Configuration]
21+
2. [Usage]
22+
1. [Docker Usage]
23+
1. [Configuration]
24+
2. [Usage]
25+
1. [Items of Note]
26+
1. [With Info]
27+
2. [Parallel Processing]
28+
3. [Scalability]
29+
4. [Randomize Input Data]
30+
5. [Purging Senzing Repository Between Examples]
31+
6. [Input Load File Sizes]
2532

2633
### Legend
2734

@@ -33,7 +40,7 @@ Succinct examples of how you might use the Senzing APIs for operational tasks.
3340

3441
## Warning
3542

36-
:warning::warning::warning: **Only run the code snippets against a test Senzing database instance.** Running the snippets adds and deletes data, and some snippets purge the entire database of currently ingested data. It is recommended to create a separate test Senzing project if you are using a bare metal Senzing install, or if using Docker a separate Senzing database to use only with the snippets. If you are getting started and are unsure please contact [Senzing Support](https://senzing.zendesk.com/hc/en-us/requests/new). :warning::warning::warning:
43+
:warning::warning::warning: **Only run the code snippets against a test Senzing database instance.** Running the snippets adds and deletes data, and some snippets purge the entire database of currently ingested data. It is recommended to create a separate test Senzing project if you are using a bare metal Senzing install, or if using Docker a separate Senzing database to use only with the snippets. If you are getting started and are unsure please contact [Senzing Support]. :warning::warning::warning:
3744

3845
## Senzing Engine Configuration
3946

@@ -56,27 +63,32 @@ The JSON configuration string is set via the environment variable `SENZING_ENGIN
5663

5764
## Senzing APIs Bare Metal Usage
5865

59-
You may already have installed the Senzing APIs and created a Senzing project by following the [Quickstart Guide](https://senzing.zendesk.com/hc/en-us/articles/115002408867-Quickstart-Guide). If not, and you would like to install the Senzing APIs directly on a machine, follow the steps in the[ Quickstart Guide](https://senzing.zendesk.com/hc/en-us/articles/115002408867-Quickstart-Guide). Be sure to review the API [Quickstart Roadmap](https://senzing.zendesk.com/hc/en-us/articles/115001579954-API-Quickstart-Roadmap), especially the [System Requirements](https://senzing.zendesk.com/hc/en-us/articles/115010259947).
66+
You may already have installed the Senzing APIs and created a Senzing project by following the [Quickstart Guide]. If not, and you would like to install the Senzing APIs directly on a machine, follow the steps in the [Quickstart Guide]. Be sure to review the API [Quickstart Roadmap], especially the [System Requirements].
6067

6168
### Configuration
6269

6370
When using a bare metal install, the initialization parameters used by the Senzing Python utilities are maintained within `<project_path>/etc/G2Module.ini`.
6471

6572
🤔 To convert an existing Senzing project G2Module.ini file to a JSON string, use one of the following methods:
6673

67-
- [G2ModuleIniToJson.py](Python/Tasks/Initialization/)
74+
- [G2ModuleIniToJson.py]
6875

6976
- Modify the path to your project's G2Module.ini file.
7077

71-
- [jc](https://github.com/kellyjonbrazil/jc)
78+
- [jc]
79+
7280
- ```console
7381
cat <project_path>/etc/G2Module.ini | jc --ini
7482
```
83+
7584
- Python one liner
85+
7686
- ```python
7787
python3 -c $'import configparser, json; ini_file_name = "<project_path>/etc/G2Module.ini";engine_config_json = {};cfgp = configparser.ConfigParser();cfgp.optionxform = str;cfgp.read(ini_file_name)\nfor section in cfgp.sections(): engine_config_json[section] = dict(cfgp.items(section))\nprint(json.dumps(engine_config_json))'
7888
```
79-
- [SenzingGo.py](https://github.com/Senzing/senzinggo)
89+
90+
- [SenzingGo.py]
91+
8092
- ```console
8193
<project_path>/python/SenzingGo.py --iniToJson
8294
```
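
The INI-to-JSON conversion described above can also be written as a small, readable function. This is a minimal sketch using only the Python standard library, not the implementation used by `G2ModuleIniToJson.py` or `SenzingGo.py`; the function name `ini_to_engine_config` and the sample values are hypothetical.

```python
import configparser
import json


def ini_to_engine_config(ini_text: str) -> str:
    """Convert the text of a G2Module.ini file to a JSON string."""
    # interpolation=None keeps "%" characters (e.g. in passwords) literal.
    parser = configparser.ConfigParser(interpolation=None)
    # configparser lower-cases option names by default; keep them as written.
    parser.optionxform = str
    parser.read_string(ini_text)
    config = {section: dict(parser.items(section)) for section in parser.sections()}
    return json.dumps(config)


if __name__ == "__main__":
    # Hypothetical sample; substitute the contents of your own G2Module.ini.
    sample = (
        "[PIPELINE]\n"
        "CONFIGPATH=/path/to/project/etc\n"
        "[SQL]\n"
        "CONNECTION=postgresql://user:password@host:5432:g2\n"
    )
    print(ini_to_engine_config(sample))
```

To convert a real file, read it first, for example `ini_to_engine_config(open(path).read())`, where `path` points to your project's `G2Module.ini`.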
@@ -86,27 +98,27 @@ When using a bare metal install, the initialization parameters used by the Senzi
8698
### Usage
8799

88100
1. Clone this repository
89-
2. Export the engine configuration obtained for your project from [Configuration](#configuration), e.g.,
101+
1. Export the engine configuration obtained for your project from [Configuration], e.g.,
90102

91103
```console
92104
export SENZING_ENGINE_CONFIGURATION_JSON='{"PIPELINE": {"SUPPORTPATH": "<project_path>/data", "CONFIGPATH": "<project_path>/etc", "RESOURCEPATH": "<project_path>/resources"}, "SQL": {"CONNECTION": "postgresql://user:password@host:5432:g2"}}'
93105
```
94106

95-
3. Source the Senzing project setupEnv file
107+
1. Source the Senzing project setupEnv file
96108

97109
```console
98110
source <project_path>/setupEnv
99111
```
100112

101-
4. Run code snippets
113+
1. Run code snippets
102114

103115
:pencil2: `<project_path>` in the above examples should point to your project.
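
Before running a snippet, it can help to confirm that the exported variable is set and parses as JSON. The following is a minimal sketch, not part of the repository; the helper name `load_engine_config` is hypothetical, and the check for `PIPELINE` and `SQL` sections is an assumption based on the example configuration above.

```python
import json
import os


def load_engine_config(env=os.environ) -> dict:
    """Read and sanity-check SENZING_ENGINE_CONFIGURATION_JSON."""
    raw = env.get("SENZING_ENGINE_CONFIGURATION_JSON")
    if raw is None:
        raise RuntimeError("SENZING_ENGINE_CONFIGURATION_JSON is not set")
    # json.loads raises json.JSONDecodeError (a ValueError) on malformed input.
    config = json.loads(raw)
    # Assumption: both sections shown in the example are expected.
    missing = [name for name in ("PIPELINE", "SQL") if name not in config]
    if missing:
        raise ValueError(f"missing configuration sections: {missing}")
    return config


if __name__ == "__main__":
    try:
        print(sorted(load_engine_config()))
    except (RuntimeError, ValueError) as err:
        print(f"Configuration problem: {err}")
```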
104116

105117
## Docker Usage
106118

107-
The included Dockerfile leverages the [Senzing API runtime](https://github.com/Senzing/senzingapi-runtime) image to provide an environment to run the code snippets.
119+
The included Dockerfile leverages the [Senzing API runtime] image to provide an environment to run the code snippets.
108120

109-
### Configuration
121+
### Configuration for Docker usage
110122

111123
When used with a container, the JSON configuration is relative to the paths within the container. The JSON configuration should look like:
112124

@@ -125,23 +137,23 @@ When used with a container, the JSON configuration is relative to the paths with
125137

126138
✏️ You only need to modify the `CONNECTION` string to point to your Senzing database.
127139

128-
### Usage
140+
### Usage for Docker usage
129141

130142
1. Clone this repository
131-
2. Export the engine configuration environment variable
143+
1. Export the engine configuration environment variable
132144

133145
```console
134146
export SENZING_ENGINE_CONFIGURATION_JSON='{"PIPELINE": {"CONFIGPATH": "/etc/opt/senzing", "RESOURCEPATH": "/opt/senzing/g2/resources", "SUPPORTPATH": "/opt/senzing/data"}, "SQL": {"CONNECTION": "postgresql://user:password@host:5432:g2"}}'
135147
```
136148

137-
3. Build the Docker image
149+
1. Build the Docker image
138150

139151
```console
140152
cd <repository_dir>
141153
docker build --tag senzing/code-snippets-v4 .
142154
```
143155

144-
4. Run a container
156+
1. Run a container
145157

146158
```console
147159
docker run \
@@ -174,7 +186,7 @@ A feature of Senzing is the capability to pass changes from data manipulation AP
174186
}
175187
```
176188

177-
The AFFECTED_ENTITIES object contains a list of all entity IDs affected. Separate processes can query the affected entities and synchronize changes and information to downstream systems. For additional information see [Real-time replication and analytics](https://senzing.zendesk.com/hc/en-us/articles/4417768234131--Advanced-Real-time-replication-and-analytics).
189+
The AFFECTED_ENTITIES object contains a list of all entity IDs affected. Separate processes can query the affected entities and synchronize changes and information to downstream systems. For additional information see [Real-time replication and analytics].
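
A downstream process might extract the affected entity IDs along these lines. This is a minimal sketch; it assumes each element of `AFFECTED_ENTITIES` is an object with an `ENTITY_ID` key, and the helper name `affected_entity_ids` is hypothetical.

```python
import json
from typing import List


def affected_entity_ids(with_info_json: str) -> List[int]:
    """Return the entity IDs listed in a with-info response."""
    info = json.loads(with_info_json)
    # Assumption: each entry looks like {"ENTITY_ID": <int>}.
    return [entry["ENTITY_ID"] for entry in info.get("AFFECTED_ENTITIES", [])]


if __name__ == "__main__":
    # Hypothetical response, for illustration only.
    response = '{"AFFECTED_ENTITIES": [{"ENTITY_ID": 1}, {"ENTITY_ID": 7}]}'
    for entity_id in affected_entity_ids(response):
        print(f"entity {entity_id} changed; re-query and sync downstream")
```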
178190

179191
### Parallel Processing
180192

@@ -190,14 +202,44 @@ If a single very large load file and 3 machines were available for performing da
190202

191203
When providing your own input file(s) to the snippets or your own applications and processing data manipulation tasks (adding, deleting, replacing), it is important to randomize the file(s) or other input methods when running multiple threads. If source records that pertain to the same entity are clustered together, multiple processes or threads could all be trying to work on the same entity concurrently. This causes contention and overhead resulting in slower performance. To prevent this contention always randomize input data.
192204

193-
You may be able to randomize your input files during ETL and mapping the source data to the [Senzing Entity Specification](https://senzing.zendesk.com/hc/en-us/articles/231925448-Generic-Entity-Specification). Otherwise utilities such as [shuf](https://man7.org/linux/man-pages/man1/shuf.1.html) or [terashuf](https://github.com/alexandres/terashuf) for large files can be used.
205+
You may be able to randomize your input files during ETL and mapping the source data to the [Senzing Entity Specification]. Otherwise utilities such as [shuf] or [terashuf] for large files can be used.
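
For files that fit in memory, the same randomization can be sketched in Python. This assumes one record per line; the function name `shuffle_lines` and the file names are hypothetical. For very large files a tool such as terashuf remains the better choice.

```python
import random


def shuffle_lines(input_path: str, output_path: str, seed=None) -> int:
    """Write the lines of input_path to output_path in random order."""
    with open(input_path, encoding="utf-8") as source:
        lines = source.readlines()
    # Make sure the last record ends with a newline so shuffled lines never merge.
    if lines and not lines[-1].endswith("\n"):
        lines[-1] += "\n"
    # A seed makes the order reproducible; leave it as None for a fresh shuffle.
    random.Random(seed).shuffle(lines)
    with open(output_path, "w", encoding="utf-8") as target:
        target.writelines(lines)
    return len(lines)


if __name__ == "__main__":
    import os
    import tempfile

    # Demonstrate on a small temporary file with made-up records.
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "input.json")
        dst = os.path.join(tmp, "shuffled.json")
        with open(src, "w", encoding="utf-8") as handle:
            handle.write('{"RECORD_ID": "1"}\n{"RECORD_ID": "2"}\n{"RECORD_ID": "3"}')
        print(shuffle_lines(src, dst), "records shuffled")
```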
194206

195207
### Purging Senzing Repository Between Examples
196208

197209
When trying out different examples you may notice consecutive tasks complete much faster than an initial run. For example, running a loading task for the first time without the data in the system will be representative of load rate. If the same example is subsequently run again without purging the system it will complete much faster. This is because Senzing knows the records already exist in the system and it skips them.
198210

199-
To run the same example again and see representative performance, first [purge](Python/Tasks/Initialization/PurgeRepository.py) the Senzing repository of the loaded data. Some examples don't require purging between running them, an example would be the deleting examples that require data to be ingested first. See the usage notes for each task category for an overview of how to use the snippets.
211+
To run the same example again and see representative performance, first [purge] the Senzing repository of the loaded data. Some examples don't require purging between runs; for instance, the deleting examples require data to be ingested first. See the usage notes for each task category for an overview of how to use the snippets.
200212

201213
### Input Load File Sizes
202214

203-
There are different sized load files within the [Data](Resources/Data/) path that can be used to decrease or increase the volume of data loaded depending on the specification of your hardware. The files are named loadx.json, where the x specifies the number of records in the file.
215+
There are different sized load files within the [Data] path that can be used to decrease or increase the volume of data loaded depending on the specification of your hardware. The files are named loadx.json, where the x specifies the number of records in the file.
216+
217+
[code-snippets-v3]: https://github.com/Senzing/code-snippets-v3
218+
[Configuration]: #configuration
219+
[Data]: Resources/Data/
220+
[Docker Usage]: #docker-usage
221+
[G2ModuleIniToJson.py]: Python/Tasks/Initialization/
222+
[Input Load File Sizes]: #input-load-file-sizes
223+
[Items of Note]: #items-of-note
224+
[jc]: https://github.com/kellyjonbrazil/jc
225+
[Legend]: #legend
226+
[Parallel Processing]: #parallel-processing
227+
[purge]: Python/Tasks/Initialization/PurgeRepository.py
228+
[Purging Senzing Repository Between Examples]: #purging-senzing-repository-between-examples
229+
[Quickstart Guide]: https://senzing.zendesk.com/hc/en-us/articles/115002408867-Quickstart-Guide
230+
[Quickstart Roadmap]: https://senzing.zendesk.com/hc/en-us/articles/115001579954-API-Quickstart-Roadmap
231+
[Randomize Input Data]: #randomize-input-data
232+
[Real-time replication and analytics]: https://senzing.zendesk.com/hc/en-us/articles/4417768234131--Advanced-Real-time-replication-and-analytics
233+
[Scalability]: #scalability
234+
[Senzing API runtime]: https://github.com/Senzing/senzingapi-runtime
235+
[Senzing APIs Bare Metal Usage]: #senzing-apis-bare-metal-usage
236+
[Senzing Engine Configuration]: #senzing-engine-configuration
237+
[Senzing Entity Specification]: https://senzing.zendesk.com/hc/en-us/articles/231925448-Generic-Entity-Specification
238+
[Senzing Support]: https://senzing.zendesk.com/hc/en-us/requests/new
239+
[SenzingGo.py]: https://github.com/Senzing/senzinggo
240+
[shuf]: https://man7.org/linux/man-pages/man1/shuf.1.html
241+
[System Requirements]: https://senzing.zendesk.com/hc/en-us/articles/115010259947
242+
[terashuf]: https://github.com/alexandres/terashuf
243+
[Usage]: #usage
244+
[Warning]: #warning
245+
[With Info]: #with-info
