- Set administrators
- Global graph settings
- User management
- Actions
- Demo data
- New data source
- Fields autocomplete
- Query auto-formatting rules
- Debug info
- Custom graph elements style
- Limit returned data
- Plugins development
After a fresh installation the service's environment is development, so all users have the same, highest level of rights. Therefore the first step is to set administrators.
- In the top-right corner click on Options and select Administration
- Select the Users tab:
- Initially you have just one user - enable Is admin to make that user an administrator. The same option can be applied to the other users to make them administrators.
Now the service is ready to run in production mode. Open the graphoscope.yaml file, set environment: prod and restart the service.
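For example, the relevant fragment of graphoscope.yaml is just this line (the comments summarize the behaviour described above, the rest of the file stays untouched):

# development: every user gets the highest level of rights
# prod: rights are managed on the Administration page
environment: prod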
The Settings tab contains many settings related to the styling of nodes/edges and interaction with graph elements. Some individual settings can be found in the Profile section.
In the same window you can:
- Reset a user's password - the user will then be able to sign up with the same username again, which is not allowed while the password is non-empty.
- Delete user
- Give or remove admin rights
On the Actions tab some actions can be performed without restarting the service, for example reloading the collectors.
A completed production environment installation includes a CSV file as the first demo data source:
- CSV file with all the data included, headers in the first line - /opt/graphoscope/files/demo.csv
- Graphoscope data source definition - /opt/graphoscope/definitions/sources/demo.yaml
... a simple list of 10 people and their friends. The first query could request all people with an age over 30:
FROM demo WHERE age > 30
The results will be similar to:
Now it's possible to extend the graph by searching for more of John's neighbors - right-click on John and choose Search Demo. We find that Jennifer and Kate are also his friends:
At this point users can continue with the UI and Search documentation sections.
To add a new data source the only thing required is to place its definition in the definitions/sources/ directory. The definition, step by step:
The data source name used in queries, its Web GUI display label and icon:
name: demo
label: Demo
icon: database
Plugin to use:
plugin: file-csv
Whether this source should be queried when the global namespace is requested:
inGlobal: false
... some sources can be very slow. To prevent every request from being slow you can exclude such sources from the global namespace.
Whether the data source should process the requested datetime range, which is always added by the Web GUI:
includeDatetime: false
... some data sources don't have a timestamp field, so with the datetime filter applied no data would ever be returned.
supportsSQL: true
... whether the data source supports SQL features.
Access details:
access:
  path: files/demo.csv
List of relations to create, as Graphoscope can't guess the logic behind an arbitrary data structure:
relations:
  -
    from:
      id: name
      group: name
      search: name
      attributes: [ "age" ]
    to:
      id: country
      group: country
      search: country
    edge:
      label: lives in
      attributes: []
  -
    from:
      id: name
      group: name
      search: name
    to:
      id: friend
      group: name
      search: name
For more parameters and details check the example file definitions/sources/source.yaml.example.
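Putting the fragments above together, the whole demo.yaml from this walkthrough looks roughly like this (the file shipped with the installation may differ in details):

name: demo
label: Demo
icon: database

plugin: file-csv

inGlobal: false
includeDatetime: false
supportsSQL: true

access:
  path: files/demo.csv

relations:
  -
    from:
      id: name
      group: name
      search: name
      attributes: [ "age" ]
    to:
      id: country
      group: country
      search: country
    edge:
      label: lives in
      attributes: []
  # ... second relation (name -> friend) exactly as shown above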
How autocomplete works in the background:
- The service is started
- A connection is established to each data source (if required)
- Each data source returns a list of available fields to query
- A global list of unique fields is created and returned to the Web GUI
There are two ways to get the list of a data source's field names:
- Automatically. Such plugins know how to query a data source and return a list of fields
- Manually. For some data sources there is no way to get all the possible fields, so such a list can be created manually by an administrator. Check the plugin's README for more info. Then in the data source's YAML definition file use the queryFields setting to define all the possible fields:
queryFields:
- address
- domain
The format button converts a list of comma- or space-separated indicators into a correct SQL query. To allow the service to understand the type (group) of each indicator, formatting rules must be created: files/formats.yaml.example can be used as an example and shows the default rules for some groups.
Syntax:
indicator's group:
- regexp 1 to detect it
- regexp 2 to detect it
- ...
Example:
email:
- ^[\%\w\.+-]+@[\%\w\.]+\.[\%a-zA-Z]{1,9}$
To add a non-default group, append its name to the YAML file with a list of regexps to detect it and restart the service.
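For example, a group for IPv4 addresses could be appended like this (the group name and the simplified regexp are just an illustration, not a built-in default):

ipv4:
- ^(?:\d{1,3}\.){3}\d{1,3}$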
During queries several things happen in the background, like SQL to Elasticsearch JSON query conversion, field name adaptation, etc. Each plugin can save progress information and return it to the user. This is disabled by default, can be enabled in the profile settings and is accessible in the browser's console.
It's possible to customize the style of graph nodes. The previous data source definition contained group: name and group: country - two styling groups, similar to CSS class names. To set your own styles change directory to the service's location and create a new file based on the example:
cd /opt/graphoscope/
ls files/groups.json
or in a dev. environment:
cd /opt/go/src/github.com/cert-lv/graphoscope
cp files/groups.json.example files/groups.json
Open groups.json in a text editor and insert:
{
  "name": {
    "shape": "dot",
    "color": {
      "background": "#fc3",
      "border": "#da3"
    }
  },
  "country": {
    "shape": "diamond",
    "color": {
      "background": "#f22",
      "border": "#c22"
    },
    "font": {
      "color": "#04a"
    }
  },
  "cluster": {
    "shape": "hexagon",
    "size": 25,
    "color": {
      "background": "#777b7b",
      "border": "#566"
    },
    "font": {
      "color": "#ccc"
    }
  }
}
Here we describe both groups - shapes and all kinds of colors. cluster is a built-in group for clusters - when you combine all neighbors of the same type into one cluster node to make the picture cleaner. Restart the service, reload the web page and see the difference:
Possible shapes and more styling options: https://visjs.github.io/vis-network/docs/network/nodes.html.
Font icons and images can also represent nodes. The JavaScript framework being used includes a complete port of Font Awesome. A groups.json content example that uses both an image and a font icon:
...
"name": {
  "shape": "icon",
  "icon": {
    "face": "Icons",
    "weight": "bold",
    "code": "\uf0c0",
    "size": "30",
    "color": "#0d7"
  },
  "font": {
    "color": "#05b"
  }
},
"country": {
  "shape": "image",
  "image": "assets/img/logo.svg"
},
...
Result:
A cheatsheet of possible icon codes: https://fontawesome.com/cheatsheet. Limitations and tips:
- The circularImage shape can be used instead of image to make an image circular. Example: https://visjs.github.io/vis-network/examples/network/nodeStyles/circularImages.html
- Images must be uploaded first, for example to assets/img/icons
- Selected nodes do NOT change their background color
- The size of a font icon will always stay the same, independently of the neighbors count
In graphoscope.yaml there is a setting limit: X - the max amount of entries returned from each data source. It prevents returning billions of entries and makes the graph much cleaner. If any data source could return more entries, statistics info about the limited entries will be returned, so the user is able to improve the query.
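For example, in graphoscope.yaml (the value below is only an illustration, choose whatever fits your data sources):

# max amount of entries returned by each data source
limit: 1000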
Every new and unique data source is different from the previous ones - communication methods, data structure, etc. That means the technical implementation will also be different. In Graphoscope this is handled with plugins - one for each unique data source. For example, the MongoDB and Elasticsearch plugins allow connecting to these specific databases.
Existing built-in plugins can be found in the plugins/src/ directory. One plugin - one directory.
When there is a need to use a new data source, a new plugin must be developed. Step-by-step workflow:
- Move to the plugins directory:
cd $GOPATH/src/github.com/cert-lv/graphoscope/plugins/src
- Make a copy of the template:
cp -r template <plugin-name>
cd <plugin-name>
... a real plugin name should be used instead of <plugin-name>.
- Rename the entry point and testing files:
mv template.go <plugin-name>.go
mv template_test.go <plugin-name>_test.go
- Follow the steps in the source files:
Edit <plugin-name>.go
- STEP 1 - choose plugin type - data collector or processor
- STEP 2 - validate required parameters or settings, given by the user
- STEP 3 - create a connection to the data source if needed and check whether it is established. For example, MongoDB requires an established connection, while an HTTP REST API does not
- STEP 4 - store plugin settings, like the "client" object, URL, database name, etc.
- STEP 5 - get a list of all known data source fields for the Web GUI autocomplete. Remove this method for a processor plugin!
- STEP 6 - choose and leave only one method - Search() for the collector or Process() for the processor
In case the data source plugin type was chosen (steps 7-10):
- STEP 7 - when a new query is launched, an SQL statement conversion must be done, so the data source can understand what the client is searching for. The created query should be added to the debug info, so an admin or developer can see what happens in the background.
Edit convert.go
- STEP 8 - do the SQL conversion. Check, for example, the MongoDB plugin to see how SQL can be converted to a hierarchical object, or the HTTP plugin where you get just a list of requested field/value pairs. This file is not needed for a processor plugin!
Edit <plugin-name>.go
- STEP 9 - run the query and get the results. The implementation depends on the data source's methods. In the HTTP plugin it's just a GET/POST request
- STEP 10 - process the data returned by the data source. Most of this loop's content you shouldn't need to modify at all
In case the processor plugin type was chosen:
- STEP 11 - process data received from the data source plugins
- STEP 12 - gracefully stop the plugin when the main service stops, drop all connections correctly
Edit plugin.go
- STEP 13 - inherit default configuration fields for the data source or processor plugin
- STEP 14 - define all the custom fields needed by the plugin, such as the "client" object, database/collection name, etc. See STEP 3
- STEP 15 - set plugin name and version
Edit README.md
- STEP 16 - write the plugin description and documentation if needed
Edit <plugin-name>_test.go
- STEP 17 - test plugin's functionality
During all these steps you can use the existing plugins as working examples.
Now test the source code:
cd $GOPATH/src/github.com/cert-lv/graphoscope
go test plugins/src/<plugin-name>/*.go
Compile in case of data source plugin:
go build -buildmode=plugin -ldflags="-w" -o plugins/sources/<plugin-name>.so plugins/src/<plugin-name>/*.go
or in case of processor plugin:
go build -buildmode=plugin -ldflags="-w" -o plugins/processors/<plugin-name>.so plugins/src/<plugin-name>/*.go
For a prod. environment make sure the Makefile is edited according to your needs (the REMOTE variable) and append to the Dockerfile's STEP 1 section:
RUN go build -buildmode=plugin -ldflags="-w" -o /go/plugins/sources/<plugin-name>.so plugins/src/<plugin-name>/*.go
or for processor plugin:
RUN go build -buildmode=plugin -ldflags="-w" -o /go/plugins/processors/<plugin-name>.so plugins/src/<plugin-name>/*.go
... a real plugin name should be used instead of <plugin-name>, and run:
make compile
make update
Docker images will be created to compile the plugins and the main service; after update the remote files will be updated.
To make Golang plugins work and be compatible, all components must be compiled in identical environments, so a specific Golang docker image is used. The same applies to the GOROOT/GOPATH variables. The CGO_ENABLED=1 env. variable is also required.
When the YAML definition file for the new data source is prepared, it is enough to restart the service.
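For example, a minimal definition in definitions/sources/ wiring the new plugin in could look like this (the access parameters and relation fields below are only placeholders - the real ones depend on your plugin and data):

name: mysource
label: My source
icon: database

plugin: <plugin-name>
inGlobal: true
includeDatetime: true
supportsSQL: false

access:
  # access parameters are plugin-specific
  url: https://example.com/api

relations:
  -
    from:
      id: source_field
      group: name
      search: source_field
    to:
      id: target_field
      group: name
      search: target_field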