description: "Centrality algorithms calculate the 'importance' of each vertex given a particular metric. These metrics generally revolve around density of a vertex's connectivity or the importance of that vertex to the general connectivity of the entire graph. Some widely used examples include Betweenness Centrality, which produces scores for vertices based on the number of shortest paths that they appear in and Closeness Centrality, which measures importance inversely proportional to how 'far' the vertex is away from every other vertex."
2
+
description: "Centrality algorithms calculate the centrality of each vertex given a particular metric."
algorithms/GraphML/Embeddings/Node2Vec/README.md
+8-34
@@ -1,50 +1,24 @@
1
1
# Node2Vec
2
2
3
-
Node2Vec is a vertex embedding algorithm proposed in [node2vec: Scalable Feature Learning for Networks](https://arxiv.org/abs/1607.00653?context=cs). TigerGraph splits the computation into two parts: the random walk process and the embedding training process. Assuming that you are using version 3.6 or greater of the TigerGraph database, ignore the UDF install instructions.
There are two different random walk processes to choose from. The first is regular random walks, implemented in `tg_random_walk.gsql`. This is equivalent to setting `p` and `q` parameters of Node2Vec both to 1, which is also equivalent to the [DeepWalk](https://arxiv.org/pdf/1403.6652.pdf) paper. This version is more performant than `tg_weighted_random_walk.gsql`, due to the less computation that is needed. If the graph is large, you may want to batch the random walk process to reduce memory consumption. Use `tg_random_walk_batch.gsql` if this is desired.
11
-
12
-
The second option is weighted random walk, as described in the Node2Vec paper. This is implemented in the `tg_weighted_random_walk_sub.gsql` and `tg_weighted_random_walk.gsql`. If your TigerGraph database version is below 3.6, see the UDF installation instructions below. If the graph is large, you may want to batch the random walk process to reduce memory consumption. Use `tg_weighted_random_walk_batch.gsql` with `tg_weighted_random_walk_sub.gsql` if desired.
13
-
14
-
**To install the un-weighted random walk:** copy the algorithm from `tg_random_walk.gsql` and install it on the database using the standard query install process.
15
-
16
-
**To install the weighted random walk:** copy `tg_weighted_random_walk_sub.gsql` and install. Then copy and install `tg_weighted_random_walk.gsql`.
17
-
18
-
### Node2Vec Embedding Install
19
-
Once the random walks have been generated, we can use the output to train the Node2Vec model. To install, make sure the proper UDFs are installed. If you are using a TigerGraph database of version 3.6 or greater, the UDFs are pre-installed.
20
-
21
-
**To install Node2Vec query:** copy the query from `tg_node2vec.gsql` and install on the database.
22
-
23
7
### Preliminary Notes
24
-
Vim is the text editor of choice in this README, any other text editors such as Emacs or Nano will suffice in the commands listed below
8
+
**Vim is the text editor of choice in this README, any other text editors such as Emacs or Nano will suffice in the commands listed below
25
9
\
26
-
`<TGversion>` should be replaced with your current Tigergraph version number
27
-
28
-
### UDF installation
29
-
30
-
#### Weighted Random Walk UDF install
31
-
If you are using `tg_weighted_random_walk_sub.gsql`, then you will need to install the `tg_random_udf.cpp`. **The code defined in `tg_random_udf.cpp` should be pasted inside the `UDIMPL`f namespace inside of `ExprFunctions.hpp`.
32
-
```bash
33
-
# open file and paste code
34
-
35
-
$ vim ~/tigergraph/app/<TGversion>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
36
-
```
10
+
**`<TGversion>` should be replaced with your current Tigergraph version number
37
11
38
-
#### Node2Vec UDF install
39
-
`tg_node2vec_sub()` is a UDF that is called in `tg_node2vec.gsql`.\
40
-
**The code defined in `tg_node2vec_sub.cpp` should be pasted inside the `UDIMPL` namespace inside of `ExprFunctions.hpp`
12
+
###Getting UDF
13
+
`node2vec()` is a user-defined function utilized in `node2vec_query.gsql`\
14
+
**The code defined in `UDF` should be pasted inside the `UDIMPL` namespace inside of `ExprFunctions.hpp`
41
15
```bash
42
16
# open file and paste code
43
17
44
18
$ vim ~/tigergraph/app/<TGversion>/dev/gdk/gsql/src/QueryUdf/ExprFunctions.hpp
45
19
```
46
20
47
-
#####Getting Word2vec file
21
+
### Getting Word2vec file
48
22
There are multiple options to get `word2vec.h`
49
23
1. Download/Copy `word2vec.h` file into `~/tigergraph/app/<TGversion>/dev/gdk/gsdk/include` directory
50
24
2. Create the file and copy the code from `word2vec.h` and paste it into the newly created file (steps shown below)
@@ -56,7 +30,7 @@ $ cd ~/tigergraph/app/<TGversion>/dev/gdk/gsdk/include/
56
30
$ vim word2vec.h
57
31
```
58
32
59
-
#####Including word2vec
33
+
### Including word2vec
60
34
The newly created `word2vec.h` needs to be included in the `ExprUtil.hpp` file
61
35
```bash
62
36
$ vim ~/tigergraph/app/<TGversion>/dev/gdk/gsql/src/QueryUdf/ExprUtil.hpp
@@ -86,7 +60,7 @@ $ PUT ExprFunctions from "/home/tigergraph/tigergraph/app/<TGversion>/dev/gdk/gs
86
60
### Running Queries
87
61
** The following instructions can be done with GraphStudio or GSQL terminal
88
62
1. Install the `random_walk` query
89
-
2. Run query `random_walk` with desired parameters. Visit https://docs.tigergraph.com/graph-ml/current/node-embeddings/node2vec for a description of the random walk query parameters. Make sure that TigerGraph has the correct permissions to write to the output directory you specify.
63
+
2. Run query `random_walk` with desired parameters. Visit https://docs.tigergraph.com/tigergraph-platform-overview/graph-algorithm-library#parameters for a description of the random walk query parameters