# Decoupled Graph Convolution Network for Inferring Substitutable and Complementary Items (CIKM 2020)

## DecGCN

DecGCN is the code for our paper "Decoupled Graph Convolution Network for Inferring Substitutable and Complementary Items", published in CIKM 2020.

The code is also available at https://github.com/liuyiding1993/CIKM2020_DecGCN.

*Figure: The proposed framework.*

## Citation

Please cite the paper if you use this code in any way.

## Code

### Preprocessing

- Step 1: Download the metadata from https://nijianmo.github.io/amazon/index.html.
- Step 2: Put the metadata file in `./preprocessing/raw_data/`.
- Step 3: Set the dataset name (i.e., `$dataset`) in `run.sh`, then run the preprocessing with `cd preprocessing; sh run.sh`.

The compressed data files (i.e., `.dat` files) will be written to `./euler_data/$dataset_name/`.
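The raw Amazon metadata is a file of JSON-style records, one per line, where related items listed under keys such as `also_viewed` and `also_bought` are the usual sources of substitute and complementary edges. A minimal, hypothetical sketch of that extraction step follows; the field names and edge semantics are assumptions for illustration, not the repo's exact preprocessing logic:

```python
import json

def extract_edges(lines):
    """Collect (item, neighbor) pairs for the two relation types.

    Assumption: "also_viewed" is treated as a substitute signal and
    "also_bought" as a complementary signal.
    """
    substitutes, complements = [], []
    for line in lines:
        record = json.loads(line)
        asin = record["asin"]
        related = record.get("related", {})
        for other in related.get("also_viewed", []):
            substitutes.append((asin, other))
        for other in related.get("also_bought", []):
            complements.append((asin, other))
    return substitutes, complements

sample = [
    '{"asin": "A1", "related": {"also_viewed": ["A2"], "also_bought": ["A3"]}}',
]
subs, comps = extract_edges(sample)
print(subs)   # [('A1', 'A2')]
print(comps)  # [('A1', 'A3')]
```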

### Training

Example of training on the Amazon Beauty dataset:

```shell
python run_loop.py --mode=train --data_dir=./euler_data/Beauty \
                   --max_id=114791 --sparse_feature_max_id=10,44,11178 \
                   --dim=128 --embedding_dim=16 --num_negs=5 --fanouts=5,5 \
                   --model=DecGCN --model_dir=ckpt --batch_size=512 \
                   --optimizer=adam --learning_rate=1e-4 --num_epochs=20 --log_steps=20
```

### Parameters

| Name | Type | Description |
| --- | --- | --- |
| `mode` | enum(str) | `train`, `evaluate`, or `save_embedding`. |
| `data_dir` | str | Directory of the specified dataset (e.g., `./euler_data/Beauty`). |
| `max_id` | int | Maximum node id, i.e., the number of nodes - 1. |
| `sparse_feature_max_id` | list(int) | List of maximum feature ids, one per sparse feature field. |
| `dim` | int | Dimensionality of the hidden layers. |
| `embedding_dim` | int | Dimensionality of the feature embeddings. |
| `num_negs` | int | Number of negative samples during training. |
| `fanouts` | list(int) | Numbers of neighbors sampled at each layer. |
| `model` | str | Model to be trained (e.g., `DecGCN`). |
| `model_dir` | str | Directory to save/load a model. |
| `batch_size` | int | Training batch size. |
| `optimizer` | enum(str) | Training optimizer (e.g., `adam` or `sgd`). |
| `learning_rate` | float | Learning rate for training. |
| `num_epochs` | int | Number of passes over the training data. |
| `log_steps` | int | Interval, in batches, at which log info is printed. |