Commit 6ad28e8 (first commit)

12 files changed: +521 additions, -0 deletions

README.md

Lines changed: 70 additions & 0 deletions

# Post-click Feedback for Recommender Systems

This is the code repo for our RecSys 2019 paper: [Leveraging Post-click Feedback for Content Recommendations](https://cornell-nyc-sdl-postclick-recsys.s3.amazonaws.com/paper.pdf). In this paper, we leverage post-click feedback, e.g. skips and completions, to improve the training and evaluation of content recommenders. See our paper for more details.

# Install

We used [OpenRec](https://github.com/ylongqi/openrec) to build our recommendation algorithms. It is built on the [TensorFlow](https://github.com/tensorflow/tensorflow) framework. Note that the install script pulls in a forked version of OpenRec ([whongyi/openrec](https://github.com/whongyi/openrec)) that supports some customized functions.

To install the dependencies needed for this repo:

```
$ ./scripts/install.sh
```

# Data

We randomly sampled data from two publicly available datasets for our experiments. For preprocessing details, please refer to our paper.

- [ByteDance](https://biendata.com/competition/icmechallenge2019/). Contains user interactions with short videos (average 10 seconds in length), including whether or not each video was completed.
- [Spotify](https://www.aicrowd.com/challenges/spotify-sequential-skip-prediction-challenge). Contains music listening sessions across Spotify users. A user may skip or complete listening to each song.

After processing the data into train, validation, and test sets, put them under a `dataset` folder. Refer to `dataloader.py` for the data format.
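
As a rough guide, the samplers in this repo read structured NumPy record arrays with `user_id` and `item_id` fields, plus a `neg_implicit` flag that marks click-skip (as opposed to click-complete) interactions. This field set is inferred from `negative_pointwise_sampler.py`; the toy split below is purely illustrative, and the real preprocessing is described in the paper.

```
import numpy as np
import os

# Hypothetical toy split in the record format the samplers appear to expect.
record_dtype = [('user_id', np.int32), ('item_id', np.int32), ('neg_implicit', bool)]
train = np.array([
    (0, 10, False),  # user 0 completed item 10 (click-complete)
    (0, 11, True),   # user 0 skipped item 11 (click-skip)
    (1, 10, False),  # user 1 completed item 10
], dtype=record_dtype)

os.makedirs('./dataset/bytedance', exist_ok=True)
np.save('./dataset/bytedance/train.npy', train)
```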

# Train and Evaluate

To train a BPR-NR model on the ByteDance dataset under the post-click-aware evaluation metric (click-complete as positive observation, click-skip as negative observation, non-click as missing observation):

```
$ python3 bpr_postclick.py --dataset=bytedance --l2_reg=0.01 --p_n_ratio=0.4 --eval_explicit
```

where `p_n_ratio` corresponds to the hyperparameter $\lambda_{p,n}$ in the paper. It controls the weight placed on each type of signal, and can be any float between 0 and 1.
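
The sampling itself is implemented in `stratified_pairwise_sampler.py`. As a rough, hypothetical mental model, inferred from the flag's help text ("pos-neg pair ratio during sampling") rather than from the sampler's actual code: with probability `p_n_ratio` the negative item of a training pair comes from the user's click-skip items, and otherwise from unobserved items.

```
import random

# Hypothetical sketch only; see stratified_pairwise_sampler.py for the real logic.
def sample_pair_negative(skipped_items, total_items, p_n_ratio):
    # Explicit negative (click-skip) with probability p_n_ratio,
    # implicit negative (random unobserved item) otherwise.
    if skipped_items and random.random() < p_n_ratio:
        return random.choice(skipped_items)
    return random.randrange(total_items)
```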

If you want to see the performance on click-skip and click-complete items separately, add the `eval_rank` flag:

```
$ python3 bpr_postclick.py --dataset=bytedance --l2_reg=0.01 --p_n_ratio=0.4 --eval_explicit --eval_rank
```

If you want to evaluate on the click-only metric (click as positive, non-click as negative), remove the `eval_explicit` flag:

```
$ python3 bpr_postclick.py --dataset=bytedance --l2_reg=0.01 --p_n_ratio=0.4
```

Similarly, to train a WRMF-NR model on the Spotify dataset, we use two hyperparameters to control the weights on positive and negative samples:

```
$ python3 wrmf_postclick.py --dataset=spotify --l2_reg=0.001 --pos_ratio=0.6 --neg_ratio=0.2 --eval_explicit
```
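
For intuition, `negative_pointwise_sampler.py` (included in this commit) builds each pointwise training batch from three strata. Assuming `wrmf_postclick.py` uses that sampler (the file is not shown here), the command above would compose a batch of 1000 as follows:

```
# Batch composition implied by negative_pointwise_sampler.py for the command above
batch_size = 1000
num_pos = int(batch_size * 0.6)            # 600 click-complete items, label 1.0
num_neg = int(batch_size * 0.2)            # 200 click-skip items, label 0.0
num_rand = batch_size - num_pos - num_neg  # 200 random non-clicked items, label 0.0
```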

# Experiments

To replicate the experiments in the paper, refer to `./scripts/{bpr,wrmf}_exp.sh`. Pass the dataset, one of {bytedance, spotify}, as the first argument when you run a script. You can also add your own experiments following the instructions in those scripts:

```
$ ./scripts/wrmf_exp.sh bytedance
```

# Citation

To cite our paper:

```
Hongyi Wen, Longqi Yang, and Deborah Estrin. 2019. Leveraging Post-click
Feedback for Content Recommendations. In Thirteenth ACM Conference on
Recommender Systems (RecSys ’19), September 16–20, 2019, Copenhagen, Denmark.
ACM, New York, NY, USA, 9 pages.
```

```
@inproceedings{wen2019leveraging,
  title={Leveraging Post-click Feedback for Content Recommendations},
  author={Wen, Hongyi and Yang, Longqi and Estrin, Deborah},
  booktitle={Proceedings of the 13th ACM Conference on Recommender Systems},
  year={2019},
  organization={ACM}
}
```

# Contact

If you have questions related to this repo, feel free to raise an issue, or contact us via:

- Twitter: [@hongyi_wen](https://twitter.com/hongyi_wen)

bpr_postclick.py

Lines changed: 115 additions & 0 deletions

import numpy as np
import sys, os
import argparse
from openrec import ModelTrainer
from openrec.recommenders import BPR
from openrec.utils import Dataset
from openrec.utils.evaluators import AUC
from openrec.utils.samplers import RandomPairwiseSampler, EvaluationSampler
from stratified_pairwise_sampler import StratifiedPairwiseSampler
from dataloader import loadSpotify, loadByteDance


### training parameters ###
total_iter = 10000     # total number of training iterations
batch_size = 1000      # training batch size
eval_iter = 1000       # evaluate every eval_iter iterations
save_iter = eval_iter  # save the model every save_iter iterations

### embedding ###
dim_user_embed = 100   # dimension of user embedding
dim_item_embed = 100   # dimension of item embedding


def exp(dataset, l2_reg, p_n_ratio, eval_explicit, save_log, eval_rank):

    if dataset == 'spotify':
        data = loadSpotify()
    elif dataset == 'bytedance':
        data = loadByteDance()
    else:
        print("Unsupported dataset...")
        return

    # save logging and model under a directory keyed by the experiment settings
    log_dir = "validation_logs/{}_{}_{}_{}_{}/".format(dataset, l2_reg, p_n_ratio, eval_explicit, eval_rank)
    os.makedirs(log_dir, exist_ok=True)
    if save_log:
        log = open(log_dir + "validation.log", "w")
        sys.stdout = log

    # prepare train, val, test sets
    train_dataset = Dataset(data['train'], data['total_users'], data['total_items'], name='Train')
    if p_n_ratio is None:
        # standard BPR: uniform random pairwise sampling
        train_sampler = RandomPairwiseSampler(batch_size=batch_size, dataset=train_dataset, num_process=5)
    else:
        # BPR-NR: stratified pairwise sampling over click-complete / click-skip / non-click signals
        train_sampler = StratifiedPairwiseSampler(batch_size=batch_size, dataset=train_dataset, p_n_ratio=p_n_ratio, num_process=5)
        if p_n_ratio > 0.0:
            print("Re-weighting implicit negative feedback")
        else:
            print("Corrected negative feedback labels but no re-weighting")

    eval_num_neg = None if eval_explicit else 500  # number of negative samples for evaluation
    if eval_rank:
        # show evaluation metrics for click-complete and click-skip items separately
        pos_dataset = Dataset(data['pos_test'], data['total_users'], data['total_items'],
                              implicit_negative=not eval_explicit, name='Pos_Test', num_negatives=eval_num_neg)
        neg_dataset = Dataset(data['neg_test'], data['total_users'], data['total_items'],
                              implicit_negative=not eval_explicit, name='Neg_Test', num_negatives=eval_num_neg)
        pos_sampler = EvaluationSampler(batch_size=batch_size, dataset=pos_dataset)
        neg_sampler = EvaluationSampler(batch_size=batch_size, dataset=neg_dataset)
        eval_samplers = [pos_sampler, neg_sampler]
    else:
        val_dataset = Dataset(data['val'], data['total_users'], data['total_items'],
                              implicit_negative=not eval_explicit, name='Val', num_negatives=eval_num_neg)
        test_dataset = Dataset(data['test'], data['total_users'], data['total_items'],
                               implicit_negative=not eval_explicit, name='Test', num_negatives=eval_num_neg)
        val_sampler = EvaluationSampler(batch_size=batch_size, dataset=val_dataset)
        test_sampler = EvaluationSampler(batch_size=batch_size, dataset=test_dataset)
        eval_samplers = [val_sampler, test_sampler]

    # set evaluators
    auc_evaluator = AUC()
    evaluators = [auc_evaluator]

    # set model parameters
    model = BPR(l2_reg=l2_reg,
                batch_size=batch_size,
                total_users=train_dataset.total_users(),
                total_items=train_dataset.total_items(),
                dim_user_embed=dim_user_embed,
                dim_item_embed=dim_item_embed,
                save_model_dir=log_dir,
                train=True,
                serve=True)

    # set model trainer
    model_trainer = ModelTrainer(model=model)
    model_trainer.train(total_iter=total_iter,
                        eval_iter=eval_iter,
                        save_iter=save_iter,
                        train_sampler=train_sampler,
                        eval_samplers=eval_samplers,
                        evaluators=evaluators)


if __name__ == '__main__':

    parser = argparse.ArgumentParser(description='Parse parameters')
    parser.add_argument('--dataset', type=str, default='bytedance', help='dataset to use')
    parser.add_argument('--l2_reg', type=float, default=0.01, help='l2 regularization of latent factors')
    parser.add_argument('--p_n_ratio', type=float, default=None, help='pos-neg pair ratio during sampling')
    parser.add_argument('--eval_explicit', action='store_true', help='use post-click labels for evaluation; by default treat click as positive and non-click as negative')
    parser.add_argument('--eval_rank', action='store_true', help='show ranking accuracy for pos and neg samples separately')
    parser.add_argument('--log', action='store_true', help='log results to file; by default print to screen')
    args = parser.parse_args()
    print(args)

    # run experiments
    exp(dataset=args.dataset, l2_reg=args.l2_reg, p_n_ratio=args.p_n_ratio, eval_explicit=args.eval_explicit, save_log=args.log, eval_rank=args.eval_rank)

dataloader.py

Lines changed: 25 additions & 0 deletions

import numpy as np


def loadSpotify():
    data = {}
    data['train'] = np.load('./dataset/spotify/train.npy')
    data['val'] = np.load('./dataset/spotify/val.npy')
    data['test'] = np.load('./dataset/spotify/test.npy')
    data['pos_test'] = np.load('./dataset/spotify/pos_test.npy')
    data['neg_test'] = np.load('./dataset/spotify/neg_test.npy')
    data['total_users'] = 229792
    data['total_items'] = 100586
    return data


def loadByteDance():
    data = {}
    data['train'] = np.load('./dataset/bytedance/train.npy')
    data['val'] = np.load('./dataset/bytedance/val.npy')
    data['test'] = np.load('./dataset/bytedance/test.npy')
    data['pos_test'] = np.load('./dataset/bytedance/pos_test.npy')
    data['neg_test'] = np.load('./dataset/bytedance/neg_test.npy')
    data['total_users'] = 37043
    data['total_items'] = 271259
    return data

negative_pointwise_sampler.py

Lines changed: 48 additions & 0 deletions

import numpy as np
import random
from openrec.utils.samplers import Sampler


def NegativePointwiseSampler(batch_size, dataset, pos_ratio=0.5, neg_ratio=0.3, num_process=5, seed=100):

    random.seed(seed)

    def batch(dataset=dataset, batch_size=batch_size, seed=seed):

        num_pos = int(batch_size * pos_ratio)  # click-complete (explicit positive) slots
        num_neg = int(batch_size * neg_ratio)  # click-skip (explicit negative) slots

        while True:

            input_npy = np.zeros(batch_size, dtype=[('user_id', np.int32),
                                                    ('item_id', np.int32),
                                                    ('label', np.float32)])

            # fill the first num_pos + num_neg slots with observed interactions
            pos_ind = 0
            neg_ind = 0
            ind = 0
            while pos_ind + neg_ind < num_pos + num_neg:
                entry = dataset.next_random_record()
                if not entry['neg_implicit'] and pos_ind < num_pos:
                    input_npy[ind] = (entry['user_id'], entry['item_id'], 1.0)
                    pos_ind += 1
                    ind += 1
                if entry['neg_implicit'] and neg_ind < num_neg:
                    input_npy[ind] = (entry['user_id'], entry['item_id'], 0.0)
                    neg_ind += 1
                    ind += 1

            # fill the remaining slots with randomly sampled unobserved (implicit negative) pairs;
            # offset past the observed entries so they are not overwritten
            for offset in range(batch_size - num_pos - num_neg):
                user_id = random.randint(0, dataset.total_users() - 1)
                item_id = random.randint(0, dataset.total_items() - 1)
                while dataset.is_positive(user_id, item_id):
                    user_id = random.randint(0, dataset.total_users() - 1)
                    item_id = random.randint(0, dataset.total_items() - 1)
                input_npy[num_pos + num_neg + offset] = (user_id, item_id, 0.0)

            yield input_npy

    s = Sampler(dataset=dataset, generate_batch=batch, num_process=num_process)

    return s
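
# Example usage (hypothetical; mirrors the Dataset construction in bpr_postclick.py
# and the pos/neg ratios from the README's WRMF example):
#
#   from openrec.utils import Dataset
#   from dataloader import loadSpotify
#
#   data = loadSpotify()
#   train_dataset = Dataset(data['train'], data['total_users'], data['total_items'], name='Train')
#   sampler = NegativePointwiseSampler(batch_size=1000, dataset=train_dataset,
#                                      pos_ratio=0.6, neg_ratio=0.2)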
Lines changed: 8 additions & 0 deletions

#!/bin/sh
trap "exit" INT

for l2_reg in 0.1 0.01 0.001; do
    for p_n_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
        python3 bpr_implicit.py --dataset=$1 --l2_reg=$l2_reg --p_n_ratio=$p_n_ratio --log=True
    done
done
Lines changed: 15 additions & 0 deletions

# install dependencies
sudo apt-get update -y
sudo apt-get install python3-pip

# pip install
pip3 install tensorflow
pip3 install numpy
pip3 install termcolor
pip3 install tqdm

# install openrec
git clone https://github.com/whongyi/openrec.git
cd openrec
git checkout logging
sudo python3 setup.py install
Lines changed: 10 additions & 0 deletions

#!/bin/sh
trap "exit" INT

for l2_reg in 0.1 0.01 0.001; do
    for pos_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
        for neg_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
            python3 pmf_implicit.py --dataset=$1 --l2_reg=$l2_reg --pos_ratio=$pos_ratio --neg_ratio=$neg_ratio --log=True
        done
    done
done

scripts/bpr_exp.sh

Lines changed: 19 additions & 0 deletions

#!/bin/sh

# Experiment settings for BPR-NR models with different hyperparameters
# Note that BPR-BL is a special case of BPR-NR if we fix `p_n_ratio` to 0
for l2_reg in 0.1 0.01 0.001 0.0001; do
    for p_n_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
        python3 bpr_postclick.py --dataset=$1 --l2_reg=$l2_reg --p_n_ratio=$p_n_ratio --log --eval_explicit
    done
done


# Experiment settings for standard BPR: omit `p_n_ratio` so it defaults to `None`
# and the model uses standard pairwise sampling during training
# (argparse parses --p_n_ratio as a float, so passing the literal string "None" would fail)
for l2_reg in 0.1 0.01 0.001 0.0001; do
    python3 bpr_postclick.py --dataset=$1 --l2_reg=$l2_reg --log --eval_explicit
done

# To evaluate the performance on click-only data, i.e. the conventional implicit feedback evaluation setting, remove the `eval_explicit` flag. For example:
# python3 bpr_postclick.py --dataset=bytedance --l2_reg=0.01 --p_n_ratio=0.4 --log

scripts/install.sh

Lines changed: 9 additions & 0 deletions

pip3 install tensorflow
pip3 install numpy
pip3 install termcolor
pip3 install tqdm

# install a forked version of openrec, with support for some customized functions
git clone https://github.com/whongyi/openrec.git
cd openrec
sudo python3 setup.py install

scripts/wrmf_exp.sh

Lines changed: 22 additions & 0 deletions

#!/bin/sh

# Experiment settings for WRMF-NR models with different hyperparameters
# Note that WRMF-BL is a special case of WRMF-NR if we fix `neg_ratio` to 0
for l2_reg in 0.1 0.01 0.001 0.0001; do
    for pos_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
        for neg_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
            python3 wrmf_postclick.py --dataset=$1 --l2_reg=$l2_reg --pos_ratio=$pos_ratio --neg_ratio=$neg_ratio --log --eval_explicit
        done
    done
done

# Experiment setting for the standard WRMF model
# Omit `neg_ratio` so it defaults to `None` and the model uses stratified pointwise sampling during training
for l2_reg in 0.1 0.01 0.001 0.0001; do
    for pos_ratio in 0.0 0.2 0.4 0.6 0.8 1.0; do
        python3 wrmf_postclick.py --dataset=$1 --l2_reg=$l2_reg --pos_ratio=$pos_ratio --log --eval_explicit
    done
done

# To evaluate the performance on click-only data, i.e. the conventional implicit feedback evaluation setting, remove the `eval_explicit` flag. For example:
# python3 wrmf_postclick.py --dataset=bytedance --l2_reg=0.01 --pos_ratio=0.4 --neg_ratio=0.2 --log
