This is the repository to accompany the paper *Self-supervised representation learning on manifolds*, to be presented at the ICLR 2021 Workshop on Geometrical and Topological Representation Learning.
Additionally, we implement a manifold version of triplet training, which will be expounded on in an upcoming preprint.
A companion notebook runs inference using a pre-trained Manifold SimCLR model (trained on either CIFAR10, FashionMNIST, or MNIST).
Install via

```
pip install neurve
```

or, to install with Weights & Biases support, run:

```
pip install "neurve[wandb]"
```

You can also install from source by cloning this repository and then running, from the repo root, the command

```
pip install .  # or pip install .[wandb]
```

The dependencies are
- `numpy>=1.17.4`
- `torch>=1.3.1`
- `torchvision>=0.4.2`
- `scipy>=1.5.3` (for parsing the cars dataset annotations)
- `tqdm`
- `tensorboardX`
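To sanity-check the installation, you can try importing the package; this assumes the top-level import name matches the PyPI package name, `neurve`:

```
# Quick smoke test: confirm the package can be imported.
# Assumes the top-level module is named `neurve`, matching the pip package name.
python -c "import neurve"
```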
To get the datasets for metric learning (the datasets we use for representation learning are included in `torchvision.datasets`):

- CUB dataset: download the file `CUB_200_2011.tgz` from http://www.vision.caltech.edu/visipedia/CUB-200-2011.html and decompress it in the `data` folder. The folder structure should be `data/CUB_200_2011/images/`.
- cars196 dataset: run `make data/cars`.
To use Weights & Biases to log training/validation metrics and to store model checkpoints, set the environment variable `NEURVE_TRACKER` to `wandb`. Otherwise `tensorboardX` will be used for metric logging, and model checkpoints will be saved locally.
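For example, to enable Weights & Biases tracking in a Unix-like shell:

```
# Route metric logging and checkpoint storage to Weights & Biases
export NEURVE_TRACKER=wandb
```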
For self-supervised training, run the command
```
python experiments/simclr.py \
    --dataset $DATASET \
    --backbone $BACKBONE \
    --dim_z $DIM_Z \
    --n_charts $N_CHARTS \
    --n_epochs $N_EPOCHS \
    --tau $TAU \
    --out_path $OUT_PATH  # if not using Weights & Biases for tracking
```

where

- `$DATASET` is one of `"cifar"`, `"mnist"`, `"fashion_mnist"`.
- `$BACKBONE` is the name of the backbone network (in the paper we used `"resnet50"` for CIFAR10 and `"resnet18"` for MNIST and FashionMNIST).
- `$DIM_Z` and `$N_CHARTS` are the dimension and number of charts, respectively, for the manifold.
- `$N_EPOCHS` is the number of epochs to train for (in the paper we used 1,000 for CIFAR10 and 100 for MNIST and FashionMNIST).
- `$TAU` is the temperature parameter for the contrastive loss function (in the paper we used 0.5 for CIFAR10 and 1.0 for MNIST and FashionMNIST).
- `$OUT_PATH` is the path to save model checkpoints and tensorboard output.
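For example, a CIFAR10 run with the backbone, epoch count, and temperature reported in the paper might look like the sketch below; the `--dim_z`, `--n_charts`, and `--out_path` values are illustrative placeholders, not settings taken from the paper:

```
# Example CIFAR10 run using the paper's backbone, epoch count, and temperature.
# The --dim_z, --n_charts, and --out_path values are illustrative placeholders.
python experiments/simclr.py \
    --dataset cifar \
    --backbone resnet50 \
    --dim_z 2 \
    --n_charts 8 \
    --n_epochs 1000 \
    --tau 0.5 \
    --out_path runs/simclr_cifar
```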
To train a metric learning model, run the command
```
python experiments/triplet.py \
    --data_root $DATA_ROOT \
    --dim_z $DIM_Z \
    --n_charts $N_CHARTS \
    --out_path $OUT_PATH  # if not using Weights & Biases for tracking
```

where

- `$DATA_ROOT` is the path to the data (e.g. `data/CUB_200_2011/images/` or `data/cars/`), which should be a folder of subfolders, where each subfolder has the images for one class.
- `$DIM_Z` and `$N_CHARTS` are the dimension and number of charts, respectively, for the manifold.
- `$OUT_PATH` is the path to save model checkpoints and tensorboard output.
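For example, a run on the CUB dataset might look like the sketch below; again, the `--dim_z`, `--n_charts`, and `--out_path` values are illustrative placeholders rather than settings from the paper:

```
# Example metric-learning (triplet) run on the CUB dataset.
# The --dim_z, --n_charts, and --out_path values are illustrative placeholders.
python experiments/triplet.py \
    --data_root data/CUB_200_2011/images/ \
    --dim_z 2 \
    --n_charts 8 \
    --out_path runs/triplet_cub
```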
To cite this work, use the BibTeX entry

```
@inproceedings{
korman2021selfsupervised,
title={Self-supervised representation learning on manifolds},
author={Eric O Korman},
booktitle={ICLR 2021 Workshop on Geometrical and Topological Representation Learning},
year={2021},
url={https://openreview.net/forum?id=EofGDIGAhvR}
}
```