This repository is the official implementation of "LAVA: Data Valuation without Pre-Specified Learning Algorithms" (ICLR 2023). We propose LAVA, a novel model-agnostic framework for data valuation based on a non-conventional, class-wise Wasserstein discrepancy. We further introduce an efficient way to measure each datapoint's contribution at no additional cost, using the solution of the optimization problem.
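To make the idea of reading datapoint contributions off the optimization solution concrete, here is a minimal, self-contained sketch. It is not this repository's API: the function names (`sinkhorn_potentials`, `calibrated_gradients`) and the toy data are hypothetical, the optimal transport problem is solved with a plain log-domain Sinkhorn loop rather than OTDD, and the cost uses features only, leaving out the class-wise label term. Under those assumptions, it scores each training point by its dual potential minus the average potential of the other training points.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v))) for a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def sinkhorn_potentials(cost, a, b, eps=0.1, iters=200):
    """Log-domain Sinkhorn iterations for entropy-regularized OT.

    cost[i][j] is the cost between training point i and validation point j;
    a and b are the positive probability weights of the two point sets.
    Returns the dual potentials (f, g). Illustrative helper, not part of lava.
    """
    n, m = len(a), len(b)
    f, g = [0.0] * n, [0.0] * m
    for _ in range(iters):
        f = [-eps * logsumexp([math.log(b[j]) + (g[j] - cost[i][j]) / eps
                               for j in range(m)]) for i in range(n)]
        g = [-eps * logsumexp([math.log(a[i]) + (f[i] - cost[i][j]) / eps
                               for i in range(n)]) for j in range(m)]
    return f, g


def calibrated_gradients(f):
    """f_i minus the mean of the other potentials (unchanged if f is shifted)."""
    n, total = len(f), sum(f)
    return [f[i] - (total - f[i]) / (n - 1) for i in range(n)]


# Toy 1-D data (assumed for illustration): the last training point is an outlier.
train = [0.0, 0.1, 0.2, 5.0]
valid = [0.0, 0.1, 0.2]
cost = [[(x - y) ** 2 for y in valid] for x in train]  # squared distance, features only
a = [1.0 / len(train)] * len(train)
b = [1.0 / len(valid)] * len(valid)

f, g = sinkhorn_potentials(cost, a, b)
scores = calibrated_gradients(f)
worst = max(range(len(scores)), key=lambda i: scores[i])
print(worst, [round(s, 3) for s in scores])
```

In this toy setup the outlier (index 3) receives the largest score, meaning that shifting more probability mass onto it would increase the distance to the validation set. The scores come from the potentials that the solver already produces, so no retraining or additional solves are needed.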
import lava
Coming Soon.
To illustrate how to apply LAVA to data valuation, we provide examples on CIFAR-10 and STL-10.
The pretrained embedders are included in the 'checkpoint' folder.
This repo relies on the OTDD implementation to compute the class-wise Wasserstein distance.
We are immensely grateful to the authors of that project.