This comes from Théo Delemazure and Kunal Bharatbhai Relia.
We would like to contribute to PrefLib by submitting a dataset :)
During the presentation of our paper "Algorithmic Techniques for Necessary
and Possible Winners" (https://arxiv.org/abs/2005.06779) by Prof. Julia
Stoyanovich at the COMSOC seminar today, attendees recommended that we add
our dataset to PrefLib.
We used both real and synthetic datasets in this paper. One of the most
interesting is the dessert dataset, in which people were asked to choose
between pairs of desserts and to indicate how confident they were in their
choice. By keeping only the "confident" answers, you get incomplete
preference profiles. If you want to see how the dataset was gathered, the
tool used for the experiment is here:
https://theo.delemazure.fr/experiments/test-nyu (but it is currently
broken).
This election has 8 candidates and around 220 voters. For each voter and
each pair of candidates, we have: 1. the choice of the voter; 2. the
confidence of the voter (between 0 and 100). Do you think we could add it
to the library?
dessert.zip
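To illustrate how keeping only the "confident" answers yields an incomplete profile, here is a minimal sketch. The record layout `(voter, option_a, option_b, chosen, confidence)`, the dessert names, and the threshold of 70 are all assumptions made for the example; the files in dessert.zip may be organised differently.

```python
from collections import defaultdict

# Hypothetical records: (voter_id, option_a, option_b, chosen, confidence 0-100).
answers = [
    (1, "cake", "pie", "cake", 90),
    (1, "pie", "flan", "pie", 40),
    (1, "cake", "flan", "cake", 75),
    (2, "cake", "pie", "pie", 55),
]

def confident_pairs(records, threshold):
    """Map each voter to the set of (preferred, other) pairs whose
    confidence is at least `threshold`; less confident answers are dropped."""
    profile = defaultdict(set)
    for voter, a, b, chosen, confidence in records:
        pairs = profile[voter]  # make sure every voter appears, even with no pairs
        if chosen not in (a, b):
            raise ValueError(f"choice {chosen!r} is not one of {a!r}, {b!r}")
        if confidence >= threshold:
            other = b if chosen == a else a
            pairs.add((chosen, other))
    return dict(profile)

# 70 is an arbitrary cut-off chosen for illustration only.
partial_profile = confident_pairs(answers, threshold=70)
```

With this toy input, voter 1 keeps two comparisons and voter 2 keeps none, so both end up with incomplete preferences.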
As a second dataset:
Additionally, I just wanted to share another dataset we used for our experiments that may interest the community:
The Google Travel Review Ratings dataset (travel) consists of average ratings (each between 1 and 5) issued by 5,456 users (voters) for up to 24 travel categories (candidates) in Europe. For each user, we create a set of preference pairs such that items in each pair have different ratings (no tied pairs). Items for which a user does not provide a rating are not included in that user’s preferences.
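The construction described above can be sketched as follows. This is only an illustration: the function name `rating_pairs`, the `cat3`/`cat4`/`cat5` labels, and the convention that an unrated category is simply absent from the dictionary (or set to `None`) are assumptions, not the exact code used to build the dataset.

```python
from itertools import permutations

def rating_pairs(ratings):
    """Return the set of (preferred, other) pairs for one user.

    `ratings` maps category -> average rating. Categories that are
    missing or rated None are treated as unrated and are skipped.
    Categories with equal ratings produce no pair (no tied pairs).
    """
    rated = {cat: r for cat, r in ratings.items() if r is not None}
    return {(a, b) for a, b in permutations(rated, 2) if rated[a] > rated[b]}

# Toy user: cat4 and cat5 are tied, cat6 is unrated.
example_ratings = {"cat3": 2.5, "cat4": 4.0, "cat5": 4.0, "cat6": None}
example_pairs = rating_pairs(example_ratings)
```

Because the pairs come from a numeric ordering, the resulting set is already transitively closed.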
Attached is the ReadMe file and the corresponding file containing the preference pairs (with transitive closure) where each row represents the partial order of a voter (agent). The travel dataset can fall under the "Election Data (ED)" umbrella. However, I think the dataset does not conform to any one of the existing formats of data sets on PrefLib in that the preferences in our dataset are partial rankings that are a generalization of SOI and TOI discussed here. More specifically, there are unranked elements that are not included in the list of a particular agent and also there are elements that that same agent is indifferent to.
Format of the preference pairs: ('cat4','cat3')/('cat5','cat3') means that cat4 is preferred over cat3 and cat5 is preferred over cat3. "/" separates the preference pairs.
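A row in this format could be read with a small parser like the one below. It is a sketch under assumptions: `parse_row` is a hypothetical helper, it receives only the preference field of a row (not any other CSV columns), and the pattern treats the quotes around category names as optional for robustness.

```python
import re

# One pair looks like ('cat4','cat3'); quotes are treated as optional.
PAIR_PATTERN = re.compile(r"\(\s*'?([^',()\s]+)'?\s*,\s*'?([^',()\s]+)'?\s*\)")

def parse_row(field):
    """Return the list of (preferred, other) pairs found in one row."""
    return [(m.group(1), m.group(2)) for m in PAIR_PATTERN.finditer(field)]

parsed = parse_row("('cat4','cat3')/('cat5','cat3')")
```

Here `parsed` holds two tuples, the first saying that cat4 is preferred over cat3.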
Please let me know if you have any questions. If we can share only one dataset, then the dessert dataset takes precedence.
TravelPreferencePairs.csv
Readme
The Google Travel Review Ratings dataset (travel) consists of average ratings (each between 1 and 5) issued by 5,456 users for up to 24 travel categories in Europe. For each user, we create a set of preference pairs such that items in each pair have different ratings (no tied pairs). Items for which a user does not provide a rating are not included in that user’s preferences.
Each row of the dataset represents the partial order of a voter (agent) consisting of preference pairs (with transitive closure).
# of Voters = 5456
# of Candidates = 24
Category ID Destination
1 Churches
2 Resorts
3 Beaches
4 Parks
5 Theatres
6 Museums
7 Malls
8 Zoo
9 Restaurants
10 Pubs/ Bars
11 Local Services
12 Burger/ Pizza Shops
13 Hotels/other lodgings
14 Juice Bars
15 Art Galleries
16 Dance Clubs
17 Swimming Pools
18 Gyms
19 Bakeries
20 Beauty & Spas
21 Cafes
22 View Points
23 Monuments
24 Gardens
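Since each row is stated to contain preference pairs with transitive closure, a quick sanity check on a parsed row could look like the sketch below. The helper name `is_transitively_closed` and the toy pairs are assumptions for illustration; this is not part of the dataset tooling.

```python
def is_transitively_closed(pairs):
    """True if, whenever (a, b) and (b, c) are present, (a, c) is present too."""
    pairs = set(pairs)
    return all(
        (a, d) in pairs
        for a, b in pairs
        for c, d in pairs
        if b == c
    )

closed_row = {("cat4", "cat3"), ("cat3", "cat1"), ("cat4", "cat1")}
open_row = {("cat4", "cat3"), ("cat3", "cat1")}
closed_ok = is_transitively_closed(closed_row)
open_ok = is_transitively_closed(open_row)
```

The check is quadratic in the number of pairs, which should be acceptable for at most 24 categories per voter.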
I'd like to credit NSF Grant No. 1916647, with a link to https://www.nsf.gov/awardsearch/showAward?AWD_ID=1916647.
A paper for which we collected this dataset is coming out before the end of the summer; we'll send the citation your way once it's available. For now the best link is https://arxiv.org/abs/2005.06779