| 1 | +/******************************************************************* |
| 2 | + * README * |
| 3 | + * 8-bit-large test dataset * |
| 4 | + * * |
| 5 | + * Authors: Luca Calderoni, Dario Maio, University of Bologna * |
| 6 | + * Paolo Palmieri, University College Cork * |
| 7 | + * * |
| 8 | + * https://github.com/spatialbloomfilter/libSBF-testdatasets * |
| 9 | + *******************************************************************/ |
| 10 | + |
| 11 | +The included datasets are provided to enable the testing of Spatial Bloom |
| 12 | +Filter implementations. The sets have been designed for a filter built as an |
| 13 | +array of 1-byte values: because of this, the maximum number of areas is 255. |
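
The 255-area limit follows from the cell width. A minimal sketch of the arithmetic, assuming (this is not stated above) that the value 0 is reserved to mark an empty cell, so that only the remaining values can label areas:

```python
CELL_BITS = 8  # each filter cell is one byte, as stated above

# Assumption: 0 marks an empty cell and is not available as an area label.
EMPTY_CELL = 0

cell_values = 2 ** CELL_BITS           # 256 distinct values per cell
max_areas = cell_values - 1            # every value except EMPTY_CELL
area_labels = range(1, max_areas + 1)  # labels 1..255

print(max_areas)  # 255
```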
| 14 | + |
| 15 | +The test datasets are based on 16,776,960 elements (255*65792) and 255 areas. A |
| 16 | +single element-area allocation case is provided: a uniform distribution where |
| 17 | +each area has 65,792 elements. This set is to be used in conjunction with the |
| 18 | +16-bit datasets' list of elements (elements.csv), as the elements are the same. |
| 19 | +Similarly, the list of elements not in the test dataset (non-elements.csv) for |
| 20 | +the 16-bit datasets can also be used for checking false positives in the |
| 21 | +8-bit-large sets. This counter-set contains 500,000 elements. |
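
The false-positive check described above can be sketched as follows. This is only an illustration: `query` stands in for whatever lookup function the filter under test exposes (hypothetical, assumed to return 0 for "not in any area"), and the reader assumes one element per row in non-elements.csv, which this README does not specify.

```python
import csv
import io
from typing import Callable, Iterable, Iterator


def read_elements(handle: Iterable[str]) -> Iterator[str]:
    """Yield the first field of every non-empty CSV row (assumed layout)."""
    for row in csv.reader(handle):
        if row:
            yield row[0].strip()


def false_positive_rate(non_elements: Iterable[str],
                        query: Callable[[str], int]) -> float:
    """Fraction of known non-members that the filter assigns to some area."""
    total = 0
    positives = 0
    for element in non_elements:
        total += 1
        if query(element) != 0:  # any non-zero area is a false positive
            positives += 1
    return positives / total if total else 0.0


# Toy stand-in for a filter: wrongly reports area 7 for one element.
def toy_query(element: str) -> int:
    return 7 if element == "b" else 0


sample = io.StringIO("a\nb\nc\nd\n")
rate = false_positive_rate(read_elements(sample), toy_query)
print(rate)  # 0.25
```

With the real counter-set, the in-memory sample would be replaced by a file handle opened with `open("non-elements.csv", newline="")`.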
| 22 | + |
| 23 | +FILE CONTENTS |
| 24 | +- - - - - - |
| 25 | +area-element-unif.csv areas and the assigned elements, distributed uniformly; |
| 26 | + the first value is the area, the second (separated by a |
| 27 | + comma) is the element |
| 28 | + |
| 29 | +For each of the files above, a '-count' CSV file is also included. The file |
| 30 | +lists the number of elements for each area. |
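
As a sanity check, the per-area totals can be recomputed from the area-element file and compared with the '-count' file. The sketch below assumes that neither file has a header row and that each '-count' row has the form `area,count`; neither detail is stated above, so adjust as needed.

```python
import csv
import io
from collections import Counter
from typing import Dict, Iterable


def count_elements_per_area(handle: Iterable[str]) -> Counter:
    """Count rows per area in an 'area,element' CSV (area is the first field)."""
    counts: Counter = Counter()
    for row in csv.reader(handle):
        if len(row) < 2:
            continue  # skip blank or malformed rows
        counts[int(row[0])] += 1
    return counts


def read_counts(handle: Iterable[str]) -> Dict[int, int]:
    """Read 'area,count' rows (assumed layout of the '-count' file)."""
    return {int(r[0]): int(r[1]) for r in csv.reader(handle) if len(r) >= 2}


# Tiny in-memory stand-ins for the real files.
area_element = io.StringIO("1,alpha\n1,beta\n2,gamma\n2,delta\n")
area_count = io.StringIO("1,2\n2,2\n")

computed = count_elements_per_area(area_element)
declared = read_counts(area_count)
print(dict(computed) == declared)  # True
```

For the uniform dataset described above, every one of the 255 areas would be expected to report 65,792 elements.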