[DataCap Refresh] <2nd> Review of <Public Open Dataset Pathway> #284
Comments
CommonCrawl
W3s
Han Tang Cloud (XSG)
Mianyang Anyi Data Service Co., Ltd.
OpendataLab
Thank you for your review.
Overall strong diligence and bookkeeping from this pathway. Echoing the findings above:
We are requesting an additional 20PiB of DataCap for this pathway.
Thank you for the assessment, @galen-mcandrew.
Current allocation distribution
CommonCrawl
II. Dataset Completion
-
III. Does the list of SPs provided and updated in the issue match the list of SPs used for deals?
SP list disclosed in comments
IV. How many replicas has the client declared vs how many been made so far:
9 declared vs 10
V. Please provide a list of SPs used for deals and their retrieval rates
W3s
II. Dataset Completion
Unavailable, due to the specifics of the data stored by the user.
III. Does the list of SPs provided and updated in the issue match the list of SPs used for deals?
The SP list was unavailable because of the specifics of the system the user employs to select SPs. The allocator raised concerns about this, and the user addressed them.
IV. How many replicas has the client declared vs how many been made so far:
7 vs 10; however, 88.51% of deals are for data replicated across fewer than 4 storage providers, which will be clarified with the client.
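For readers who want to sanity-check a replication figure like the one above, here is a minimal sketch of one way such a percentage could be derived. It assumes a simple count-based definition (the share of deals whose piece is held by fewer than 4 distinct SPs); the actual compliance report may weight by deal size or use a different definition. The function name, the `(piece_id, provider_id)` input shape, and the sample data are all hypothetical.

```python
from collections import defaultdict


def share_below_replication(deals, min_providers=4):
    """Percentage of deals whose piece is stored by fewer than
    `min_providers` distinct storage providers.

    `deals` is a list of (piece_id, provider_id) tuples. This input
    shape is an assumption made for illustration only.
    """
    if not deals:
        return 0.0

    # Collect the distinct providers holding each piece.
    providers_by_piece = defaultdict(set)
    for piece_id, provider_id in deals:
        providers_by_piece[piece_id].add(provider_id)

    # Count the deals that belong to under-replicated pieces.
    low = sum(
        1
        for piece_id, _ in deals
        if len(providers_by_piece[piece_id]) < min_providers
    )
    return 100.0 * low / len(deals)


# Hypothetical sample: piece-a sits with 2 SPs, piece-b with 4 SPs.
sample_deals = [
    ("piece-a", "sp1"), ("piece-a", "sp2"),
    ("piece-b", "sp1"), ("piece-b", "sp2"),
    ("piece-b", "sp3"), ("piece-b", "sp4"),
]
low_share = share_below_replication(sample_deals)
print(f"{low_share:.2f}% of deals fall below the replication threshold")
```

In this sample, 2 of the 6 deals belong to a piece held by fewer than 4 SPs. A size-weighted variant would sum deal sizes instead of counting deals.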
V. Please provide a list of SPs used for deals and their retrieval rates
Han Tang Cloud (XSG)
II. Dataset Completion
The client committed to preparing .csv files with metadata that link the files to their sources.
III. Does the list of SPs provided and updated in the issue match the list of SPs used for deals?
Yes
IV. How many replicas has the client declared vs how many been made so far:
4 vs 6
V. Please provide a list of SPs used for deals and their retrieval rates
Mianyang Anyi Data Service Co., Ltd.
II. Dataset Completion
The client prepared a proper mapping file that allows the sealed data to be connected with the original files.
III. Does the list of SPs provided and updated in the issue match the list of SPs used for deals?
No. The client received only a first, small test round. Performance should improve before the next tranche is granted.
IV. How many replicas has the client declared vs how many been made so far:
1 vs 10
V. Please provide a list of SPs used for deals and their retrieval rates
OpendataLab
II. Dataset Completion
No test retrieval was made, due to the lack of a proper mapping file.
III. Does the list of SPs provided and updated in the issue match the list of SPs used for deals?
The list is being updated in the comments.
IV. How many replicas has the client declared vs how many been made so far:
8 vs 7
V. Please provide a list of SPs used for deals and their retrieval rates
Allocation summary
Notes from the Allocator
CommonCrawl, W3s: Low retrieval for some of the SPs, which the client explained as being caused by problems with the service providers.
Han Tang Cloud (XSG): Improvement is expected before the next allocation is granted.
Mianyang Anyi Data Service Co., Ltd.: Improvement is expected before the next allocation is granted.
Did the allocator report up to date any issues or discrepancies that occurred during the application processing?
Yes, discrepancies are clarified on an ongoing basis.
What steps have been taken to minimize unfair or risky practices in the allocation process?
Constant monitoring of performance and a strict approach to ensuring the uniqueness of data, followed up by KYC procedures that are conducted before an application is approved.
How did these distributions add value to the Filecoin ecosystem?
The selected datasets present unique or above-average value to society and the community. We make every effort to ensure that the datasets are as well described and cataloged as possible, and that potential end users know how to use them. Technical issues on the client side and the complexity of certain processes unfortunately led to technical problems, which were reflected in the reports. However, the priority of data uniqueness prevailed in this case.
Please confirm that you have maintained the standards set forward in your application for each disbursement issued to clients and that you understand the Fil+ guidelines set forward in your application
Yes
Please confirm that you understand that by submitting this GitHub request, you will receive a diligence review that will require you to return to this issue to provide updates.
Yes