Adding preliminary sample list #13
@ribasushi where is the data location?
@dchoi27 I updated the PR, it now has 448 entries. All of them are in R2, currently visible to the worker. You need to decide whether to go through with this or do something else...
hi @ribasushi i think what is missing here is the data download location for these files or
we need the car file size as well. we can drop the car filename.
i think @ribasushi got the original data from a database that has the DAG size, so probably most straightforward if he does that quickly tomorrow. the filename was required to generate the download URLs in the version of this with the links
Thanks @dchoi27! I thought I saw you share a csv, must have been for something else. @ribasushi - @shrenuj Bansal and team identified that we need to include the car size as part of the payload. Currently car size is not included in the csv. Can you please resend the csv with car size, in addition to the details in the csv we already have?
i did share a CSV. it was the same CSV as the one in this PR that riba made, but with the download links (what i referred to here as the all i'm saying is that it's probably fastest for him to get the DAG sizes, because i think the data source he queried to get the CSV in this PR has them. (you don't have to tag him again in a copy-pasta, he'll see this thread when he wakes up tomorrow)
actually i might be able to fish out the DAG sizes using R2's CLI and scripting to grab them. @ribasushi's underwater so i'll try and save him from worrying about this. looking into it now
@dchoi27 Separate but related, I was able to connect with the team today regarding the URL expiry time period (from 7 days to longer). Since you're working on this now, I wanted to flag this, as I believe it may cause rework if we decide to change the expiry time later. Net is, we would like to change from 7 days to 30 days.
Also, importantly, and to answer your other question from yesterday, SPs download the URLs from the contract, not the GitHub repo.
^ i think this is the wrong place to talk about this
OK - i think this should be right (since i didn't have the query @ribasushi ran i just downloaded the entire Didn't have permission to commit to this PR so here's a link to the spreadsheet (it's in the first tab) https://docs.google.com/spreadsheets/d/1Kw0zZh6xSGLvU0TK05SCUMEuBi8OtP3p81UdGGneHHg/edit?usp=sharing
@dchoi27 I have sent you an invite with write perm
Folks NO. At no point during the dealmaking process do you need the actual size of the car. This is precisely why I didn't send it. Please adjust the contract and remove the superfluous info, things are hard enough as it is.
unfortunately, boost is asking for it (also mentioned here
@dchoi27 just to confirm, ideally the final file has the following columns
what's the difference between pieceSize and carSize? only you all have access to this spreadsheet #13 (comment) please be prescriptive if anything is missing, i don't know what y'all need so i'm just following what you're asking for in the thread
The spreadsheet has piece CID, piece size, and car size, and we are good there. We just need to make sure the final csv that you will be creating next Tuesday also has the location, in the same file.
oh LOL sorry the piece size with padding is in there already. my b, i missed it. let's leave this PR alone for what the final deliverable is. i didn't put the download URLs here because they shouldn't be public yet - i'll send the final file over slack (like i did the last one)
anyway, worst case scenario, as long as you have the updated download links with some unique identifier by record, you can always join it with the file in the spreadsheet
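For anyone who ends up doing that join by hand, here is a minimal sketch of what it could look like, assuming the piece CID is the unique identifier shared by both files. The column names (`piece_cid`, `piece_size`, `car_size`, `download_url`) and the sample values are made up for illustration and are not the actual headers or data from the spreadsheet or the final CSV.

```python
import csv
import io


def read_csv(text):
    """Parse CSV text into a list of dicts, one per record."""
    return list(csv.DictReader(io.StringIO(text)))


def join_on_key(sample_rows, link_rows, key="piece_cid"):
    """Left-join link_rows onto sample_rows using a shared unique key.

    Every sample row is kept; rows without a matching link get empty
    strings in the link columns. Duplicate keys in link_rows are rejected.
    """
    links = {}
    for row in link_rows:
        if row[key] in links:
            raise ValueError(f"duplicate {key}: {row[key]}")
        links[row[key]] = row

    # Columns contributed by the links file (everything except the key).
    extra_cols = [c for c in link_rows[0] if c != key] if link_rows else []

    joined = []
    for row in sample_rows:
        merged = dict(row)
        match = links.get(row[key])
        for col in extra_cols:
            merged[col] = match[col] if match else ""
        joined.append(merged)
    return joined


# Hypothetical inputs: neither the headers nor the values are real.
SAMPLES = (
    "piece_cid,piece_size,car_size\n"
    "baga-example-1,1024,900\n"
    "baga-example-2,2048,1500\n"
)
LINKS = (
    "piece_cid,download_url\n"
    "baga-example-2,https://example.invalid/2.car\n"
    "baga-example-1,https://example.invalid/1.car\n"
)

rows = join_on_key(read_csv(SAMPLES), read_csv(LINKS))
for row in rows:
    print(row)
```

The row order of the links file does not matter here, since matching is done by key rather than by position.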
@dchoi27 FYI - This is the csv format the eng team prefers for the data on Tuesday: https://github.com/lotus-web3/dotStorage-deal-renewal/blob/main/scripts/2mbsample.csv Please provide the data in this schema to help with a smooth operation.
sure, my script just adds the download links to the CSV that riba provided, but i can put it in google sheets and get it into that format if it's helpful
thank you! It will save us the time of joining the csv ourselves and keep us from making mistakes - so would be super helpful. 💙
This is mega-outdated |
This list is accurate / describes actual available data. The prefix url/location needs to be determined by @dchoi27 and @vasco-santos from the daghaus team.