-
Notifications
You must be signed in to change notification settings - Fork 44
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
DO NOT MERGE #2736
base: main
Are you sure you want to change the base?
DO NOT MERGE #2736
Conversation
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2736 +/- ##
========================================
- Coverage 6.23% 5.86% -0.37%
========================================
Files 167 172 +5
Lines 4248 4396 +148
Branches 472 486 +14
========================================
- Hits 265 258 -7
- Misses 3981 4136 +155
Partials 2 2 ☔ View full report in Codecov by Sentry. |
// THIS CODE IS CUT AND PASTED FROM THE APP CODE IN /routes/circulars/circulars.lib.ts | ||
// running as a node script it would not import the module from the file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not comfortable copying and pasting this. If you are not able to import it from a standalone Node.js script, then please convert this to a .ts
source and use esbuild to compile it down to JS. You should be able to hack something into https://github.com/nasa-gcn/gcn.nasa.gov/blob/main/esbuild.config.js to do that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain to me why? It's a literal cut and paste (though I will have to adapt the GRB matcher to include ones without a letter). Copying and pasting directly from the current app code and updating that subject matcher shouldn't be an issue. I'm not sure what compiling etc would buy us over a copy and paste. I also have the issue that I need to include eventIds on circulars that have GRBs without a letter, but they are not permissible anymore, so it shouldn't be added to the app subject matchers. I'm not sure what hacking around in the esbuild would get us that would be any better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This script is standalone and will only be run once on prod. It also requires changes compared to the regular subject matchers to account for changes in GRB naming over time. If there are no problems with the logic of how this is being implemented, let's just go with how this is currently implemented.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can you explain to me why? It's a literal cut and paste (though I will have to adapt the GRB matcher to include ones without a letter). Copying and pasting directly from the current app code and updating that subject matcher shouldn't be an issue. I'm not sure what compiling etc would buy us over a copy and paste. I also have the issue that I need to include eventIds on circulars that have GRBs without a letter, but they are not permissible anymore, so it shouldn't be added to the app subject matchers. I'm not sure what hacking around in the esbuild would get us that would be any better
For the same reasons why we try to avoid duplicate copies of all but the simplest code in general: there is the risk that the two copies will not be the same, that bugs in one won't be fixed in the other, that they will become incompatible, etc. Even though this PR contains code that we do not plan to reuse, there is still the very real risk that the new copy in the PR will not be the same as the original copy in the main source code.
We have added new regular expressions several times while this PR has been open. In order to review this properly, one would need to manually check that the two copies are the same in this PR, and then we would need to re-check it any time one makes changes to either this PR or to the original code on the main branch.
I do not think that this is an acceptable risk, but @jracusin should have the final say.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's fine. I can just check the subject matchers before I run them. I have both in my current version. I also have to adapt the GRB matcher so it includes GRBs without a letter in them. Currently the subject matchers will only match on GRBs with a letter. So there is already going to be a difference than what would be imported.
app/backfill-eventId.js
Outdated
for await (const page of pages) { | ||
if (limitHit) break | ||
const chunked = chunk(page.Items || [], 25) | ||
|
||
for (const chunk of chunked) { | ||
if (limitHit) break | ||
let writes = [] | ||
|
||
for (const record of chunk) { | ||
if (limitHit) break | ||
|
||
const circular = unmarshall(record) | ||
const parsedEventId = parseEventFromSubject(circular.subject) | ||
|
||
if (!circular.eventId && parsedEventId) { | ||
circular.eventId = parsedEventId | ||
writes.push(circular) | ||
} | ||
} | ||
|
||
if (writes.length + totalWriteCount > writeLimit) { | ||
const overage = writes.length + totalWriteCount - writeLimit | ||
const writesToLimitTo = writes.length - overage | ||
writes = writes.slice(0, writesToLimitTo) | ||
limitHit = true | ||
} | ||
|
||
if (writes.length > 0) { | ||
const command = new BatchWriteCommand({ | ||
RequestItems: { | ||
[TableName]: writes.map((circ) => ({ | ||
PutRequest: { | ||
Item: { | ||
...circ, | ||
}, | ||
}, | ||
})), | ||
}, | ||
}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite complicated. I would be more comfortable with the correctness of a functional programming implementation using TC39 Async Iterator Helpers (see polyfills available from core-js).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary to make this one-time use script functional? Does our other code comply with this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of loops and conditionals here, which makes the code too complicated to review quickly.
As for whether we have existing code like this, I don't know of any other examples where we have a paginated input, which is filtered, and then which must be paginated again for insertion. That is the source of the complexity.
I wish these major changes had been requested when I first asked for in depth review before I started running out of test data. I can still run this script on test, there is still data there to test on, but I am totally out of data pertaining to the synonym slug backfill, all that is left is production data. This will bloat the time for this ticket. |
DO NOT MERGE. Please put approval in comment. ticket will be closed once approved.
Resolves #2647