-
Notifications
You must be signed in to change notification settings - Fork 86
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
perf: parallelize page setup #334
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't see any big question marks besides the one map in the promise.all. this is so dope!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!!! thanks so much for doing this and so quickly too!!!!!! 🤯
@@ -31,7 +31,7 @@ const copyDynamicImportChunks = async (functionPath) => { | |||
// Chunks are copied into the nextPage directory, as a sibling to pages or api. | |||
// This matches the Next output, so that imports work correctly | |||
const copyPath = join(functionPath, 'nextPage', 'chunks') | |||
logTitle('💼 Copying WB5 dynamic import chunks to', copyPath) | |||
// logTitle('💼 Copying WB5 dynamic import chunks to', copyPath) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
y
jobs.push({ type: 'function', filePath, functionsPath }) | ||
}), | ||
) | ||
return jobs | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure what's happening here - why do we have a map (which returns nothing) inside an awaited Promise.all, only to return jobs
outside of it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The callback in the map is async (because it has to await isRouteWithFallback
) , so it automatically returns a promise which resolves (to void
) when it's complete. This means that the map returns an array of promises (one for each page). Passing these to an await
ed Promise.all
means that it waits until all of the functions are complete. With an async callback like that, the map will immediately return the array of promises, before the callback has executed for each of the elements. It would be somehting like this. Apologies if you know all this:
const result = pages.map(async () => {
await something()
})
// We get here immediately, before the callbacks execute
console.log(result)
// [ <unresolved Promise>, <unresolved Promise>, <unresolved Promise>, ...]
// ...the callbacks execute in parallel, and resolve one by one. we might wait here a while
const done = await Promise.all(result)
// once we're here, the jobs are all complete
console.log(result)
// [ <resolved Promise>, <resolved Promise>, <resolved Promise>, ...]
console.log(done)
// [void, void, void, ...]
return
//
The reason I'm using the jobs array from outside of the map is so that I can conditionally add elements. If I returned them, I'd end up with an array with undefined
elements for the pages we're skipping. We could then filter that array, but that means we'd need to loop through them again. Pushing onto an array outside means we only need to loop through once. I could use reduce
for the same effect, but I generally find that's harder to read.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ooh ok i dont think i processed that the map callback is async. ok wow this is wild!!!
...(await withoutPropsSetup(publishPath)), | ||
] | ||
|
||
return runJobsQueue(jobs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
does this need to be returned? just curious
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because runJobsQueue
returns a Promise, returning it means we can await
the return value outside. It's similar to just await
ing runJobsQueue()
before returning, but saves us creating another Promise.
src/lib/helpers/runJobsQueue.js
Outdated
const runJobsQueue = (jobs) => { | ||
console.log(`Building ${jobs.length} pages`) | ||
|
||
const queue = fastq.promise(processPage, 4) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what's the significance of 4 here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
None! I should make it a constant. I need to make a massive site so I can benchmark it properly and find a good value.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!!!!!!!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
!!!!!!!! x2
* chore: run Next.js e2e tests again many versions * chore: use version in task name * chore: handle old test manifest format * chore: fix syntax * chore: cache playwright and allow prerelease for semver check * chore: only publish results for main branch * chore: name cache step * chore: use js for e2e manifest * chore: better playwright cache handling * chore: annotate some skipped tests * chore: don't cache playwright * chore: use js filter file * chore: notify Slack for test results * chore: fix hashfiles target * chore: update paths * chore: enable junit for older Next.js * chore: add retry manifest and fix syntax * chore: format with prettier * chore: manage concurrency * chore: notify Slack * chore: conditions for slack notify * chore: split with timings * chore: fix command * chore: temporarily just run latest * chore: fix slack command * chore: fewer groups for scheduled runs * chore: use bot * chore: switch back to webhook * chore: set total groups --------- Co-authored-by: ascorbic <[email protected]>
This PR refactors the page setup so that each of the funcitons returns an array of jobs, rather than processing the pages at that time. This means we can run them in parallel, which we do with a queue. I don't have a large site to test this with, so it's hard to see what the effects are: the test sites are all under half a second already.
There is quite a bit of duplication still, with helpers that are almost identical, so there's scope for more refactoring.
Fixes #152