[Feature]: Support Bulk Seed List Uploads #2319


Open
ikreymer opened this issue Jan 18, 2025 · 0 comments
Assignees
Labels
  • back end: Requires back end dev work
  • enhancement: Requests a change to a feature
  • front end: Requires front end dev work
  • ui/ux: This issue requires UI/UX work
  • workflow settings: Issues related to adding or changing settings to instruct the crawler

Comments


ikreymer commented Jan 18, 2025

What change would you like to see?

Users should be able to upload a bulk seed list as a text file, as an alternative to entering URLs in the list text box in the URL List option.

The text file could be of any size, with no limit on the number of seeds it specifies (though other crawl limits may still apply).
The file would be stored in the S3 bucket and mounted as a volume, so the crawler can read it through its existing --seedList functionality.

Some differences from the list text box:

  • Validation: Since this bypasses the frontend, there would be no validation at crawl workflow creation time. However, invalid seeds should appear in the error log soon after the crawl starts running. If failOnFailedSeed is set, invalid seeds should also fail the whole crawl immediately.
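
To make the validation point concrete, here is a minimal Python sketch of how seeds from an uploaded file might be classified once the crawl starts. It is an illustration only, not the crawler's actual logic: the function names `check_seed_lines` and `should_fail_crawl` are hypothetical, and it assumes one URL per line, blank lines ignored, and only http(s) seeds accepted.

```python
from urllib.parse import urlsplit


def check_seed_lines(lines):
    """Split raw seed-list lines into (valid, invalid) lists.

    Assumptions (not from the issue): one URL per line, blank lines
    are skipped, and only http/https URLs with a hostname are valid.
    """
    valid, invalid = [], []
    for raw in lines:
        url = raw.strip()
        if not url:
            continue  # skip blank lines
        try:
            parts = urlsplit(url)
            ok = parts.scheme in ("http", "https") and bool(parts.hostname)
        except ValueError:
            # urlsplit rejects some malformed inputs, e.g. bad IPv6 brackets
            ok = False
        (valid if ok else invalid).append(url)
    return valid, invalid


def should_fail_crawl(invalid, fail_on_failed_seed):
    """Hypothetical stand-in for the failOnFailedSeed behavior:
    fail the whole crawl if the option is set and any seed is invalid."""
    return bool(fail_on_failed_seed and invalid)
```

In this sketch, the `invalid` list is what would be written to the error log, and `should_fail_crawl(invalid, True)` models the case where failOnFailedSeed is set.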

Context

This issue supersedes #1107 and addresses #2312.

@ikreymer ikreymer added the enhancement Requests a change to a feature label Jan 18, 2025
@SuaYoo SuaYoo added front end Requires front end dev work back end Requires back end dev work ui/ux This issue requires UI/UX work labels May 29, 2025
@ikreymer ikreymer moved this from Triage to Todo in Webrecorder Projects May 29, 2025
@SuaYoo SuaYoo added the workflow settings Issues related to adding or changing settings to instruct the crawler label May 29, 2025
Projects
Status: Todo
Development

No branches or pull requests

3 participants