seqeralabs/nf-stresstest is a Nextflow pipeline that simulates workloads with different input data profiles: a few large files or many small files.
The pipeline performs the following stress tests on big files:
- Generate fake FASTQ files
- Concatenate generated FASTQ files
- Compress the FASTQ files
- Compute an MD5 checksum of the resulting archive
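The large-file steps above can be sketched in plain Python. This is an illustrative approximation, not the pipeline's actual processes: the function names, file names, read length, and the tiny default sizes are all assumptions chosen so that the example runs quickly.

```python
import gzip
import hashlib
import random
import shutil
import tempfile
from pathlib import Path


def write_fake_fastq(path: Path, num_reads: int, read_len: int = 50, seed: int = 0) -> None:
    """Write num_reads random FASTQ records (4 lines each) to path."""
    rng = random.Random(seed)
    with path.open("w") as fh:
        for i in range(num_reads):
            seq = "".join(rng.choice("ACGT") for _ in range(read_len))
            qual = "I" * read_len  # constant placeholder quality string
            fh.write(f"@read_{i}\n{seq}\n+\n{qual}\n")


def concatenate(parts: list[Path], merged: Path) -> None:
    """Append every part, in order, into a single merged file."""
    with merged.open("wb") as out:
        for part in parts:
            with part.open("rb") as src:
                shutil.copyfileobj(src, out)


def gzip_file(src: Path, dest: Path) -> None:
    """Compress src into a gzip archive at dest."""
    with src.open("rb") as fin, gzip.open(dest, "wb") as fout:
        shutil.copyfileobj(fin, fout)


def md5sum(path: Path) -> str:
    """Return the MD5 hex digest of a file, read in 1 MiB chunks."""
    digest = hashlib.md5()
    with path.open("rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def run_big_file_test(workdir: Path, num_files: int = 3, total_reads: int = 100) -> tuple[Path, str]:
    """Generate, concatenate, compress, and checksum; sizes here are toy values."""
    parts = []
    for n in range(num_files):
        part = workdir / f"fake_{n}.fastq"  # hypothetical file name
        write_fake_fastq(part, total_reads, seed=n)
        parts.append(part)
    merged = workdir / "merged.fastq"
    concatenate(parts, merged)
    archive = workdir / "merged.fastq.gz"
    gzip_file(merged, archive)
    return archive, md5sum(archive)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        archive, checksum = run_big_file_test(Path(tmp))
        print(archive.name, checksum)
```

Raising `num_files` and `total_reads` in this sketch increases disk and CPU load in the same spirit as the pipeline's parameters, although the real pipeline may generate and compress data differently.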
In parallel, the pipeline can also perform the following stress tests on small files:
- Generate many small files and their corresponding MD5 checksums
- Count the small files
- Rename and compress the files
- Decompress and verify the checksum
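The small-file steps above can be approximated in a similar way. The following is a minimal sketch under stated assumptions: the helper names, the `renamed_` prefix, the file contents, and the use of a gzip-compressed tar archive are hypothetical choices for illustration and may not match what the pipeline does internally.

```python
import hashlib
import tarfile
import tempfile
from pathlib import Path


def make_small_files(directory: Path, count: int) -> dict[str, str]:
    """Create count tiny files and return a {file name: MD5 digest} map."""
    checksums = {}
    for i in range(count):
        path = directory / f"small_{i}.txt"  # hypothetical naming scheme
        path.write_text(f"payload {i}\n")
        checksums[path.name] = hashlib.md5(path.read_bytes()).hexdigest()
    return checksums


def count_files(directory: Path) -> int:
    """Count regular files directly inside directory."""
    return sum(1 for p in directory.iterdir() if p.is_file())


def rename_and_compress(directory: Path, checksums: dict[str, str], archive: Path) -> dict[str, str]:
    """Rename each file, add it to a .tar.gz archive, and return the
    expected digests keyed by the new names."""
    renamed = {}
    with tarfile.open(archive, "w:gz") as tar:
        for old_name, digest in checksums.items():
            new_path = directory / f"renamed_{old_name}"
            (directory / old_name).rename(new_path)
            tar.add(new_path, arcname=new_path.name)
            renamed[new_path.name] = digest
    return renamed


def verify_archive(archive: Path, expected: dict[str, str]) -> bool:
    """Decompress each archive member in memory and compare its MD5
    digest with the one recorded before compression."""
    with tarfile.open(archive, "r:gz") as tar:
        members = [m for m in tar.getmembers() if m.isfile()]
        if {m.name for m in members} != set(expected):
            return False
        for member in members:
            data = tar.extractfile(member).read()
            if hashlib.md5(data).hexdigest() != expected[member.name]:
                return False
    return True


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        work = Path(tmp) / "files"
        work.mkdir()
        sums = make_small_files(work, 100)
        print("files:", count_files(work))
        archive = Path(tmp) / "small_files.tar.gz"
        renamed = rename_and_compress(work, sums, archive)
        print("verified:", verify_archive(archive, renamed))
```

The archive is written outside the working directory so that the file count is not affected by the archive itself.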
The pipeline requires Nextflow >=23.10.0.
This pipeline can be executed with the following command:
nextflow run seqeralabs/nf-stresstest
You can modify the following parameters to adjust the file size and file number being created for stress testing, as well as which processes to selectively run or skip:
Parameter | Description | Default |
---|---|---|
--total_reads | Total number of reads per file (10k reads generates a ~1GB file) | 10000 |
--num_files | Number of FASTQ files to generate in parallel and concatenate | 10 |
--run | Tools to selectively run | |
--skip | Tools to selectively skip | |
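When launching the pipeline from a script, the parameter overrides can be assembled programmatically. The sketch below only builds and prints the command line; the `build_command` helper and the example values (20000 reads, 5 files) are assumptions for illustration, and `--run` and `--skip` are left out because their accepted values are not listed here.

```python
import shlex


def build_command(total_reads: int = 10000, num_files: int = 10) -> list[str]:
    """Return the argument list for a run with the given overrides.

    Defaults mirror the values in the parameter table.
    """
    return [
        "nextflow", "run", "seqeralabs/nf-stresstest",
        "--total_reads", str(total_reads),
        "--num_files", str(num_files),
    ]


if __name__ == "__main__":
    # Illustrative values: fewer, larger files than the defaults.
    cmd = build_command(total_reads=20000, num_files=5)
    print(shlex.join(cmd))
```

Passing the resulting list to `subprocess.run` would start the pipeline, assuming `nextflow` is installed and available on `PATH`.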
nf-stresstest was written by the Scientific Development and Engineering teams at Seqera Labs.