How does Nextflow publish files when using AWS Batch as the executor?

Nextflow publishes files in the following steps:

  1. Finish the running task, which generates all output files.
  2. Copy all output files to the work directory in AWS S3. This step is performed by the task’s EC2 instance, and its speed depends on the aws.batch.maxParallelTransfers setting and the number of CPUs available on that instance. Once the copy finishes, the EC2 instance is either terminated or reused for another pending task.
  3. Copy the output files from the work directory to the final output folder specified by the publishDir directive. This step is performed by the machine that launched the workflow.

Note that both steps 2 and 3 can take significant time when the output files are large.
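
The settings involved in steps 2 and 3 can be sketched as follows. This is a minimal illustration rather than a recommended setup: the bucket, queue, region, and process names are placeholders, and the value 8 is only an example.

```groovy
// nextflow.config (sketch; all names below are placeholders)
process.executor = 'awsbatch'
process.queue    = 'my-batch-queue'       // hypothetical AWS Batch job queue
workDir          = 's3://my-bucket/work'  // step 2 copies task outputs here
aws.region       = 'us-east-1'            // placeholder region

// Number of parallel S3 transfers each task may run (affects step 2).
// 8 is an illustrative value, not a tuned recommendation.
aws.batch.maxParallelTransfers = 8
```

```groovy
// main.nf (sketch; EXAMPLE and result.txt are hypothetical)
process EXAMPLE {
    // Step 3: outputs are copied from the work directory to this folder
    // by the machine that launched the workflow.
    publishDir 's3://my-bucket/results', mode: 'copy'

    output:
    path 'result.txt'

    script:
    """
    echo done > result.txt
    """
}

workflow {
    EXAMPLE()
}
```

Because step 3 runs on the launching machine, that machine has to stay up until publishing completes, which matters most when the outputs are large.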
