Scientists who run workflows often have access to both High-Performance Computing (HPC) and High-Throughput Computing (HTC) sites, but the architecture of HPC sites is less conducive to HTC paradigms. The choice of middleware and site can drastically change how a given workflow performs. To explore these differences, we created tools that expand the capabilities of Makeflow and Work Queue. We then performed four speed-of-light tests, measuring job dispatch rate, data delivery from the master to the worker, system bandwidth, and metadata operations. Next, we conducted three synthetic workflow tests: a purely data-consumptive workflow, a data-selectivity workflow, and a data-generating workflow. Finally, we tested our middleware with three real-world workflows: BWA-GATK, BLAST, and Lifemapper. We also created a short guide that helps users match site, workflow, and middleware.
Understanding Dramatic Performance Differences in Workflow/Middleware/Site Combinations: CCL Technical Report, October 15th, 2018
|Record Visibility and Access|Public|