Aitor Arjona
Universitat Rovira i Virgili
Arnau Gabriel-Atienza
Universitat Rovira i Virgili
Sara Lanuza-Orna
Universitat Rovira i Virgili
Xavier Roca-Canals
Universitat Rovira i Virgili
Ayman Bourramouss
Universitat Rovira i Virgili
Tyler K. Chafin
Biomathematics and Statistics Scotland
Lucio Marcello
Biomathematics and Statistics Scotland
Paolo Ribeca
Biomathematics and Statistics Scotland
Pedro Garcia-Lopez
Universitat Rovira i Virgili

Scaling a Variant Calling Genomics Pipeline with FaaS

As genomic data grows in complexity and volume, the HPC infrastructure of biology institutions is reaching its capacity limits. While the Cloud offers a viable solution for short-term elasticity, its intricacies pose challenges for bioinformatics users. Serverless computing, by contrast, allows workloads to scale with minimal developer burden. However, porting a scientific application to serverless is not straightforward. In this article, we present a Variant Calling genomics pipeline migrated from single-node HPC to a serverless architecture. We describe the inherent challenges of this approach and the engineering effort required to achieve scalability. We contribute by open-sourcing the pipeline, both for future systems research and as a scalable, user-friendly tool for the bioinformatics community.