SeQuiLa: An elastic, fast and scalable SQL-oriented solution for processing and querying genomic intervals
[ 1 ] Instytut Informatyki, Wydział Informatyki, Politechnika Poznańska | [ 2 ] ETH Zurich | [ P ] employee | [ S ] student
2019
scientific article
English
EN Efficient processing of large-scale genomic datasets has recently become possible due to the application of ‘big data’ technologies in bioinformatics pipelines. We present SeQuiLa, a distributed, ANSI SQL-compliant solution for fast querying and processing of genomic intervals that is available as an Apache Spark package. The proposed range join strategy is significantly (∼22×) faster than the default Apache Spark implementation and outperforms other state-of-the-art tools for genomic interval processing.
14.11.2018
2156 - 2158
200
200
5,61