SeQuiLa: An elastic, fast and scalable SQL-oriented solution for processing and querying genomic intervals

Marek Wiewiórka; Anna Leśniewska; Agnieszka Szmurło; Kacper Stępień; Mateusz Borowiak; Michał J. Okoniewski; Tomasz Gambin

doi:10.1093/bioinformatics/bty940

Scientific Information System of Poznan University of Technology

Article

Title

SeQuiLa: An elastic, fast and scalable SQL-oriented solution for processing and querying genomic intervals

Authors

Marek Wiewiórka
Anna Leśniewska (WI) ^{[ 1 ][ 2.3 ][ P ]}
Agnieszka Szmurło
Kacper Stępień (WI) ^{[ 1 ][ S ]}
Mateusz Borowiak (WI) ^{[ 1 ][ S ]}
Michał J. Okoniewski ^{[ 2 ]}
Tomasz Gambin

^{[ 1 ]} Institute of Computing Science, Faculty of Computing, Poznan University of Technology | ^{[ 2 ]} ETH Zurich | ^{[ P ]} employee | ^{[ S ]} student

Scientific discipline (Act 2.0)

[2.3] Information and communication technology

Year of publication

2019

Published in

Bioinformatics

Year: 2019 | Volume: vol. 35 | Issue: iss. 12

Article type

scientific article

Publication language

English

Abstract

EN Efficient processing of large-scale genomic datasets has recently become possible due to the application of ‘big data’ technologies in bioinformatics pipelines. We present SeQuiLa, a distributed, ANSI SQL-compliant solution for fast querying and processing of genomic intervals that is available as an Apache Spark package. The proposed range join strategy is significantly (∼22×) faster than the default Apache Spark implementation and outperforms other state-of-the-art tools for genomic interval processing.

Date made available online

14.11.2018

Pages (from-to)

2156 - 2158

DOI

10.1093/bioinformatics/bty940

URL

https://academic.oup.com/bioinformatics/article/35/12/2156/5182295

Ministry score / journal

200

Ministry score / journal in the 2017-2021 evaluation

200