Article

Title

Machine learning for RNA 2D structure prediction benchmarked on experimental data

Authors

[ 1 ] Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ 2 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ SzD ] doctoral school student | [ P ] employee

Scientific discipline (Law 2.0)

[2.3] Information and communication technology

Year of publication

2023

Published in

Briefings in Bioinformatics

Journal year: 2023 | Journal volume: vol. 24 | Journal number: no. 3

Article type

scientific article

Publication language

English

Keywords
EN
  • RNA 2D structure prediction
  • machine learning
  • deep learning
  • algorithm benchmarking
Abstract

EN Since the 1980s, dozens of computational methods have addressed the problem of predicting RNA secondary structure. Among them are those that follow standard optimization approaches and, more recently, machine learning (ML) algorithms. The former were repeatedly benchmarked on various datasets. The latter, on the other hand, have not yet undergone extensive analysis that could suggest to the user which algorithm best fits the problem to be solved. In this review, we compare 15 methods that predict the secondary structure of RNA, of which 6 are based on deep learning (DL), 3 on shallow learning (SL), and 6 control methods on non-ML approaches. We discuss the ML strategies implemented and perform three experiments in which we evaluate the prediction of (I) representatives of the RNA equivalence classes, (II) selected Rfam sequences, and (III) RNAs from new Rfam families. We show that DL-based algorithms (such as SPOT-RNA and UFold) can outperform SL and traditional methods if the data distribution is similar in the training and testing set. However, when predicting 2D structures for new RNA families, the advantage of DL is no longer clear, and its performance is inferior or equal to that of SL and non-ML methods.

Date of online publication

24.04.2023

Pages (from - to)

bbad153-1 - bbad153-9

DOI

10.1093/bib/bbad153

URL

https://academic.oup.com/bib/article/24/3/bbad153/7140288

Comments

Article Number: bbad153

License type

CC BY-NC (attribution - noncommercial)

Open Access Mode

hybrid journal

Open Access Text Version

final published version

Date of Open Access to the publication

in press

Ministry points / journal

140

Impact Factor

9.5 [List 2022]
