

Searching for the Origins of Life – Detecting RNA Life Signatures Using Learning Vector Quantization


[ 1 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ S ] student | [ P ] employee

Scientific discipline (Law 2.0)

[2.3] Information and communication technology

Year of publication


Chapter type

chapter in monograph / paper

Publication language



EN The most plausible hypothesis for explaining the origins of life on Earth is the RNA world hypothesis, which is supported by a growing number of research results from various scientific areas. It is frequently supposed that a hypothetical species existed on Earth whose base RNA sequence was probably dissimilar from any genome known today. It is hard to distinguish hypothetical sequences obtained by computer simulations from biological sequences and, hence, to decide which characteristics provide biological functionality. In the present study, biological sequences obtained from RNA viruses are compared with computationally generated sequences (artificial life probes). The task is to discriminate the samples by their origin, biological or artificial. We used the learning vector quantization (LVQ) model as the classifier. LVQ is a dissimilarity-based classifier that places only weak requirements on the underlying dissimilarity measure. This makes it possible to investigate several dissimilarity measures with respect to their discriminating behavior for this task. In particular, we consider information-theoretic dissimilarities such as the normalized compression distance (NCD) and divergences based on bag-of-words (BoW) vectors generated on the basis of nucleotide codons. Additionally, the geodesic path distance (GPD) is applied, using a unary coding of sequences for a representation in the underlying Grassmann manifold. Both BoW and GPD allow continuous updates of prototypes in the feature space and in the Grassmann manifold, respectively, whereas NCD restricts the application of LVQ methods to median variants.
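As a rough illustration of two of the dissimilarity inputs named in the abstract, the following Python sketch computes an NCD between two sequences and a codon-based BoW vector. It is not the authors' implementation. The choice of zlib as the compressor, the non-overlapping reading frame starting at position 0, the relative-frequency normalization, and the function names `ncd` and `codon_bow` are all assumptions made for this example.

```python
import zlib
from collections import Counter
from itertools import product


def compressed_size(data: bytes) -> int:
    """Length in bytes of the zlib-compressed input (compressor choice is an assumption)."""
    return len(zlib.compress(data, 9))


def ncd(x: str, y: str) -> float:
    """Normalized compression distance: (C(xy) - min(C(x), C(y))) / max(C(x), C(y))."""
    bx, by = x.encode("ascii"), y.encode("ascii")
    cx, cy = compressed_size(bx), compressed_size(by)
    cxy = compressed_size(bx + by)
    return (cxy - min(cx, cy)) / max(cx, cy)


# All 64 nucleotide triplets over the RNA alphabet, in a fixed order.
CODONS = ["".join(p) for p in product("ACGU", repeat=3)]


def codon_bow(seq: str) -> list[float]:
    """Relative codon frequencies, reading non-overlapping triplets from position 0
    (the reading frame is an assumption for this sketch)."""
    counts = Counter(seq[i:i + 3] for i in range(0, len(seq) - 2, 3))
    total = sum(counts[c] for c in CODONS)
    return [counts[c] / total if total else 0.0 for c in CODONS]
```

Because NCD yields only pairwise values and no vector representation, prototypes cannot be shifted continuously, which is consistent with the abstract's remark that NCD restricts LVQ to median variants; the BoW vectors, by contrast, live in a 64-dimensional feature space where prototype updates are possible.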

Pages (from - to)

324 - 333





Advances in Self-Organizing Maps, Learning Vector Quantization, Clustering and Data Visualization : Proceedings of the 13th International Workshop, WSOM+ 2019, Barcelona, Spain, June 26-28, 2019

Presented on

13th International Workshop on Self-Organizing Maps WSOM 2019, 26-28.06.2019, Barcelona, Spain

Ministry points / chapter

