Book

Title

Scalable dimensionality reduction methods for recommender systems

Authors

[ 1 ] Instytut Automatyki i Inżynierii Informatycznej, Wydział Elektryczny, Politechnika Poznańska | [ P ] employee

Year of publication

2016

Book type

scientific monograph

Publication language

English

Abstract

EN Recommendation algorithms are aimed at assisting people in dealing with the excess of information available in a system of one kind or another; nowadays, information overload occurs in many systems, especially on the Internet. Although such algorithms have been widely adopted by both research and e-commerce bodies, the problem that their authors aim to address is still regarded as not fully solved. The research reported in the thesis is aimed at developing a technique that effectively copes with the high data sparsity problem, but at the same time is not more computationally complex than the state-of-the-art collaborative and content-based filtering methods. The vector-based model has been used to represent data since it guarantees a flexible way of storing and processing information. To address the scalability issue, the algorithms proposed by the author are based on two-stage processing of the input data. First, the points representing the modelled data in the vector space are projected onto a randomly selected subspace of reduced dimensionality. According to the Johnson-Lindenstrauss lemma, the size of the input matrix may be significantly reduced while still approximately preserving the distances between points in the vector space. Subsequently, the resulting vector space is factorized to preserve only the most salient features. Moreover, as real-world data sets are often very sparse, the thesis also focuses on the most challenging case of extremely high collaborative data sparsity, which rules out the use of many widely referenced methods. For this reason, the dimensionality reduction methods and reflective data processing are investigated, by carrying out a series of experiments, from the perspective of their ability to produce high-precision recommendations and to cope with the high unpredictability of data sparsity.
The results of the theoretical research have been evaluated according to a well-established methodology using publicly available data sets and following scenarios corresponding to real-world demands. As in practice it is sufficient to identify only a small set of items for each user from a vast set of choices, the evaluation is based on the so-called find-good-items task, rather than on the low-error-of-ratings prediction. This dissertation presents and evaluates a range of methods, applicable to recommender systems, that have been developed over the past several years, together with methods proposed by the author. As shown in the analysis of experimental results, the proposed solutions make recommendation techniques more reliable, more accurate, and applicable to a wider range of real-world applications. The implementation of the proposed algorithm, major insights, and examples of the system's applicability are also discussed. Based on the presented analytical research and experimental results, the author states that vector-space recommendation techniques and dimensionality reduction methods may be combined in a way that preserves the high quality of recommendations, regardless of the amount of processed heterogeneous data.
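
The first stage described in the abstract, projection onto a random lower-dimensional subspace, can be illustrated with a minimal sketch. This is not the author's implementation: the Gaussian projection matrix, the toy sparse data, and the names random_projection, distance, d and k are all assumptions chosen for illustration, and the second (factorization) stage is not shown.

```python
import math
import random


def random_projection(points, k, seed=0):
    """Project d-dimensional points onto k random Gaussian directions.

    Entries of the d x k projection matrix are drawn i.i.d. from N(0, 1/k),
    so squared distances are preserved in expectation.
    """
    rng = random.Random(seed)
    d = len(points[0])
    scale = 1.0 / math.sqrt(k)
    proj = [[rng.gauss(0.0, 1.0) * scale for _ in range(k)] for _ in range(d)]
    reduced = []
    for x in points:
        # Only non-zero coordinates contribute, which suits sparse data.
        nonzero = [(i, v) for i, v in enumerate(x) if v != 0]
        reduced.append([sum(v * proj[i][j] for i, v in nonzero) for j in range(k)])
    return reduced


def distance(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


# Toy data (hypothetical): 6 sparse rating vectors over d = 1000 items,
# each entry non-zero with probability 0.05 and a rating from 1 to 5.
d, k = 1000, 200
data_rng = random.Random(1)
points = [
    [data_rng.randint(1, 5) if data_rng.random() < 0.05 else 0 for _ in range(d)]
    for _ in range(6)
]

reduced = random_projection(points, k)

# Ratio of projected distance to original distance for every pair of points.
ratios = [
    distance(reduced[i], reduced[j]) / distance(points[i], points[j])
    for i in range(len(points))
    for j in range(i + 1, len(points))
]
print("min ratio:", min(ratios), "max ratio:", max(ratios))
```

With k = 200 the pairwise distance ratios should stay close to 1, which is the behaviour the Johnson-Lindenstrauss lemma describes; a smaller k trades accuracy for a smaller matrix.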

Place

Saarbrücken, Germany

Publisher name

Scholar's Press

Date of publication

2016

Number of pages

208

ISBN

978-3-659-83675-6

Keywords
EN
  • collaborative filtering
  • dimensionality reduction
  • machine learning
  • reflective random indexing
  • statistical relational learning
Comments

Published version of the doctoral dissertation, which can be found at:

https://sin.put.poznan.pl/dissertations/details/d320
