Article

Title

ECHR-OD: On building an integrated open repository of legal documents for machine learning applications

Authors

[ 1 ] Instytut Informatyki, Wydział Informatyki i Telekomunikacji, Politechnika Poznańska | [ P ] employee

Scientific discipline (Law 2.0)

[2.3] Information and communication technology

Year of publication

2022

Published in

Information Systems

Journal year: 2022 | Journal volume: vol. 106

Article type

scientific article

Publication language

English

Keywords
EN
  • Open data repository
  • Legal documents repository
  • Judgment documents
  • European Court of Human Rights
  • Machine learning
  • Classification of legal documents
Abstract

EN This paper presents an exhaustive and unified repository of judgment documents, called ECHR-OD, based on the European Court of Human Rights. The need for such a repository is explained through the prism of the researcher, the data scientist, the citizen, and the legal practitioner. Contrary to many open data repositories, the full creation process of ECHR-OD, from the collection of raw data to the feature transformation, is provided by means of a collection of fully automated and open-source scripts. This ensures reproducibility and a high level of confidence in the processed data, which is one of the most important issues in data governance nowadays. An experimental evaluation was performed to study the problem of predicting the outcome of a case and to establish baseline results for popular machine learning algorithms. The obtained results are consistently good across the binary datasets, with accuracy between 75.86% and 98.32% and an average accuracy of 96.45%, which is 14 pp higher than the best known result with similar methods. We achieved an F1-score of 82%, which is in line with the recent result using BERT. We show that in a multilabel setting, the features available prior to a judgment are good predictors of the outcome, opening the road to practical applications.

Date of online publication

07.06.2021

Pages (from - to)

101822-1 - 101822-20

DOI

10.1016/j.is.2021.101822

URL

https://www.sciencedirect.com/science/article/abs/pii/S0306437921000636

Comments

Article Number: 101822

Ministry points / journal

100

Impact Factor

3.7
