Reward Function and Configuration Parameters in Machine Learning of a Four-Legged Walking Robot

Arkadiusz Kubacki; Marcin Adamek; Piotr Baran

doi:10.3390/app131810298

Scientific Information System of Poznan University of Technology


Article


Title

Reward Function and Configuration Parameters in Machine Learning of a Four-Legged Walking Robot

Authors

Arkadiusz Kubacki (WIM) [1] [2.9] [P]
Marcin Adamek (WIM) [1] [P]
Piotr Baran (WIM) [1] [P]

[1] Institute of Mechanical Technology, Faculty of Mechanical Engineering, Poznan University of Technology | [P] employee

Scientific discipline (Act 2.0)

[2.9] Mechanical engineering

Year of publication

2023

Published in

Applied Sciences

Year: 2023 | Volume: 13 | Issue: 18

Article type

research article

Publication language

English

Keywords

EN

walking robot
quadruped
artificial neural network
reinforcement learning
robots
unity
ML-Agents
ML-Agents toolkit
Crawler
reward function
configuration parameters

Abstract

EN In contemporary times, the use of walking robots is gaining increasing popularity and is prevalent in various industries. The ability to navigate challenging terrains is one of the advantages that they have over other types of robots, but they also require more intricate control mechanisms. One way to simplify this issue is to take advantage of artificial intelligence through reinforcement learning. The reward function is one of the conditions that governs how learning takes place, determining what actions the agent is willing to take based on the collected data. Another aspect to consider is the predetermined values contained in the configuration file, which describe the course of the training. The correct tuning of them is crucial for achieving satisfactory results in the teaching process. The initial phase of the investigation involved assessing the currently prevalent forms of kinematics for walking robots. Based on this evaluation, the most suitable design was selected. Subsequently, the Unity3D development environment was configured using an ML-Agents toolkit, which supports machine learning. During the experiment, the impacts of the values defined in the configuration file and the form of the reward function on the course of training were examined. Movement algorithms were developed for various modifications for learning to use artificial neural networks.

Pages (from-to)

10298-1 - 10298-20

DOI

10.3390/app131810298

URL

https://www.mdpi.com/2076-3417/13/18/10298

License type

CC BY (attribution)

Open access mode

open access journal

Open access text version

final published version