Failure detection service for large scale systems
[ 1 ] Instytut Informatyki (II), Wydział Informatyki i Zarządzania, Politechnika Poznańska | [ P ] employee
2007
paper
english
- failure detector
- large scale systems
- gossip-style protocol
- fault tolerance
- multi-agent systems
EN This paper addresses the problem of building a failure detection service for large-scale distributed systems and multi-agent systems. It describes the failure detector mechanism and defines the roles it plays in the system. It then presents the key construction problems that are fundamental to building a failure detection service. Finally, it sketches a general framework for implementing such a service. The proposed failure detection service can be used by mobile agents as a crucial component for building fault-tolerant multi-agent systems.
675 - 684