International Research Journal of Engineering and Technology (IRJET) Volume: 04 Issue: 02 | Feb -2017
www.irjet.net
e-ISSN: 2395 -0056 p-ISSN: 2395-0072
A Brief Review of Approaches for Fault Tolerance in Distributed Systems
Shwethashree A¹, Swathi D V²
¹Asst. Prof., Ballari Institute of Technology and Management, Ballari, Karnataka, India
²Asst. Prof., Ballari Institute of Technology and Management, Ballari, Karnataka, India
Abstract - Distributed information processing systems have evolved over the years and are now part of mainstream computing. A major concern in distributed systems is ensuring a predefined level of reliability and availability. These systems are prone to failure because of their high complexity. Hence fault tolerance becomes a major issue to be addressed in designing them. This paper presents a study of various approaches to fault tolerance.

Key Words: Distributed system, Fault tolerance, Redundancy, Replication, Dependability

1. INTRODUCTION
A distributed system consists of a group of autonomous computer systems brought together to provide a set of complex functionalities or services. The computer systems are geographically distributed and heterogeneous in nature, yet the distributed system appears to its users as a single local machine. These systems are advantageous because they allow software and resources to be scaled dynamically. Distributed systems are required to be dependable, with the following characteristics:

• The system must be available and must not fail.
• The system must fulfill timing and other requirements.
• The system's output must be accurate.
• The system must be secure.
• The system must provide safe-mode operation.

Thus, dependability refers to reliability, availability, survivability and safety. To achieve these characteristics, a system must be designed to handle faults and failures dynamically. In a large and dynamic distributed system, millions of computing devices may work together, and faults and failures are inevitable in such a complex design. Failures can cause serious damage to users: in systems such as online railway ticket booking and net banking, a failure may lead to loss of money. Hence the implementation of fault tolerance techniques becomes a key concern.

2. FAULT TOLERANCE APPROACHES
Fault tolerance approaches can be classified into two types: proactive and reactive. Proactive approaches predict errors, faults and failures and replace the suspected components, whereas reactive approaches reduce the effect of faults by taking the necessary actions after they occur. Fault treatment policies can also be used to prevent faults from being reactivated.
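The difference between the two classes can be illustrated with a minimal Python sketch. It is not taken from any cited system: the `Component` class, the `predicted_failure_prob` field (standing in for the output of some failure predictor), the 0.7 threshold and the choice of a restart as the reactive action are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Component:
    name: str
    # Hypothetical output of a failure predictor, in [0, 1].
    predicted_failure_prob: float = 0.0
    failed: bool = False


def proactive_step(active: Component, spare: Component,
                   threshold: float = 0.7) -> Component:
    """Proactive: replace a component that is only suspected of failing."""
    if active.predicted_failure_prob >= threshold:
        return spare  # swap before any fault has actually occurred
    return active


def reactive_step(active: Component) -> str:
    """Reactive: act only after a fault has occurred.

    The action assumed here is a simple restart of the failed component.
    """
    if active.failed:
        active.failed = False
        active.predicted_failure_prob = 0.0
        return "restarted"
    return "ok"


suspect = Component("node-a", predicted_failure_prob=0.9)
spare = Component("node-b")
in_service = proactive_step(suspect, spare)   # node-b takes over early

crashed = Component("node-c", failed=True)
status = reactive_step(crashed)               # fault handled after the fact
```

In the proactive case the service never observes a fault, at the cost of depending on the quality of the prediction; in the reactive case no prediction is needed, but the fault is visible until the recovery action completes.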
3. REDUNDANCY BASED FAULT TOLERANCE
Redundancy means keeping one or more functionally ready components in the system in addition to the component that actually provides the service. Redundancy can be implemented at two levels: the process level and the data (object) level. This approach uses replication to create redundancy. Replication is the process of creating and maintaining multiple copies of data objects or processes.
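As a rough illustration of process-level redundancy, the following Python sketch keeps a list of functionally ready components and routes each request to the first one that is still alive. The names `Component` and `RedundantService`, and the use of `ConnectionError` to signal a failed component, are hypothetical choices for this example only.

```python
class Component:
    """One functionally ready instance of the service."""

    def __init__(self, name):
        self.name = name
        self.alive = True

    def handle(self, request):
        if not self.alive:
            raise ConnectionError(f"{self.name} is down")
        return f"{self.name}:{request}"


class RedundantService:
    """Serves requests as long as at least one component is alive."""

    def __init__(self, components):
        self.components = list(components)

    def handle(self, request):
        # Try the serving component first, then each redundant one in order.
        for component in self.components:
            try:
                return component.handle(request)
            except ConnectionError:
                continue
        raise RuntimeError("all redundant components have failed")


primary = Component("primary")
standby = Component("standby")
service = RedundantService([primary, standby])

before = service.handle("r1")   # served by the primary
primary.alive = False           # simulate a failure of the primary
after = service.handle("r2")    # transparently served by the standby
```

The caller uses the same `handle` call before and after the failure, which is the sense in which the redundant component masks the fault.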
3.1 OBJECT LEVEL REPLICATION
In this approach, multiple replicas of data items or objects are created and maintained at different nodes in the system [1]. An incoming request is served using one of the replicas, so the failure of any single node does not affect the functionality of the system. Several major issues must be addressed when using replication, such as maintaining consistency among replicas, choosing the degree of replication (the number of replicas), and on-demand replication. When a user updates an object, the replication manager must update all replicas to keep the replicas of that object consistent [2]. It is therefore important to have an efficient strategy for managing consistency. In a passive strategy, a primary replica executes the request and propagates the resulting state changes to all other replicas, whereas in an active strategy every replica executes the request individually. The passive method avoids redundant execution, while the active method has a comparatively low response time. Each approach has advantages and disadvantages. Researchers have proposed various algorithms in this regard; simple and adaptive algorithms are comparatively efficient.
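The passive and active strategies described above can be contrasted with a small Python sketch. It is a single-process simplification under stated assumptions: the replicated object is a plain counter, there is no network or failure handling, and the names `CounterReplica`, `passive_update` and `active_update` are hypothetical.

```python
class CounterReplica:
    """A replica of an object whose state is a single integer."""

    def __init__(self):
        self.value = 0
        self.executions = 0  # how many times this replica ran the operation

    def execute(self, amount):
        self.executions += 1
        self.value += amount
        return self.value

    def install_state(self, value):
        # Accept a state change computed elsewhere, without re-executing.
        self.value = value


def passive_update(primary, backups, amount):
    """Passive: only the primary executes, then pushes its state to backups."""
    result = primary.execute(amount)
    for backup in backups:
        backup.install_state(primary.value)
    return result


def active_update(replicas, amount):
    """Active: every replica executes the same request individually."""
    results = [replica.execute(amount) for replica in replicas]
    return results[0]


p_primary, p_backups = CounterReplica(), [CounterReplica(), CounterReplica()]
passive_update(p_primary, p_backups, 5)

a_replicas = [CounterReplica() for _ in range(3)]
active_update(a_replicas, 5)
```

In both cases all replicas end with the same value, but the execution counts differ: under the passive strategy only the primary ran the operation, while under the active strategy each of the three replicas ran it, which is the redundant execution the passive method avoids.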