All publications |
A survey on fault tolerance (FT). Sections 3, 4, and 5 are noteworthy: they give a formal approach to FT and discuss redundancy (introduced as necessary but not sufficient for FT). |
An approach that proposes extending some real-time system calls so that a recovery point is saved when the user invokes them. If checkpointing is done frequently, computer performance degrades because a great amount of temporal data is stored. The approach tries to reduce the performance loss in two ways: checkpoints are saved only at the end of a control cycle (which reduces the number of checkpoints) and only when a write is done (which reduces the number of system calls affected). The checkpoint is taken at the end of these services. |
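The two reductions described in this entry can be illustrated with a minimal sketch. It is not the paper's implementation: the class `CheckpointedController` and its methods `write`, `end_cycle`, and `recover` are hypothetical names, and an in-memory copy stands in for whatever stable storage the real system calls would use.

```python
import copy


class CheckpointedController:
    """Toy control loop that saves a recovery point only after a cycle that wrote state."""

    def __init__(self, initial_state):
        self.state = dict(initial_state)
        self._recovery_point = copy.deepcopy(self.state)
        self._dirty = False          # set when a write happens during the current cycle
        self.checkpoints_taken = 0

    def write(self, key, value):
        # Hypothetical stand-in for an extended "write" system call.
        self.state[key] = value
        self._dirty = True

    def end_cycle(self):
        # Checkpoint only at the end of a control cycle, and only if something was written.
        if self._dirty:
            self._recovery_point = copy.deepcopy(self.state)
            self.checkpoints_taken += 1
            self._dirty = False

    def recover(self):
        # Roll back to the last saved recovery point.
        self.state = copy.deepcopy(self._recovery_point)
        self._dirty = False


ctrl = CheckpointedController({"setpoint": 0})
ctrl.write("setpoint", 10)
ctrl.write("setpoint", 12)   # two writes in one cycle
ctrl.end_cycle()             # a single checkpoint covers both writes
ctrl.end_cycle()             # read-only cycle: no checkpoint is taken
ctrl.write("setpoint", 99)   # assume a fault strikes before this cycle ends
ctrl.recover()               # state returns to the last recovery point
```

In this sketch, two writes and two cycles cost one checkpoint, and the rollback discards only the work of the interrupted cycle.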
Explains fixed-interval checkpointing (CP) together with its optimization, and then an online algorithm for CP placement. A performance comparison is made. |
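As a rough illustration of what optimizing a fixed checkpoint interval can mean, the sketch below uses Young's well-known first-order approximation, T ≈ sqrt(2 · C · MTBF). This is a swapped-in textbook formula, not necessarily the optimization used in the paper; the function names and the overhead model are assumptions made for the example.

```python
import math


def young_interval(checkpoint_cost, mtbf):
    """Young's first-order approximation of the best fixed checkpoint interval."""
    return math.sqrt(2.0 * checkpoint_cost * mtbf)


def overhead_fraction(interval, checkpoint_cost, mtbf):
    """First-order overhead model (an assumption of this sketch):
    checkpoint cost per unit of work, plus the expected rework after a failure,
    taken as half an interval on average."""
    return checkpoint_cost / interval + interval / (2.0 * mtbf)


# Example values (hypothetical): 2 s per checkpoint, one failure per hour on average.
best = young_interval(2.0, 3600.0)
for t in (60.0, best, 240.0):
    print(f"interval={t:6.1f}s  overhead={overhead_fraction(t, 2.0, 3600.0):.4f}")
```

Under this model, intervals shorter than the optimum waste time on checkpoints and longer ones waste time on rework, which is the trade-off a fixed-interval scheme has to balance.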
A very short survey explaining SAFETY and LIVENESS properties for distributed systems |
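The usual informal reading is that a safety property says "nothing bad ever happens" and a liveness property says "something good eventually happens". The sketch below illustrates the difference on finite execution traces, using mutual exclusion as the example; the trace encoding (one set of processes in the critical section per step) and both function names are assumptions made for illustration, not taken from the survey.

```python
def first_mutex_violation(trace):
    """Safety check: index of the first step where two processes share the
    critical section, or None. A violation is visible in a finite prefix."""
    for i, in_critical_section in enumerate(trace):
        if len(in_critical_section) > 1:
            return i
    return None


def has_entered(trace, process):
    """Liveness-style check on a finite trace: has `process` entered the
    critical section yet? False on a prefix does not refute liveness,
    because a longer run may still satisfy it."""
    return any(process in in_critical_section for in_critical_section in trace)


trace = [set(), {"p"}, set(), {"q"}, {"p", "q"}]
print(first_mutex_violation(trace))   # step at which safety is violated
print(has_entered(trace[:3], "q"))    # not yet, but the prefix proves nothing
print(has_entered(trace, "q"))        # satisfied once the run is long enough
```

The asymmetry is the point: a safety violation is final once observed, whereas a liveness property can only be confirmed, never refuted, by looking at a finite prefix.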
The paper proposes basic concepts that are used to explain hardware and software architectures for fault-tolerant distributed systems, for example the concepts of server, the depends-on relation, failure, and failure semantics. The general issues in hardware and software architectures are presented. For software fault tolerance, the issues related to synchronisation (close or loose) are discussed. The paper was revised in 1993, but the references predate 1990. Some examples of industrial fault-tolerant architectures are given. Available at ftp://ftp.cs.ucsd.edu/pub/team/understandingftsystems.ps.Z |
The authors show that fail-silent platforms can be realized with limited area overhead and virtually no performance penalty. Five architectures are compared: (1) a single CPU with no error-correcting code (ECC) techniques; (2) a lock-step dual-processor architecture with ECC for bus and memory access; (3) a shared-memory, loosely synchronized dual-processor architecture with ECC for memory access; (4) a triple modular redundant architecture with ECC for bus and memory access; (5) a shared-memory dual lock-step architecture (i.e. with 4 processors) with ECC for bus and memory access. For low-range X-by-wire automotive systems, architectures (2) or (3) are the best solutions. For high-range systems, architecture (5) is the best solution. |
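The fail-silent idea behind the dual-processor architectures can be sketched in software as duplication with comparison: two replicas compute the same result and the output is suppressed when they disagree. This is only an analogy for the hardware schemes compared in the paper; it does not model ECC, buses, or synchronization, and `fail_silent_pair` and the replica functions are hypothetical names.

```python
from typing import Callable, Optional


def fail_silent_pair(primary: Callable[[int], int],
                     shadow: Callable[[int], int],
                     x: int) -> Optional[int]:
    """Run both replicas and compare. Return the result if they agree,
    otherwise return None (stay silent instead of emitting a wrong value)."""
    a = primary(x)
    b = shadow(x)
    return a if a == b else None


def healthy(x: int) -> int:
    return 2 * x


def bit_flipped(x: int) -> int:
    # Simulated fault: the lowest bit of the result is flipped.
    return (2 * x) ^ 1


print(fail_silent_pair(healthy, healthy, 21))      # replicas agree: value is output
print(fail_silent_pair(healthy, bit_flipped, 21))  # mismatch detected: no output
```

A pair like this detects a single replica fault but cannot tell which replica is wrong, which is why masking the fault takes additional redundancy, as in the triple modular redundant and four-processor options.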
The paper claims that a fault-tolerant system consists of a fault-intolerant system and a set of fault-tolerance components. These FT components are introduced as Detectors (comparators, error detection codes, consistency checkers, watchdog programs, snoopers, alarms, snapshot procedures, acceptance tests, and exception conditions) and Correctors (voters, error correction codes, reset procedures, rollback recovery, rollforward recovery, constraint (re)satisfaction, exception handlers, and alternate procedures in recovery blocks). Gives a formal treatment of programs, faults, specifications, FT, etc. This paper is included in kulkarni-thesis.pdf. |
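One pairing from these lists, an acceptance test as detector and alternate procedures in a recovery block as corrector, can be sketched as follows. This is an illustrative sketch rather than the paper's formal model: `recovery_block`, `buggy_isqrt`, and `isqrt_acceptance` are hypothetical names, and state rollback between alternates is omitted because the alternates here have no side effects.

```python
import math


def recovery_block(alternates, acceptance_test, x):
    """Try each alternate in order. The acceptance test (detector) decides
    whether a result is acceptable; falling through to the next alternate
    plays the corrector role."""
    for alternate in alternates:
        try:
            result = alternate(x)
        except Exception:
            continue  # an exception is treated as a detected error
        if acceptance_test(x, result):
            return result
    raise RuntimeError("all alternates failed the acceptance test")


def buggy_isqrt(x):
    # Fault-intolerant primary: wrong for most inputs.
    return x // 2


def isqrt_acceptance(x, r):
    # Detector: r is the integer square root of x iff r*r <= x < (r+1)^2.
    return r * r <= x < (r + 1) * (r + 1)


print(recovery_block([buggy_isqrt, math.isqrt], isqrt_acceptance, 10))
```

The fault-intolerant primary stays unchanged; tolerance comes entirely from the detector and corrector composed around it, which mirrors the decomposition the paper argues for.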
This material is presented to ensure timely dissemination of scholarly and technical work. Copyright and all rights therein are retained by authors or by other copyright holders. All persons copying this information are expected to adhere to the terms and constraints invoked by each author's copyright. In most cases, these works may not be reposted without the explicit permission of the copyright holder.
This document was translated from BibTeX by bibtex2html