Software-fault tolerance

Ciompi P.; Grandoni F.
1991

Abstract

At the beginning of the Delta-4 project, it was assumed that fault-tolerance mechanisms needed to take only physical faults into account, so that the possibility of design faults could be neglected. In subsequent years, as software design faults came to be recognised as a major source of system service disruption, it was decided to study how to provide the Delta-4 architecture with specific provisions for dealing with this kind of fault. Since the term software-fault tolerance may assume different meanings, we state that it is used here to mean the tolerance of design faults in software. The tolerance of hardware design faults is considered only as a side issue. As a general consideration, the effectiveness of fault-tolerance techniques is not usually limited to a precisely defined class of faults: hardware design faults, software bugs and transient hardware faults often lead to similar behaviour, as discussed in section 6.4.2. Design faults may be present in Delta-4 hardware, operating system software, and applications. Application-level software-fault tolerance can help against errors arising from all three sources (for example, if an operating system error causes messages to be delivered in the wrong order, an application will often be able to recognise this from the expected contents of the messages), but it is mainly directed against errors resulting from faults in the application itself. This chapter, after briefly recalling the main techniques presented in the literature for tolerating software design faults, focusses on the problem of applying some of these techniques in the Delta-4 architecture. Support mechanisms and structuring concepts are presented. The solutions presented are still in the specification phase; no implementation has yet been carried out.
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Software
fault tolerance

Documents in IRIS are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/425205