Evaluation of fault-tolerant multiprocessor systems for high assurance applications

Grandoni F; Chiaradonna S; Di Giandomenico F
2001

Abstract

In designing high-assurance systems, dependability goals are achieved by adopting several fault-tolerance techniques. Unfortunately, their combined effect on the system cannot, in the general case, be derived by straightforward composition of the stand-alone analyses of the individual components, because their controlling parameters are mutually dependent. In this paper, the overall system dependability resulting from such an integrated fault-tolerance organization is assessed through stochastic simulation. To this purpose, a few fault-tolerant multiprocessor architectures, which integrate standard error-processing structures with a recently proposed diagnostic mechanism called α-count, are selected and evaluated. The diagnostic mechanism takes its input (error signals) from the error-processing mechanism, whose behaviour is in turn influenced by how quickly and correctly α-count identifies processors affected by permanent or intermittent faults. The choice of the basic fault-tolerance mechanisms, as well as of the reference system architecture, has been driven by the characteristics of the envisaged target applications: mainly, stringent dependability requirements, to be traded off against adequate levels of performance and cost. The analysis focuses on performability, an appropriate measure for evaluating whether one design is 'better' than another from both the dependability and the performance points of view.
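
The abstract does not give the details of α-count. As a rough illustration only, the sketch below assumes the commonly described form of the heuristic: a per-processor score grows by one on each error signal, decays geometrically on each error-free judgement, and the processor is flagged once the score reaches a threshold. The class name `AlphaCount`, the helper `first_flagged_step`, and the values of `decay` and `threshold` are hypothetical and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Iterable, Optional


@dataclass
class AlphaCount:
    """Illustrative count-and-threshold filter (assumed form, not the paper's code)."""
    decay: float = 0.9      # assumed decay factor applied after an error-free judgement
    threshold: float = 3.0  # assumed score at which the processor is flagged
    score: float = 0.0

    def observe(self, error: bool) -> bool:
        """Update the score with one error signal; return True if flagged."""
        if error:
            self.score += 1.0          # each error signal adds one to the score
        else:
            self.score *= self.decay   # error-free judgements let the score fade
        return self.score >= self.threshold


def first_flagged_step(signals: Iterable[bool],
                       decay: float = 0.9,
                       threshold: float = 3.0) -> Optional[int]:
    """Return the index of the first signal at which the processor is flagged."""
    counter = AlphaCount(decay=decay, threshold=threshold)
    for step, error in enumerate(signals):
        if counter.observe(error):
            return step
    return None  # never flagged: errors were too sparse to look non-transient


if __name__ == "__main__":
    transient = [True] + [False] * 20   # one isolated error
    intermittent = [True, False] * 5    # an error on every other judgement
    print(first_flagged_step(transient))
    print(first_flagged_step(intermittent))
```

Under these assumed parameters, an isolated error fades away without triggering a diagnosis, while recurring errors accumulate until the threshold is crossed. This is the kind of discrimination between transient and permanent or intermittent faults on which the error-processing mechanism depends.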
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Error analysis
Fault tolerant computer systems
Program diagnostics
Random processes
Multiprocessing systems
Dependability
Fault-tolerant architectures
Fault tolerance
Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/43533