Fail-safeness in a multiprocessor system: a distributed strategy based on backward error recovery

1982

Abstract

A method for fault handling is presented, designed for multiprocessor systems that support concurrent processes cooperating through message exchange. The proposal is described with reference to a specific system, i.e., the MuTEAM prototype developed in Pisa; our requirement was that no erroneous output be generated by the system under a single-fault hypothesis. The fault-handling model adopted is based on backward error recovery: the set of all application processes is partitioned into disjoint subsets (called families), each of which is the atomic unit of recovery. Recovery points are established on communications among families. A single consistent recovery line is maintained, thereby avoiding the domino effect. The model does not rely on mass storage devices: rather, the recovery information pertinent to all the processes is kept in the distributed main memory of the system.
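The abstract gives no protocol details, so the following is only a toy sketch of the general idea, not the MuTEAM mechanism. It assumes that a recovery line is a set of in-memory snapshots, one per family, refreshed whenever a message crosses a family boundary, and that a fault rolls every family back to that line. The names `Family`, `System`, `send`, and `recover` are hypothetical.

```python
import copy


class Family:
    """Hypothetical recovery unit: a named group of processes."""

    def __init__(self, name, processes):
        self.name = name
        # Per-process log of received messages stands in for process state.
        self.state = {p: [] for p in processes}


class System:
    """Toy model keeping one recovery line in main memory (assumption)."""

    def __init__(self, families):
        self.families = {f.name: f for f in families}
        # Map each process to the family that owns it (families are disjoint).
        self.owner = {p: f.name for f in families for p in f.state}
        self.recovery_line = self._snapshot()

    def _snapshot(self):
        # Copy every family's state; nothing is written to mass storage.
        return {n: copy.deepcopy(f.state) for n, f in self.families.items()}

    def send(self, src, dst, msg):
        self.families[self.owner[dst]].state[dst].append((src, msg))
        # Assumption: only inter-family communication advances the recovery line.
        if self.owner[src] != self.owner[dst]:
            self.recovery_line = self._snapshot()

    def recover(self):
        # Roll all families back together, so no family restarts from a
        # state that is inconsistent with the others.
        for name, family in self.families.items():
            family.state = copy.deepcopy(self.recovery_line[name])


system = System([Family("A", ["p1", "p2"]), Family("B", ["q1"])])
system.send("p1", "q1", "request")  # crosses families: recovery line advances
system.send("p1", "p2", "scratch")  # stays inside family A: no new recovery point
system.recover()                    # simulated fault: the intra-family work is undone
```

Because every family restarts from the same line, a rollback in one family never forces another to search for an earlier recovery point, which is the property that rules out the domino effect in this simplified setting.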
Istituto di Scienza e Tecnologie dell'Informazione "Alessandro Faedo" - ISTI
Multiprocessor system
Distributed strategy
Backward error recovery


Use this identifier to cite or link to this document: https://hdl.handle.net/20.500.14243/410345