Discriminating fault rate and persistency to improve fault treatment
Chiaradonna S; Di Giandomenico F; Grandoni F
1996
Abstract
This paper addresses the consolidated identification of faults, distinguished as transient or permanent/intermittent, by defining a fault identification mechanism called α-count. The goal is to allow continued use of parts hit by transient faults, which may lead to better overall system performance if proper handling is provided. Discriminating transient faults is especially important in dependability-critical applications where replacing or repairing failed components is costly, difficult, or outright impossible (as on computer-guided space probes). α-count tries to balance two conflicting requirements: keeping in the system those components that have experienced only transient faults, and quickly removing those affected by permanent or intermittent faults. The delay in spotting faulty components and the probability of improperly blaming correct ones are evaluated as α-count's figures of merit. The approach is compared with some heuristics developed to deal with the same problem.