ResiliNets Strategy

From ResiliNetsWiki

The ResiliNets Architecture is based on a six-step, two-phase strategy, D²R²+DR (defend, detect, remediate, recover, diagnose, refine), which supports the four ResiliNets Axioms, IUER: inevitable, understand, expect, and respond. The strategy is supported by the ResiliNets Principles, which are implemented by the ResiliNets Mechanisms.

Phase 1: Real-Time Control Loop – D²R²

The first phase consists of a cycle of four steps that are performed in real time and are directly involved in network operation and service provision. Many of these cycles operate simultaneously, triggered whenever an adverse event or condition is detected.

S1. Defend

The first step in the resilience strategy is to defend against challenges and threats to normal operation. The goals are to

reduce the probability of a fault leading to a failure
reduce the impact of an adverse event or condition

A threat analysis is necessary to mount a defence/defense.

Examples of defences:

erasure coding over spatially redundant diverse paths, which permits data transfer to continue even when one of the paths is disrupted
secure signalling protocols with necessary authentication and encryption to resist traffic analysis and prevent the injection of bogus signalling messages
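
The first defence above can be illustrated with the simplest possible erasure code. This is a minimal sketch, not the ResiliNets mechanism: it assumes three diverse paths, two carrying halves of the data and one carrying their XOR parity, so that any single disrupted path can be tolerated. The function names `encode` and `decode` are hypothetical.

```python
from typing import List, Optional


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def encode(data: bytes) -> List[bytes]:
    """Split data into two halves and add one XOR parity block (one block per path)."""
    if len(data) % 2:
        data += b"\x00"  # pad to even length; the receiver truncates using the known length
    half = len(data) // 2
    a, b = data[:half], data[half:]
    return [a, b, xor_bytes(a, b)]


def decode(blocks: List[Optional[bytes]], length: int) -> bytes:
    """Rebuild the data from any two of the three blocks; None marks a disrupted path."""
    if sum(block is None for block in blocks) > 1:
        raise ValueError("single-parity code tolerates only one lost path")
    a, b, parity = blocks
    if a is None:
        a = xor_bytes(b, parity)   # a = b XOR (a XOR b)
    elif b is None:
        b = xor_bytes(a, parity)   # b = a XOR (a XOR b)
    return (a + b)[:length]


message = b"resilient network"
blocks = encode(message)
blocks[1] = None  # assume the second path is disrupted
recovered = decode(blocks, len(message))
```

A single parity block only survives one lost path; tolerating more simultaneous disruptions would need a stronger code and more paths.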

S2. Detect

The second step is to detect when an adverse event or condition has occurred. Detection is used to determine when defences

need to be strengthened
have failed and remediation needs to occur
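
The two outcomes of detection listed above can be sketched as a two-threshold check. This is an illustrative assumption only: the monitored metric (a packet loss rate) and the threshold values `warn` and `fail` are hypothetical and are not taken from the ResiliNets architecture.

```python
from enum import Enum


class Action(Enum):
    NONE = "none"
    STRENGTHEN = "strengthen defences"
    REMEDIATE = "remediate"


def detect(loss_rate: float, warn: float = 0.01, fail: float = 0.10) -> Action:
    """Map an observed loss rate to the action that detection should trigger."""
    if loss_rate >= fail:
        return Action.REMEDIATE    # defences have failed
    if loss_rate >= warn:
        return Action.STRENGTHEN   # defences are under pressure
    return Action.NONE             # normal operation
```

A real detector would likely combine several metrics and smooth them over time rather than react to a single sample.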

S3. Remediate

The third step is to remediate the effects of the adverse event or condition to minimise the impact. The goal is to provide the best service possible at all levels after an adverse event and during an adverse condition. Corrective action must be taken at all levels to minimise the impact of service failure, including correct operation with graceful degradation of performance.
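
One way to picture graceful degradation is priority-based allocation of whatever capacity remains during an adverse condition. The sketch below is an assumption made for illustration: the service names, the priority numbers, and the `allocate` function are hypothetical, and a lower priority number is taken to mean a more important service.

```python
from typing import Dict, List, Tuple


def allocate(services: List[Tuple[str, int, float]], capacity: float) -> Dict[str, float]:
    """Grant capacity in priority order; each entry is (name, priority, demand)."""
    allocation: Dict[str, float] = {}
    remaining = capacity
    for name, _priority, demand in sorted(services, key=lambda s: s[1]):
        granted = min(demand, remaining)  # never grant more than is left
        allocation[name] = granted
        remaining -= granted
    return allocation


services = [("video", 3, 60.0), ("voice", 1, 20.0), ("web", 2, 40.0)]
degraded = allocate(services, capacity=70.0)
```

With the capacity reduced to 70 units, the most important services keep their full demand and only the least important one is throttled, so the network keeps operating correctly at reduced performance.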

S4. Recover

The fourth step is to recover to original and normal operations, including control and management of the network.

Once an adverse event has ended or an adverse condition is removed, the network should recover from its remediation state to allow any degraded services to return to normal performance and operation.

Examples:

deployment of replacement infrastructure after a natural disaster
restoration of normal routes after termination of a DDoS attack or the end of a flash crowd
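
The four steps of the D²R² cycle described above can be sketched as a small state machine. This is a simplified model under stated assumptions, not the architecture's specification: a single boolean `challenge_active` stands in for the presence of an adverse event or condition, and the names `Step`, `next_step`, and `run_cycle` are hypothetical. Since many cycles operate simultaneously, a real system would run one such instance per event.

```python
from enum import Enum, auto
from typing import Iterable, List


class Step(Enum):
    DEFEND = auto()
    DETECT = auto()
    REMEDIATE = auto()
    RECOVER = auto()


def next_step(step: Step, challenge_active: bool) -> Step:
    """Advance one step of the cycle given whether a challenge is present."""
    if step is Step.DEFEND:
        return Step.DETECT if challenge_active else Step.DEFEND
    if step is Step.DETECT:
        return Step.REMEDIATE if challenge_active else Step.DEFEND
    if step is Step.REMEDIATE:
        # keep remediating while the adverse condition persists
        return Step.REMEDIATE if challenge_active else Step.RECOVER
    return Step.DEFEND  # after recovery, return to normal defended operation


def run_cycle(observations: Iterable[bool]) -> List[Step]:
    """Trace the steps taken for a sequence of observations, starting from DEFEND."""
    step = Step.DEFEND
    trace = []
    for challenge_active in observations:
        step = next_step(step, challenge_active)
        trace.append(step)
    return trace


trace = run_cycle([False, True, True, True, False, False])
```

For the observation sequence shown, the trace moves from defending, through detection and two rounds of remediation, to recovery and back to defending.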

Phase 2: Background Diagnosis and Refinement – DR

The second phase consists of two background operations that observe and modify the behaviour of the D²R² cycle: diagnosis of faults and refinement of future behaviour.

S5. Diagnose

While it is not possible to directly detect faults, a system may be able to detect resultant errors within itself, or failures may be detected outside the system. It may be possible to diagnose the fault that was the root cause. This may result in an improved system design, and may affect recovery to a better state.

S6. Refine

The final aspect of the strategy is to refine behaviour for the future based on past D²R² cycles. The goal is to learn and reflect on how the system has defended, detected, remediated, and recovered so that all of these can be improved to continuously increase the resilience of the network.
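
As one narrow illustration of refinement, the sketch below tunes a detection threshold from the records of past cycles. Everything in it is an assumption made for the example: the `CycleRecord` fields, the `refine_threshold` function, and the 10% adjustment step are hypothetical, and refinement in the strategy covers defence, remediation, and recovery as well as detection.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class CycleRecord:
    detected: bool        # did the Detect step fire in this cycle?
    real_challenge: bool  # was there actually an adverse event or condition?


def refine_threshold(threshold: float, history: List[CycleRecord], step: float = 0.1) -> float:
    """Lower the threshold after missed detections, raise it after false alarms."""
    missed = sum(1 for r in history if r.real_challenge and not r.detected)
    false_alarms = sum(1 for r in history if r.detected and not r.real_challenge)
    if missed > false_alarms:
        return threshold * (1 - step)  # become more sensitive
    if false_alarms > missed:
        return threshold * (1 + step)  # become less sensitive
    return threshold


history = [
    CycleRecord(detected=False, real_challenge=True),
    CycleRecord(detected=False, real_challenge=True),
    CycleRecord(detected=True, real_challenge=True),
]
new_threshold = refine_threshold(0.10, history)
```

Because this runs in the background phase, it can work on accumulated history rather than in real time.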

Representation

Castle Analogy

Related Work

Several other research efforts have proposed strategies for various aspects of survivability, dependability, and fault tolerance.

ANSA

fault confinement (defense)
fault detection (error/failure detection)
fault diagnosis (diagnosis)
reconfiguration (remediation)
recovery (remediate)
restart (remediate)
repair (recovery)
reintegration (recovery)

DENISE

CMU SEI

resistance (defend)
recognition (detect)
recovery (remediate, recover)
adaptation and evolution (refine)

[Ellison-Fisher-Linger-Lipson-Longstaff-Mead-1999]
