High availability (HA) is a characteristic of a system that aims to ensure an agreed level of operational performance, usually uptime, for a higher-than-normal period.
Modernization has resulted in an increased reliance on these systems. For example, hospitals and data centers require high availability of their systems to perform routine daily activities. Availability refers to the ability of the user community to obtain a service or good, or to access the system, whether to submit new work, update or alter existing work, or collect the results of previous work. If a user cannot access the system, it is, from the user's point of view, unavailable.[1] Generally, the term downtime is used to refer to periods when a system is unavailable.
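As a rough illustration of how uptime and downtime relate, availability over a period is commonly expressed as the fraction of that period during which the system was usable. The sketch below assumes this simple ratio; the function name `availability` and the sample figures are hypothetical and are not taken from any cited source.

```python
def availability(uptime_hours: float, downtime_hours: float) -> float:
    """Return the fraction of the observed period in which the system was available."""
    total_hours = uptime_hours + downtime_hours
    if total_hours <= 0:
        raise ValueError("the observed period must be longer than zero hours")
    return uptime_hours / total_hours


# Hypothetical example: 8.76 hours of downtime in a 365-day year.
HOURS_PER_YEAR = 365 * 24
yearly_availability = availability(HOURS_PER_YEAR - 8.76, 8.76)
print(f"{yearly_availability:.3%}")
```

With these assumed figures the result is roughly 99.9 percent, which shows how a few hours of downtime per year already separate one availability level from the next.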
Principles
There are three principles of systems design in reliability engineering which can help achieve high availability.
- Elimination of single points of failure. This means adding redundancy to the system so that failure of a component does not mean failure of the entire system.
- Reliable crossover. In redundant systems, the crossover point itself tends to become a single point of failure. Reliable systems must provide for reliable crossover.
- Detection of failures as they occur. If the two principles above are observed, a user may never see a failure, but the maintenance activity must still detect it.
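The three principles can be sketched together in a few lines of code. The example below is a minimal illustration under stated assumptions, not a production design: each redundant component is modeled as a plain Python callable, the crossover logic is a simple loop, and detected failures are appended to a list that stands in for a monitoring system. All names (`call_with_failover`, `broken_primary`, `healthy_standby`, `AllReplicasFailed`) are hypothetical.

```python
from typing import Callable, List, Tuple


class AllReplicasFailed(Exception):
    """Raised when every redundant replica has failed."""


def call_with_failover(
    replicas: List[Callable[[str], str]],
    request: str,
    failure_log: List[Tuple[int, str]],
) -> str:
    """Try each replica in order and return the first successful response."""
    for index, replica in enumerate(replicas):
        try:
            return replica(request)
        except Exception as exc:
            # Detection: record the failure for maintenance even though
            # the caller may never notice it.
            failure_log.append((index, type(exc).__name__))
            # Crossover: fall through to the next redundant replica.
    raise AllReplicasFailed(f"all {len(replicas)} replicas failed")


def broken_primary(request: str) -> str:
    # Simulated failed component.
    raise ConnectionError("primary is down")


def healthy_standby(request: str) -> str:
    # Simulated redundant component that takes over.
    return f"handled {request}"


failures: List[Tuple[int, str]] = []
result = call_with_failover([broken_primary, healthy_standby], "job-1", failures)
print(result)    # the user still gets a response
print(failures)  # maintenance sees that replica 0 failed
```

Note that the crossover loop in this sketch is itself a single component; in line with the second principle, a real system would also need to make that crossover mechanism reliable.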