Data communications networks are the central nervous system of nearly every critical industrial system, from the transportation of goods and passengers by rail, road, air, and sea, to the supply of electricity, oil and gas, and fresh drinking water. As a result, industrial networks are relied on by every sector and form the backbone of our society's infrastructure. The reliability, resilience, and robustness of these critical networks are paramount, and network failure is not an option.
In this article, we describe one of the key foundations for designing and implementing a reliable and resilient data communications network that can, for the most part, be self-sufficient and ensure the continuous availability of data across any mission-critical system.
That key foundation is preventing single points of failure.
What Is a Single Point of Failure?
Almost any system can be affected by a single point of failure. In a data communications network, a single point of failure is any component that, if it fails, can stop the entire system from working and communicating. It could be hardware, such as a router, a switch, a power supply, or a cable. It can also be total reliance on one connection or service, such as your internet service provider, your broadband connection, or your cellular provider or 4G connection.
The single point of failure might even be the person who designs, implements, and maintains the network. Everyone is capable of making mistakes, so having a second pair of eyes to check things is always a good idea.
Network Design - Build Network Resilience from the Ground Up
Network resilience should be the foundation upon which you build your mission-critical network. When designing the network, make resilience and failover a top priority from the very beginning. A network that is built with resilience from the ground up is much easier to maintain than one that is designed without it, since adding resilience to an existing network can be time-consuming and costly. Changes made to an existing network in response to a failure tend to be rushed. Without proper testing, alterations can have a knock-on effect on other elements of the network, leaving you continuously applying patches to mend the parts that break.
Things can go wrong; that is just a fact of life. However, the best way to deal with problems is to prevent them from happening in the first place. Looking for single points of failure from the start of the design, and testing as you go, will lay strong foundations for everything else.
Hardware Single Point of Failure
What is a hardware single point of failure?
A hardware single point of failure exists when one physical piece of equipment (e.g., a router or switch) has no failover to another device, so that if it fails, the rest of the network or its communications will be disrupted.
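One way to look for hardware single points of failure at the design stage is to model the network as a graph and test what happens when each device is taken away in turn. The sketch below is only an illustration under stated assumptions: the device names and the topology are hypothetical, every device is listed as a key, links are listed in both directions, and a device counts as a single point of failure if removing it leaves the remaining devices unable to all reach one another. It does not model power supplies, service providers, or partial degradation.

```python
from collections import deque


def reachable(links, start, failed):
    """Return the devices reachable from `start` when `failed` is down."""
    seen = {start}
    queue = deque([start])
    while queue:
        device = queue.popleft()
        for neighbour in links.get(device, ()):
            if neighbour != failed and neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)
    return seen


def single_points_of_failure(links):
    """List devices whose failure splits the remaining devices apart."""
    devices = set(links)
    found = []
    for candidate in sorted(devices):
        survivors = devices - {candidate}
        if not survivors:
            continue
        # Start from any surviving device and see whether all others are reached.
        start = min(survivors)
        if reachable(links, start, candidate) != survivors:
            found.append(candidate)
    return found


# Hypothetical topology: two routers give the control room a redundant path,
# but both field devices hang off one switch.
topology = {
    "plc-1": ["switch-a"],
    "plc-2": ["switch-a"],
    "switch-a": ["plc-1", "plc-2", "router-1", "router-2"],
    "router-1": ["switch-a", "control-room"],
    "router-2": ["switch-a", "control-room"],
    "control-room": ["router-1", "router-2"],
}

print(single_points_of_failure(topology))  # ['switch-a']
```

In this made-up example either router can fail without isolating anything, while the loss of the single switch cuts the field devices off from the control room. That switch is the device that would need a failover partner.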
Carl de Bruin