The Blue Gene/L computer uses a 3D torus interconnect topology: a three-dimensional grid in which every node is connected to six neighbors, and each edge node wraps around to connect to the corresponding edge node on the opposite side of the network. This design has several advantages. Primarily, each node has high network bandwidth to its six neighbors. It also scales well: to add a new set of machines, the only existing cables that need to be moved are the edge cables. If a system used something like a fat tree instead, every cable between the added computers and the top of the tree would need to be replaced to support the higher bandwidth. The machine is built without long cables by cabling each rack to its next-to-nearest neighbor. For example, six racks in a single dimension would be cabled in the order 1, 3, 5, 6, 4, 2, 1. Like several other supercomputers, the Blue Gene/L has its network switches integrated into the nodes. Additional features include both dynamic and deterministic routing (packets can be made to follow a preset path or a path influenced by other traffic) and virtual buffering, a strategy for preventing credit loops.
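The wrap-around connectivity and the rack cabling order described above can be sketched in a few lines of Python. This is only an illustration: the function names `torus_neighbors` and `folded_cabling_order` are made up for this page, and the generalization of the six-rack example to other rack counts (odd racks going out, even racks coming back) is an assumption drawn from that single example.

```python
def torus_neighbors(node, dims):
    """Return the six neighbors of `node` in a 3D torus of size `dims`."""
    neighbors = []
    for axis in range(3):
        for step in (-1, 1):
            coord = list(node)
            # The modulo makes edge nodes wrap to the opposite side.
            coord[axis] = (coord[axis] + step) % dims[axis]
            neighbors.append(tuple(coord))
    return neighbors


def folded_cabling_order(n_racks):
    """Visit odd racks going out and even racks coming back, ending at rack 1.

    Assumed generalization of the six-rack example: no cable has to
    span more than one intermediate rack.
    """
    odds = list(range(1, n_racks + 1, 2))
    evens = list(range(2, n_racks + 1, 2))
    return odds + evens[::-1] + [1]


print(torus_neighbors((0, 0, 0), (4, 4, 4)))
print(folded_cabling_order(6))  # [1, 3, 5, 6, 4, 2, 1]
```

In a 4 x 4 x 4 torus the corner node (0, 0, 0) still has six neighbors, three of them reached through wrap-around links such as (3, 0, 0).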
Messages consist of n x 32 bytes, where n ranges from 1 to 8. All messages sent through this hardware conform to the MPI (Message Passing Interface) standard. Each message includes:
General structure of the torus router. There are six interconnected receiver/sender pairs (only one pair is shown here). The buffer is split into channels to avoid problems such as deadlock.
Although less common than lightweight torus networks such as the one in the Blue Gene, InfiniBand is a capable network solution. Instead of having the switches integrated into the processors, InfiniBand uses switches that are separate pieces of hardware. This makes them heavier, so to compensate there are more processors behind each switch.
Each node is made up of a 36-port switch. These switches have linked storage and also have access to larger-capacity servers. Unlike the Blue Gene's torus implementation, there are two connections between each pair of neighboring nodes, which keeps latencies down. The topology between nodes is otherwise the same as the Blue Gene's torus network; the difference is that it offers a much more robust system.
Fault prevention: credit loops. A credit loop occurs when two or more switches in a loop try to send data to each other. This fills each of their buffers, and since every switch in the loop has full buffers, the result is a deadlock. This is solved by the torus-2QoS algorithm. First, each time a packet is sent it is assigned a VL (virtual lane), chosen so that no other packets travelling in a loop can share the VL. A dateline is drawn through all x coordinates. Each time a packet is sent it is given a service level (SL), and this level differs if the packet crosses a dateline; this is what creates the virtual lanes. Each virtual lane has a separate set of virtual buffers, and the buffer is determined by looking at the packet's service level, which resolves loop deadlocks. The service level uses 1 bit per torus dimension (in this case 3 bits, for a total of 8 service levels), and 2 VLs are used, for a total of 16 SLs. In this interface these are split into two groups of 8 SLs for different QoS classes, so that general compute traffic and storage traffic are separated. Dimensional priorities are used such that x movement occurs before y, and y before z.
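The x-before-y-before-z priority and the dateline bits can be made concrete with a small sketch. It is not the torus-2QoS implementation: the function name `dor_route`, the choice of the wrap-around link (between the last coordinate and 0) as the dateline in every dimension, the shortest-way-around rule, and the placement of the QoS class in bit 3 of the SL are all assumptions made for illustration.

```python
def dor_route(src, dst, dims, qos_class=0):
    """Dimension-order route on a 3D torus: finish x, then y, then z.

    Returns (hops, sl). Bit `axis` of sl is set when the route crosses
    the assumed dateline in that dimension; bit 3 holds the QoS class
    (an assumed layout giving 2 groups of 8 service levels).
    """
    pos = list(src)
    hops = []
    dateline_bits = 0
    for axis in range(3):
        size = dims[axis]
        forward = (dst[axis] - pos[axis]) % size
        # Go whichever way around the ring is shorter.
        step = 1 if forward <= size - forward else -1
        while pos[axis] != dst[axis]:
            nxt = (pos[axis] + step) % size
            # Assumed dateline: the wrap-around link between size-1 and 0.
            if size > 2 and {pos[axis], nxt} == {0, size - 1}:
                dateline_bits |= 1 << axis
            pos[axis] = nxt
            hops.append(tuple(pos))
    return hops, dateline_bits | (qos_class << 3)


hops, sl = dor_route((0, 0, 0), (3, 1, 0), (4, 4, 4))
print(hops, sl)  # [(3, 0, 0), (3, 1, 0)] 1
```

Here the packet reaches x = 3 in one hop by wrapping around, so the x bit of the service level is set; the same route tagged as storage traffic (`qos_class=1`) would carry SL 9 under the assumed bit layout.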
Link failure: Handled by the subnet administrator, which determines which SL (service level) and other path information to use (MPI queries the subnet administrator for this information). Dateline crossing is ignored for rerouted packets, since with a link removed they cannot create a loop. The most common way a link failure is handled is to reroute the packet backwards so that it reaches the target switch from the other side. Node failures are much less common but are still handled: if a switch dies, packets are simply rerouted around it. Because all packets follow an x, y, z movement priority, rerouted packets can be identified by their failure to follow it. InfiniBand then maps SL to VL based on the input/output port combination: when a switch detects an illegal turn, meaning DOR (dimension-order routing) was not followed, a new VL is assigned.
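Spotting a rerouted packet comes down to checking whether its hops ever move in an earlier dimension after moving in a later one. The sketch below checks a whole path at once for clarity; a real switch only sees its own input and output port, so this is a simplified model, and the name `has_illegal_turn` is hypothetical.

```python
def has_illegal_turn(path):
    """Return True if the path ever turns from a later dimension to an earlier one.

    `path` is a list of (x, y, z) coordinates, starting with the source.
    Under dimension-order routing the axis used may only stay the same
    or increase (x, then y, then z).
    """
    last_axis = -1
    for a, b in zip(path, path[1:]):
        changed = [i for i in range(3) if a[i] != b[i]]
        if len(changed) != 1:
            raise ValueError("each hop must move along exactly one dimension")
        axis = changed[0]
        if axis < last_axis:
            return True  # e.g. a y hop followed by an x hop
        last_axis = axis
    return False


print(has_illegal_turn([(0, 0, 0), (1, 0, 0), (1, 1, 0)]))  # False
print(has_illegal_turn([(0, 0, 0), (0, 1, 0), (1, 1, 0)]))  # True
```

The second path moves in y and then in x, which is the kind of turn that would cause a new VL to be assigned.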