Question
Study the relevant material in Sections 2.2 and 2.5 and compare the system interconnects of the IBM Blue Gene/L, IBM Roadrunner, and Cray XT5 supercomputers released in the November 2009 Top 500 evaluation. Dig deeper to reveal the details of these systems. These systems may use custom-designed routers in their interconnects. Some also use commercial interconnects and components.
a. Compare the basic routers or switches used in the three system interconnects in terms of technology, chip design, routing scheme, and claimed message-passing performance.
b. Compare the topological properties, network latency, bisection bandwidth, and hardware packaging of the three system interconnects.
Explanation / Answer
IBM ROADRUNNER
The IBM Roadrunner is rated at 1.71 petaflops. It became the world's fastest computer in June 2008 and was the first computer able to sustain 1 petaflops. It has 12,960 IBM PowerXCell 8i processors running at 3.2 GHz and 6,480 dual-core AMD Opteron processors running at 1.8 GHz, for a total of 130,464 processor cores (a figure that also counts the 432 Opterons used for system operations). It also has more than 100 terabytes of RAM. The Roadrunner supercomputer is housed in 296 racks and occupies 6,000 square feet (560 square meters) at the Los Alamos National Laboratory in New Mexico.
A Connected Unit (CU) consists of 60 BladeCenter H chassis filled with TriBlades, 180 TriBlades in all. All TriBlades are connected to a 288-port Voltaire ISR2012 InfiniBand switch. Each CU also has access to the Panasas file system through twelve System x3755 servers.
Connected Unit system information:
360 dual-core Opterons with 2.88 TiB RAM.
720 PowerXCell 8i processors with 2.88 TiB RAM.
12 System x3755 servers with dual 10-Gbit Ethernet each.
288-port Voltaire ISR2012 switch with 192 InfiniBand 4x DDR links (180 TriBlades and twelve I/O nodes).
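The Connected Unit figures can be cross-checked with a few lines of arithmetic. This is only a sketch: the per-chassis and per-TriBlade ratios below are not stated in the answer but inferred from its totals (180 / 60, 360 / 180, 720 / 180), so treat them as assumptions.

```python
# Sanity check of the Connected Unit (CU) figures.
# Assumption: the three ratios below are inferred from the listed totals.
BLADECENTERS_PER_CU = 60
TRIBLADES_PER_BLADECENTER = 3   # inferred: 180 TriBlades / 60 chassis
OPTERONS_PER_TRIBLADE = 2       # inferred: 360 Opterons / 180 TriBlades
CELLS_PER_TRIBLADE = 4          # inferred: 720 Cells / 180 TriBlades
IO_NODES_PER_CU = 12
SWITCH_PORTS = 288


def connected_unit():
    """Return the component counts of one Connected Unit."""
    triblades = BLADECENTERS_PER_CU * TRIBLADES_PER_BLADECENTER
    return {
        "triblades": triblades,
        "opterons": triblades * OPTERONS_PER_TRIBLADE,
        "cells": triblades * CELLS_PER_TRIBLADE,
        "node_links": triblades + IO_NODES_PER_CU,
    }


cu = connected_unit()
free_ports = SWITCH_PORTS - cu["node_links"]
print(cu)
print("switch ports not used by nodes:", free_ports)
```

The ports left over on the 288-port switch are the ones available for links to other switches.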
Roadrunner cluster
[Figure: a schematic overview of the tiered composition of the Roadrunner supercomputer cluster.]
The final cluster is made up of 18 Connected Units, which are joined through eight additional (second-stage) InfiniBand ISR2012 switches. Each CU is connected to every second-stage switch through twelve uplinks, which makes a total of 96 uplink connections per CU.
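The uplink count follows directly from the numbers in this paragraph. The sketch below also derives two figures the answer does not state, the number of ports each second-stage switch needs and the ratio of node links to uplinks, so those two results are inferences rather than quoted specifications.

```python
# Uplink arithmetic for the two-stage InfiniBand fabric.
CONNECTED_UNITS = 18
SECOND_STAGE_SWITCHES = 8
UPLINKS_PER_CU_PER_SWITCH = 12
PORTS_PER_SWITCH = 288
NODE_LINKS_PER_CU = 192  # 180 TriBlades + 12 I/O nodes

# Uplinks leaving one CU: twelve to each of the eight second-stage switches.
uplinks_per_cu = SECOND_STAGE_SWITCHES * UPLINKS_PER_CU_PER_SWITCH

# Uplinks in the whole machine.
total_uplinks = CONNECTED_UNITS * uplinks_per_cu

# Ports occupied on each second-stage switch (derived, not from the source).
ports_used_per_second_stage = CONNECTED_UNITS * UPLINKS_PER_CU_PER_SWITCH

# Node links per uplink inside a CU (derived, not from the source).
node_to_uplink_ratio = NODE_LINKS_PER_CU / uplinks_per_cu

print(uplinks_per_cu, total_uplinks, ports_used_per_second_stage, node_to_uplink_ratio)
```

A ratio above 1 means the fabric carries less bandwidth between Connected Units than within one.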
Overall system information:
6,480 Opteron processors with 51.8 TiB RAM (in 3,240 LS21 blades)
12,960 Cell processors with 51.8 TiB RAM (in 6,480 QS22 blades)
216 System x3755 I/O nodes
26 288-port ISR2012 InfiniBand 4x DDR switches
296 racks
2.345 MW power
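The system-wide totals are the Connected Unit figures multiplied by 18, plus the eight second-stage switches. A short check, assuming one LS21 and two QS22 blades per TriBlade (a ratio inferred from the totals, not stated in the answer):

```python
# Scale one Connected Unit (CU) up to the full 18-CU system.
CONNECTED_UNITS = 18
SECOND_STAGE_SWITCHES = 8

# Per-CU figures from the Connected Unit list.
PER_CU = {"triblades": 180, "opterons": 360, "cells": 720, "io_nodes": 12}

# Assumption: blade counts per TriBlade inferred from 3,240 / 6,480 totals.
LS21_PER_TRIBLADE = 1
QS22_PER_TRIBLADE = 2

system = {name: count * CONNECTED_UNITS for name, count in PER_CU.items()}
system["ls21_blades"] = system["triblades"] * LS21_PER_TRIBLADE
system["qs22_blades"] = system["triblades"] * QS22_PER_TRIBLADE
# One first-stage switch per CU plus the second-stage switches.
system["switches"] = CONNECTED_UNITS + SECOND_STAGE_SWITCHES

for name, count in system.items():
    print(f"{name}: {count:,}")
```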
Processors:
The hybrid design consists of dual-core Opteron server processors manufactured by AMD using the standard AMD64 architecture. Attached to each Opteron core is a PowerXCell 8i processor manufactured by IBM using Power Architecture and Cell technology. As a supercomputer, the Roadrunner is considered an Opteron cluster with Cell accelerators, as each node consists of a Cell attached to an Opteron core, with the Opterons connected to each other.
Roadrunner uses two different models of processors. The first is the AMD Opteron 2210, running at 1.8 GHz. Opterons are used both in the compute nodes, where they feed the Cells with data, and in the system operations and communication nodes, where they pass data between compute nodes and help the operators run the system. Roadrunner has a total of 6,912 Opteron processors, with 6,480 used for computation and 432 for operations. The Opterons are connected together by HyperTransport links. Each Opteron has two cores, for a total of 13,824 cores.
The second processor is the IBM PowerXCell 8i, running at 3.2 GHz. These processors have one general-purpose core (PPE) and eight special-purpose cores (SPE) for floating-point operations. Roadrunner has a total of 12,960 PowerXCell processors, with 12,960 PPE cores and 103,680 SPE cores, for a total of 116,640 cores.
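The core counts quoted in these two paragraphs add up to the 130,464 cores given for the whole machine, as this short check shows. All inputs are taken from the text.

```python
# Core-count arithmetic for Roadrunner, using the figures quoted in the text.
OPTERON_PROCESSORS = 6_480 + 432   # compute + operations
CORES_PER_OPTERON = 2
CELL_PROCESSORS = 12_960
PPE_PER_CELL = 1
SPE_PER_CELL = 8

opteron_cores = OPTERON_PROCESSORS * CORES_PER_OPTERON
ppe_cores = CELL_PROCESSORS * PPE_PER_CELL
spe_cores = CELL_PROCESSORS * SPE_PER_CELL
cell_cores = ppe_cores + spe_cores
total_cores = opteron_cores + cell_cores

print(f"Opteron cores: {opteron_cores:,}")
print(f"Cell cores:    {cell_cores:,} ({ppe_cores:,} PPE + {spe_cores:,} SPE)")
print(f"Total cores:   {total_cores:,}")
```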
BLUE GENE/L
IBM initially developed the Blue Gene family of supercomputers to simulate biochemical processes involving proteins. The Blue Gene/L at the Lawrence Livermore National Laboratory (LLNL) was the world's fastest computer from November 2004 until 2008, when it lost its crown to another IBM project, the Roadrunner. In its current configuration, the Blue Gene/L at LLNL has 131,072 IBM PowerPC processors running at 700 MHz, a total of 49.1 terabytes of RAM, 1.89 petabytes of disk space, and a theoretical peak performance of 367 teraflops. A more expanded version of the system previously achieved a peak performance of 596 teraflops. For another powerful Blue Gene computer, this time based on Blue Gene/P, see the recently upgraded Jugene, which is currently the fastest supercomputer in Europe.
Initializing the InfiniBand network
Discover the subnet topology and topology changes, compute the paths, assign LIDs, distribute the routes, and configure the devices.
Related devices and entities
Devices: Channel Adapters (CA), Host Channel Adapters, switches, routers
Subnet manager (SM): discovers, configures, activates, and manages the subnet
A subnet management agent (SMA) in every device generates and responds to control packets (subnet management packets, SMPs) and configures local components for subnet management
The SM exchanges control packets with the SMAs through the subnet management interface (SMI).
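The initialization steps above (discover the topology, assign LIDs, compute paths, distribute routes) can be illustrated on a toy subnet. This is a minimal sketch under loud assumptions: the topology, the node names, and every function here are hypothetical, LIDs are simply handed out in discovery order starting at 1, and the tables map a destination LID to a neighbor name rather than to a switch port as real hardware would. It is not how any particular subnet manager is implemented.

```python
from collections import deque

# Hypothetical toy subnet: two switches ("sw*") and three channel adapters ("ca*").
LINKS = {
    "sw1": ["sw2", "ca1", "ca2"],
    "sw2": ["sw1", "ca3"],
    "ca1": ["sw1"],
    "ca2": ["sw1"],
    "ca3": ["sw2"],
}


def discover(start):
    """Breadth-first sweep from the subnet manager's node; returns nodes in discovery order."""
    seen, queue = [start], deque([start])
    while queue:
        node = queue.popleft()
        for peer in LINKS[node]:
            if peer not in seen:
                seen.append(peer)
                queue.append(peer)
    return seen


def assign_lids(nodes):
    """Give every discovered node a unique LID (assumption: sequential, starting at 1)."""
    return {node: lid for lid, node in enumerate(nodes, start=1)}


def next_hops(source):
    """First hop from `source` on a shortest path to every other node."""
    first = {}
    queue = deque()
    for peer in LINKS[source]:
        first[peer] = peer
        queue.append(peer)
    while queue:
        node = queue.popleft()
        for peer in LINKS[node]:
            if peer != source and peer not in first:
                first[peer] = first[node]  # inherit the first hop used to reach `node`
                queue.append(peer)
    return first


def forwarding_tables(lids):
    """Per-switch table: destination LID -> neighbor to forward to."""
    return {
        sw: {lids[dst]: hop for dst, hop in next_hops(sw).items()}
        for sw in LINKS
        if sw.startswith("sw")
    }


nodes = discover("ca1")           # the subnet manager is assumed to run on ca1
lids = assign_lids(nodes)
tables = forwarding_tables(lids)
for switch, table in tables.items():
    print(switch, dict(sorted(table.items())))
```

In this sketch a packet addressed to the LID of ca3 is forwarded by sw1 toward sw2, which delivers it on its direct link.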
Local Route Header (LRH): 8 bytes. Used for local routing by switches within an IBA subnet
Global Route Header (GRH): 40 bytes. Used for routing between subnets
Base Transport Header (BTH): 12 bytes, for IBA transport
Extended transport headers:
Reliable datagram extended transport header (RDETH): 4 bytes, only for reliable datagram service
Datagram extended transport header (DETH): 8 bytes
RDMA extended transport header (RETH): 16 bytes
Atomic, ACK, and Atomic ACK extended transport headers
Immediate data extended transport header: 4 bytes, optimized for small packets
Invariant CRC and variant CRC:
The invariant CRC covers fields that do not change in transit; the variant CRC covers fields that may change.
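The header sizes above determine how much of each packet is overhead. The sketch below adds them up for two common cases. The header sizes come from the list; the CRC sizes (4 bytes invariant, 2 bytes variant) and the 2,048-byte payload are not given in the answer and are assumptions on my part, and the function names are illustrative only.

```python
# Per-packet overhead from the InfiniBand header sizes listed above.
HEADER_BYTES = {
    "LRH": 8,     # Local Route Header
    "GRH": 40,    # Global Route Header
    "BTH": 12,    # Base Transport Header
    "RDETH": 4,   # Reliable datagram extended transport header
    "DETH": 8,    # Datagram extended transport header
    "RETH": 16,   # RDMA extended transport header
    "IMM": 4,     # Immediate data extended transport header
}
# Assumption: CRC field sizes are not stated in the text above.
CRC_BYTES = {"ICRC": 4, "VCRC": 2}


def packet_overhead(headers):
    """Total non-payload bytes for a packet carrying the given headers plus both CRCs."""
    return sum(HEADER_BYTES[name] for name in headers) + sum(CRC_BYTES.values())


def efficiency(payload_bytes, headers):
    """Fraction of the bytes on the wire that are payload."""
    return payload_bytes / (payload_bytes + packet_overhead(headers))


local_rdma = ["LRH", "BTH", "RETH"]            # RDMA request inside one subnet
routed_rdma = ["LRH", "GRH", "BTH", "RETH"]    # same request crossing subnets

print("local overhead: ", packet_overhead(local_rdma), "bytes")
print("routed overhead:", packet_overhead(routed_rdma), "bytes")
print("local efficiency at 2048-byte payload:", round(efficiency(2048, local_rdma), 4))
```

The comparison shows why the GRH is only carried when a packet has to leave its subnet: it nearly doubles the fixed overhead.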
Topology:
Irregular
Regular: fat tree
Link speed:
Single data rate (SDR): 2.5 Gbps (1X), 10 Gbps (4X), and 30 Gbps (12X).
Double data rate (DDR): 5 Gbps (1X), 20 Gbps (4X)
Quad data rate (QDR): 40 Gbps (4X).
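The link speeds listed are lane rate multiplied by link width. The sketch below reproduces them and also estimates the usable data rate. The 8b/10b line encoding used for that estimate is not mentioned in the answer; it is an assumption I am adding, so the data-rate figures should be read as illustrative.

```python
# InfiniBand link speeds: per-lane signaling rate times link width.
LANE_SIGNALING_GBPS = {"SDR": 2.5, "DDR": 5.0, "QDR": 10.0}
WIDTHS = (1, 4, 12)

# Assumption (not in the text above): 8b/10b encoding, 8 data bits per 10 line bits.
ENCODING_EFFICIENCY = 8 / 10


def signaling_rate(generation, width):
    """Raw signaling rate in Gbps for a link of the given generation and width."""
    return LANE_SIGNALING_GBPS[generation] * width


def data_rate(generation, width):
    """Estimated usable data rate in Gbps after line encoding."""
    return signaling_rate(generation, width) * ENCODING_EFFICIENCY


for generation in LANE_SIGNALING_GBPS:
    for width in WIDTHS:
        raw = signaling_rate(generation, width)
        usable = data_rate(generation, width)
        print(f"{generation} {width:>2}X: {raw:6.1f} Gbps signaling, {usable:6.1f} Gbps data")
```

Under that assumption, the 4X DDR links used in Roadrunner signal at 20 Gbps and carry roughly 16 Gbps of data.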
CRAY XT5
The Cray XT5 supercomputer is presented as the next revolution in scalable computing. The Cray XT5 system combines scalability with manageability, lower cost of ownership, and broader application support. As the foundation for the Cray XT5h, which Cray describes as the industry's most integrated hybrid supercomputer, it combines scalar processing with high-bandwidth vector processing, reconfigurable FPGA hardware acceleration, and alternative parallel programming languages in a single system.
The Cray XT5 cabinet can accommodate both Cray XT4 and Cray XT5 compute blades, so system configurations can be matched to application requirements.
The Cray XT5 system is designed from the ground up for extreme scalability, including highly scalable global I/O performance, ensuring high efficiency for applications that require rapid I/O access to large datasets.
A scalable Linux-based operating system makes it easier for a wide variety of applications to benefit from this scalability. The Linux environment enables streamlined porting of a wide range of ISV codes.
Innovative packaging and technologies reduce power and cooling requirements, which lowers energy consumption and operating costs.
The Cray XT5 I/O subsystem scales to meet the bandwidth needs of even the most data-intensive applications. The I/O architecture consists of storage arrays connected directly to I/O nodes that reside on the high-speed interconnect. The Lustre file system manages the striping of file operations across these arrays. This highly scalable I/O architecture allows customers to configure the Cray XT5 with the desired bandwidth by selecting the appropriate number of arrays and service nodes.
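To make the idea of striping file operations across arrays concrete, here is a minimal sketch of a generic round-robin (RAID-0 style) layout. It is an illustration only: the function name, the 1 MiB stripe size, and the stripe count of 4 are hypothetical choices, and nothing here is taken from the Cray XT5 or Lustre configuration described above.

```python
# Generic round-robin striping: map a file offset to (storage target, offset on that target).
MIB = 1 << 20


def locate(offset, stripe_size, stripe_count):
    """Return (target index, byte offset within that target) for a file byte offset."""
    stripe_index = offset // stripe_size          # which stripe of the file
    target = stripe_index % stripe_count          # stripes rotate across targets
    full_rounds = stripe_index // stripe_count    # stripes already stored on this target
    target_offset = full_rounds * stripe_size + offset % stripe_size
    return target, target_offset


# Hypothetical layout: 1 MiB stripes over 4 storage targets.
for file_offset in (0, MIB, 4 * MIB, 5 * MIB + 10):
    print(file_offset, "->", locate(file_offset, MIB, 4))
```

Because consecutive stripes land on different targets, a large sequential read or write is spread over all of them, which is why adding arrays raises the available bandwidth.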