CS 641 Lecture, Dr. Lawlor
While I was a graduate student, I did a lot of work on
supercomputers. Back in the early 1990's, "supercomputer" meant
perhaps 100 cores, typically shared memory on a fast memory bus.
In the early 2000's, "supercomputer" meant around 1000 cores, usually
distributed memory machines on a network. Today, world-class
supercomputers have around 100,000 cores.
GPUs have just started to penetrate the high-performance computing
world: of the TOP500 list of the 500 fastest supercomputers (at least
on double-precision Linpack), only 17 used either GPU or Cell
accelerators, although these include three of the top four systems in
the list.
At the Charm++ Workshop earlier this week, Satoshi Matsuoka gave an excellent keynote on the #4 machine in the world, TSUBAME 2.0, which he helped build. The hardware specs include:
Power is a serious limiting design factor in machines of this type.
- 1,408 "shoebox" HP SL390G7 nodes.
- 32 of these fit per rack, so this is only 42 racks. The entire
machine fits into a space about 20 feet wide by 90 feet long, about
2,000 square feet.
- 2 CPU sockets per shoebox = 2,816 sockets with six-core Intel Xeon 5600s = 16,896 cores of CPU.
- 3 GPU cards per shoebox = 4,224 GPU cards with Tesla M2050s = 1.89
million threads! The same cards are used in Tianhe, the #1
machine in the world.
- Aggregate delivered performance is about a petaflop: 10^15 double precision floating point operations per second.
- The network is QDR InfiniBand, over about 100 kilometers of fiber optic cable.
- Nodes are multiboot, and can run Linux or Windows HPC.
- Total power consumption when active is 1.4 megawatts; about
$140/hour of electricity. The logistics of getting this much
power into a physically small space require a high voltage
substation. Each rack has its own dedicated cold water intake and
hot water exit.
- Idle power consumption is still 0.48 megawatts. You don't want to leave a machine like that running!
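The aggregate figures above follow from simple per-node arithmetic. Here's a quick sanity check (a sketch: the 448 CUDA cores per Tesla M2050 and the roughly $0.10/kWh electricity rate are my assumptions, not from the specs above):

```python
# Sanity-check the TSUBAME 2.0 figures quoted above.
nodes = 1408                      # HP SL390G7 "shoebox" nodes

cpu_cores = nodes * 2 * 6         # 2 Xeon 5600 sockets x 6 cores each
gpu_cards = nodes * 3             # 3 Tesla M2050 cards per node
gpu_threads = gpu_cards * 448     # assuming 448 CUDA cores per M2050

print(cpu_cores)                  # 16896 CPU cores
print(gpu_cards)                  # 4224 GPU cards
print(gpu_threads)                # 1892352, about 1.89 million

active_kw = 1400                  # 1.4 megawatts when active
cost_per_hour = active_kw * 0.10  # assuming ~$0.10 per kWh
print(cost_per_hour)              # about $140/hour
```

Note that the GPU threads outnumber the CPU cores by a factor of over 100, which is why power-constrained machines of this class lean so heavily on accelerators.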