JLAB HPC Clusters
The Scientific Computing group is working with JLab's Theory group to deploy a sequence of high performance clusters in support of Lattice Quantum Chromodynamics (LQCD). LQCD is the numerical approach to solving QCD, the fundamental theory of quarks and gluons and their interactions. This computationally demanding suite of applications is key to understanding the experimental program of the laboratory.
Three clusters are currently deployed:
The 2009 cluster consists of 320 nodes of dual quad-core Nehalem CPUs connected via a QDR InfiniBand switched network. Each node has two processors running at 2.4 GHz, 24 GB DDR3-1333 memory and a 500 GB SATA disk. In addition, each node has an InfiniBand HCA adapter that provides 40 Gb/s bandwidth.
The 2007 cluster consists of 396 nodes of quad-core AMD Opteron CPUs connected via DDR InfiniBand switched networks. Each node has two processors running at 1.9 GHz, 8 GB DDR2 memory and an 80 GB SATA disk. In addition, each node has a PCI Express (16x) slot for an InfiniBand HCA adapter that provides 20 Gb/s bandwidth.
The 2006 cluster consists of 280 nodes of dual-core Intel Pentium D CPUs connected via InfiniBand switched networks. Each node has a single processor running at 3.0 GHz with an 800 MHz front side bus, 1 GB memory and an 80 GB SATA disk. In addition, each node has a PCI Express (4x) slot for an InfiniBand HCA adapter that provides 10 Gb/s bandwidth.
Decommissioned Intel Clusters
The 2004 cluster consisted of 384 nodes arranged as a 6x8x2^3 mesh (torus). The nodes were interconnected using three dual GigE cards plus one half of the dual GigE NIC on the motherboard (the other half is used for file services). This 5D wiring could be configured into various 3D tori (4D and 5D operation was possible in principle, but was less efficient and so not used). Nodes were single-processor 2.8 GHz Xeons with an 800 MHz front side bus, 512 MB memory, and a 36 GB disk. This cluster achieved approximately 0.7 teraflops sustained.
Message passing on this novel architecture used QMP, an application-optimized library whose implementations also target other custom LQCD machines. The lowest levels of the communications stack were implemented using a VIA driver, and for multiple-link transfers VIA data rates approaching 500 MB/s per node were achieved.
The cluster deployed in 2003 comprised 256 nodes configured as a 2x2x4x8 mesh. Nodes were interconnected using three dual GigE cards (one per dimension) and one on-board link, an approach which delivered high aggregate bandwidth while avoiding the expense of a high performance switch. Nodes were single-processor 2.67 GHz Pentium 4 Xeons with 256 MB of memory. The cluster achieved 0.4 teraflops on LQCD applications. Additional information can be found in an Intel case study.
The oldest cluster, installed in 2002 and now decommissioned, contained 128 nodes of single-processor 2.0 GHz Xeons. It achieved 1/8 teraflops of performance (Linpack), contained 65 GB of total physical memory, and delivered 270 MB/s simultaneous node-to-node aggregate network bandwidth. All nodes ran Red Hat Linux 7.3.