General Atomics

Apex "Sub-Clusters" Save Time and Money


General Atomics was seeking an economical high performance computing solution to run its small and intermediate simulations of toroidally confined plasmas using GYRO, NIMROD and ORBIT-RF. Simulations of toroidally confined plasmas (such as those in tokamaks) are massively parallel and require strong floating point and network performance. Using external accounts at large supercomputing centers slowed productivity because of the time spent waiting for compute cycles.


Given budget constraints, the large high performance computing cluster needed to run large jobs was out of reach. The small and intermediate simulations, however, only required the system to scale to 48 processors. Working with the customer on various system topologies, Advanced Clustering Technologies devised a plan for a computing solution flexible enough to handle the small and intermediate-sized jobs efficiently. By splitting the Apex cluster into three sub-clusters (two 24-node clusters and one 8-node cluster), three jobs could be run independently and at the same time.

By combining the AMD Opteron platform with an InfiniBand interconnect, Advanced Clustering Technologies found that the Apex cluster performed better than many traditional high performance computing mainframes when benchmarking GYRO. With the exceptional floating point performance of the AMD Opteron™ and the network performance provided by the three independent InfiniBand networks, immediate turnaround of small and medium-sized tokamak simulations was possible.

When asked how much the high performance computing cluster had increased their productivity, General Atomics simply replied "Immensely."

Key benefits of Apex clusters using AMD Opteron processors with an InfiniBand interconnect:

  • Immediate turnaround of small and intermediate-sized jobs
  • Ability to run multiple jobs at the same time
  • Better performance and substantial cost savings over traditional high performance supercomputers
  • Large decrease in time and money spent farming out small and intermediate-sized jobs to external clusters


