Major supercomputers
BlueGene/Q (BGQ)
Installed August 2012
Decommissioned June 30, 2019
The BGQ was a Southern Ontario Smart Computing Innovation Platform (SOSCIP) BlueGene/Q supercomputer located at the University of Toronto’s SciNet HPC facility. The SOSCIP multi-university/industry consortium was funded by the Ontario Government and the Federal Economic Development Agency for Southern Ontario. An additional half-rack of BlueGene/Q (8,192 cores) was purchased by the Li Ka Shing Institute of Virology at the University of Alberta in late fall 2014 and integrated into the existing BGQ system.
The BGQ was an extremely dense and energy-efficient third-generation IBM Blue Gene supercomputer built around a system-on-a-chip compute node with a 16-core 1.6GHz PowerPC-based CPU (PowerPC A2) and 16GB of RAM. It consisted of 4,096 nodes with a total of 65,536 cores and up to 262,144 hardware threads. The nodes ran a lightweight Linux-like OS and were interconnected in a 5D torus. The compute nodes were accessed through a queuing system.
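Work on the compute nodes was typically distributed with MPI across many nodes and hardware threads. Below is a minimal MPI example in C of the kind of program submitted through the queuing system; it is generic MPI, with nothing specific to the Blue Gene toolchain assumed.

 /* Illustrative only: a minimal MPI "hello world" of the kind typically
  * run on the BGQ compute nodes; rank counts and thread placement are
  * assumptions, not a record of how the machine was operated. */
 #include <mpi.h>
 #include <stdio.h>

 int main(int argc, char **argv)
 {
     int rank, size, len;
     char name[MPI_MAX_PROCESSOR_NAME];

     MPI_Init(&argc, &argv);
     MPI_Comm_rank(MPI_COMM_WORLD, &rank);   /* this process's rank    */
     MPI_Comm_size(MPI_COMM_WORLD, &size);   /* total ranks in the job */
     MPI_Get_processor_name(name, &len);     /* compute-node name      */

     printf("rank %d of %d on %s\n", rank, size, name);

     MPI_Finalize();
     return 0;
 }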
General Purpose Cluster (GPC)
Installed June 2009
Decommissioned April 22, 2018
The General Purpose Cluster (GPC) was our extremely large “workhorse” cluster (ranked 16th in the world at its inception, and the fastest in Canada) and was where most computations were done at SciNet. It was an IBM iDataPlex cluster based on Intel’s Nehalem architecture (one of the first in the world to make use of these chips).
The GPC consisted of 3,780 nodes (IBM iDataPlex DX360M2) with a total of 30,240 cores (Intel Xeon E5540) at 2.53GHz and 16GB of RAM per node (2GB per core); some larger-memory nodes had up to 32GB. The nodes ran Linux. Approximately one quarter of the cluster was connected with non-blocking DDR InfiniBand, while the rest of the nodes were connected with 5:1 blocked QDR InfiniBand. The compute nodes were accessed through a queuing system that allowed jobs with a maximum wall time of 48 hours and a minimum time, in the batch queue, of 15 minutes.
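Because batch jobs were capped at 48 hours, long-running computations had to save their state and be resubmitted. The following is a minimal, hypothetical sketch in C of a walltime-aware main loop; the 47-hour budget, the step interval, and the checkpoint routine are illustrative assumptions, not part of the GPC documentation.

 /* Hypothetical sketch: stop cleanly before a 48-hour walltime limit.
  * The time budget, step interval, and checkpoint routine are illustrative. */
 #include <stdio.h>
 #include <time.h>

 #define WALL_BUDGET (47.0 * 3600.0)    /* leave margin under the 48 h cap */

 static void checkpoint(long step)      /* placeholder restart writer */
 {
     printf("checkpoint at step %ld\n", step);
 }

 int main(void)
 {
     time_t start = time(NULL);

     for (long step = 0; ; ++step) {
         /* ... one unit of real work would go here ... */

         if (step % 1000 == 0)
             checkpoint(step);

         if (difftime(time(NULL), start) > WALL_BUDGET) {
             checkpoint(step);          /* final checkpoint, then exit */
             break;                     /* job is resubmitted to continue */
         }
     }
     return 0;
 }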
Tightly Coupled System (TCS)
Installed January 2009
Decommissioned September 29, 2017
The Tightly Coupled System (TCS) was a specialized cluster of “fat” (high-memory, many-core) IBM Power 575 nodes with 4.7GHz POWER6 processors on a very fast InfiniBand interconnect. The nodes had 32 cores (with hardware support for running 64 threads using simultaneous multithreading, SMT) and 128GB of RAM, and ran AIX; two nodes had 256GB of RAM each. The system had a relatively small number of cores (~3,000), so it was dedicated to jobs that required such a large-memory, low-latency configuration. Jobs needed to use multiples of 32 cores (i.e., whole nodes) and were submitted to a queuing system that allowed a maximum wall time of 48 hours per job.
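As a sketch of how a job could exploit SMT, the following minimal OpenMP program in C reports one line per thread; running it with OMP_NUM_THREADS=64 on a 32-core node would use both hardware threads of every core. The thread count is an illustrative choice, not a system default.

 /* Illustrative OpenMP sketch: with OMP_NUM_THREADS=64 on a 32-core
  * SMT-enabled node, both hardware threads of every core are used. */
 #include <omp.h>
 #include <stdio.h>

 int main(void)
 {
     #pragma omp parallel
     {
         /* each of the (up to) 64 threads reports its ID */
         printf("thread %d of %d\n",
                omp_get_thread_num(), omp_get_num_threads());
     }
     return 0;
 }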
Smaller clusters and experimental systems
SOSCIP GPU Cluster (SGC)
Installed September 2017
Decommissioned April 30, 2020
The SOSCIP GPU Cluster (SGC) was a Southern Ontario Smart Computing Innovation Platform (SOSCIP) resource located at the University of Toronto’s SciNet HPC facility. The SOSCIP multi-university/industry consortium was funded by the Ontario Government and the Federal Economic Development Agency for Southern Ontario.
The SOSCIP GPU Cluster consisted of 14 IBM Power 822LC “Minsky” servers, each with 2 × 10-core 3.25GHz POWER8 CPUs and 512GB of RAM. Similar to the POWER7, the POWER8 utilizes simultaneous multithreading (SMT), but extends the design to 8 threads per core, allowing the 20 physical cores to support up to 160 threads. Each node had four Nvidia Tesla P100 GPUs, each with 16GB of RAM and Compute Capability 6.0 (Pascal), connected using NVLink.
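Below is a small illustrative sketch in C, using only standard CUDA runtime calls, of how a job could check which GPU pairs on such a node can address each other's memory directly (the kind of peer access that NVLink enables); it assumes nothing about the node beyond there being more than one GPU.

 /* Illustrative check for GPU peer access on a multi-GPU node. */
 #include <cuda_runtime.h>
 #include <stdio.h>

 int main(void)
 {
     int n = 0;
     cudaGetDeviceCount(&n);            /* four P100s expected on an SGC node */

     for (int i = 0; i < n; ++i) {
         for (int j = 0; j < n; ++j) {
             if (i == j) continue;
             int ok = 0;
             cudaDeviceCanAccessPeer(&ok, i, j);
             printf("GPU %d -> GPU %d peer access: %s\n",
                    i, j, ok ? "yes" : "no");
         }
     }
     return 0;
 }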
P8
Installed September 2017
Decommissioned February 13, 2020
The P8 Test System consisted of 4 IBM Power 822LC servers, each with 2 × 8-core 3.25GHz POWER8 CPUs and 512GB of RAM. Similar to the POWER7, the POWER8 utilizes simultaneous multithreading (SMT), but extends the design to 8 threads per core, allowing the 16 physical cores to support up to 128 threads. Two nodes had two Nvidia Tesla K80 GPUs with Compute Capability 3.7 (Kepler), each consisting of 2 × GK210 GPUs with 12GB of RAM each, connected using PCI-E. The two other nodes had four Nvidia Tesla P100 GPUs, each with 16GB of RAM and Compute Capability 6.0 (Pascal), connected using NVLink.
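Since the two node types carried GPUs of different compute capabilities, test codes often had to branch at run time. A brief illustrative device query in C, using standard CUDA runtime calls, is shown below.

 /* Illustrative device query: report compute capability and memory,
  * so code can distinguish the K80 (3.7) and P100 (6.0) nodes. */
 #include <cuda_runtime.h>
 #include <stdio.h>

 int main(void)
 {
     int n = 0;
     cudaGetDeviceCount(&n);

     for (int i = 0; i < n; ++i) {
         struct cudaDeviceProp p;
         cudaGetDeviceProperties(&p, i);
         printf("GPU %d: %s, compute capability %d.%d, %.1f GB\n",
                i, p.name, p.major, p.minor,
                p.totalGlobalMem / (1024.0 * 1024.0 * 1024.0));
     }
     return 0;
 }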
P7
Installed May 2011
Decommissioned June 30, 2019
SciNet’s POWER7 (P7) cluster consisted of five IBM Power 755 servers, each with 4 × 8-core 3.3GHz POWER7 CPUs and 128GB of RAM. Similar to the POWER6 but running Linux, the POWER7 utilizes simultaneous multithreading (SMT), extending the design from 2 threads per core to 4. This allows the 32 physical cores per node to support up to 128 threads, which in many cases can lead to a significant speedup.
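As a minimal illustration of how SMT4 appears to software, the C sketch below reports the number of logical CPUs the operating system exposes (128 on a fully SMT4-enabled 32-core node) together with OpenMP's default thread count, which typically matches it.

 /* Illustrative: on a 32-core POWER7 node with SMT4 enabled, Linux
  * exposes 128 logical CPUs, and OpenMP typically defaults to that many. */
 #include <unistd.h>
 #include <omp.h>
 #include <stdio.h>

 int main(void)
 {
     long cpus = sysconf(_SC_NPROCESSORS_ONLN);  /* logical CPUs online */
     printf("logical CPUs: %ld\n", cpus);        /* expect 128          */
     printf("default OpenMP threads: %d\n", omp_get_max_threads());
     return 0;
 }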
Sandy
Installed February 2013
Decommissioned February 26, 2018
The Sandy Bridge (Sandy) cluster consisted of 76 x86-64 nodes, each with two eight-core Intel Xeon (Sandy Bridge) E5-2650 CPUs at 2.0GHz and 64GB of RAM per node. The nodes were interconnected with 2.6:1 blocking QDR InfiniBand for MPI communications and disk I/O to the SciNet GPFS filesystems. In total this cluster contained 1,216 x86-64 cores with 4,864 GB of total RAM.
Sandy was a user-contributed system acquired through a CFI LOF award to a specific PI.
Gravity
Installed December 2012
Decommissioned February 26, 2018
The Gravity cluster consisted of 49 x86-64 nodes, each with two six-core Intel Xeon (Sandy Bridge) E5-2620 CPUs at 2.0GHz and 32GB of RAM per node. Each node had two Nvidia Tesla M2090 GPUs with Compute Capability 2.0 (Fermi), each with 512 CUDA cores and 6 GB of RAM. The nodes were interconnected with 3:1 blocking QDR InfiniBand for MPI communications and disk I/O to the SciNet GPFS filesystems. In total this cluster contained 588 x86-64 cores with 1,568 GB of system RAM, and 98 GPUs with 588 GB of GPU RAM in total.
Gravity was a user-contributed system acquired through a CFI LOF award to a specific PI.
Knights Landing (KNL)
Installed August 2016
Decommissioned
This was a development/test system of four x86-64 self-hosted 2nd-generation Intel Xeon Phi (Knights Landing, KNL) nodes, aka an Intel “Ninja” platform. Each node had one 64-core Intel Xeon Phi 7210 CPU at 1.30GHz with 4 threads per core. These systems were not add-on accelerators; instead they acted as full-fledged processors running a regular Linux operating system. They were configured with 96GB of DDR4 system RAM along with 16GB of very fast MCDRAM. The nodes were connected to the rest of the clusters with QDR InfiniBand and shared the regular SciNet GPFS filesystems.
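The MCDRAM on KNL could be configured either as a transparent cache or as explicitly addressable (flat-mode) memory. The sketch below in C shows one common way to place an array in MCDRAM, via the memkind library's hbwmalloc interface; it assumes flat mode and that libmemkind is installed, neither of which is documented here for these particular nodes.

 /* Illustrative MCDRAM allocation via the memkind hbwmalloc interface.
  * Assumes flat-mode MCDRAM and that libmemkind is available. */
 #include <hbwmalloc.h>
 #include <stdio.h>
 #include <stdlib.h>

 int main(void)
 {
     size_t n = 1 << 26;                      /* 64 Mi doubles = 512 MB */

     if (hbw_check_available() != 0) {        /* 0 means MCDRAM was found */
         fprintf(stderr, "no high-bandwidth memory available\n");
         return 1;
     }

     double *a = hbw_malloc(n * sizeof *a);   /* allocation lands in MCDRAM */
     if (!a) return 1;

     for (size_t i = 0; i < n; ++i)           /* bandwidth-bound access */
         a[i] = (double)i;

     hbw_free(a);
     return 0;
 }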
Accelerator Research Cluster (ARC)
Installed June 2010
Decommissioned November 25, 2015
The Accelerator Research Cluster (ARC) was SciNet’s test bed cluster for experimenting with accelerator technology for high performance computing.
The GPU section of the ARC cluster consisted of 8 x86-64 nodes each with two quad-core Intel Xeon X5550 2.67GHz CPUs with 48GB of RAM per node. Each node had two NVIDIA Tesla M2070 GPUs with Compute Capability 2.0 (Fermi), each with 448 CUDA cores at 1.15GHz and 6 GB of RAM. The nodes were interconnected with DDR InfiniBand for MPI communication and Gigabit Ethernet for disk I/O to the SciNet GPFS filesystems. In total, this cluster contained 64 cores with 384 GB of system RAM and 16 GPUs with 96 GB GPU RAM total. The cluster ran Linux.
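As a reminder of the kind of experimentation the ARC was meant for, here is a minimal, illustrative host-side offload pattern in C using standard CUDA runtime calls; error checking and the kernel launch itself are omitted for brevity.

 /* Illustrative host-side offload pattern for a Fermi-era GPU:
  * allocate on the device, copy data over, copy it back, free. */
 #include <cuda_runtime.h>
 #include <stdio.h>
 #include <stdlib.h>

 int main(void)
 {
     const size_t n = 1 << 20;                /* 1 Mi floats = 4 MB */
     float *h = (float *)malloc(n * sizeof *h);
     float *d = NULL;

     for (size_t i = 0; i < n; ++i) h[i] = 1.0f;

     cudaMalloc((void **)&d, n * sizeof *h);              /* device buffer */
     cudaMemcpy(d, h, n * sizeof *h, cudaMemcpyHostToDevice);
     /* ... a kernel launch would normally go here ... */
     cudaMemcpy(h, d, n * sizeof *h, cudaMemcpyDeviceToHost);

     cudaFree(d);
     free(h);
     return 0;
 }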