Trinity (supercomputer)

Summary

Trinity (or ATS-1) is a United States supercomputer built by the National Nuclear Security Administration (NNSA) for the Advanced Simulation and Computing Program (ASC).[2] The aim of the ASC program is to simulate, test, and maintain the United States nuclear stockpile.

Trinity
Operators: National Nuclear Security Administration
Location: Los Alamos National Laboratory
Cost: US$174M[1]
Purpose: Primarily utilized to perform milestone weapons calculations
Website: lanl.gov/projects/trinity/

History

  • December 2013: The National Energy Research Scientific Computing Center (NERSC) and the Alliance for Computing at Extreme Scale (ACES) release a joint RFP with technical requirements for Trinity.[3]
  • July 2014: Cray announces that it has been awarded the $174 million contract by the National Nuclear Security Administration to provide a next-generation supercomputer to Los Alamos National Laboratory.[4]
  • June 2015: Haswell partition installation begins.[5]
  • November 2015: Trinity appears on the TOP500 list at #6.[6]
  • June 2016: Knights Landing (KNL) partition installation begins.[7]
  • November 2016: Trinity falls to #10 on the TOP500 list.[8]
  • July 2017: The Haswell and KNL partitions are merged.[9]
  • November 2018: Trinity regains the #6 spot on the TOP500 list.[10]
  • December 2020: Trinity falls to #13 on the TOP500 list.[11]

Trinity technical specifications

Trinity High-Level Technical Specifications[12]
Operational Lifetime: 2015 to 2020
Architecture: Cray XC40
Memory Capacity: 2.07 PiB
Peak Performance: 41.5 PF/s
Number of Compute Nodes: 19,420
Parallel File System Capacity: 78 PB (69 PiB)
Burst Buffer Capacity: 3.7 PB
Footprint: 4,606 sq ft
Power Requirement: 8.6 MW

Compute Tier

Trinity was built in two stages. The first stage incorporated the Intel Xeon Haswell processor, while the second stage added a significant performance increase using the Intel Xeon Phi Knights Landing processor. The combined system has 301,952 Haswell and 678,912 Knights Landing processor cores, yielding a total peak performance of over 40 PF/s (petaflops).[13]
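The counts above can be cross-checked against the node total in the specifications table. The following is a minimal sketch, treating the quoted figures as core counts and assuming 32 cores per Haswell node and 68 cores per Knights Landing node. Neither per-node figure is stated in this article, and the function and constant names are illustrative.

```python
# Counts quoted in the article.
HASWELL_CORES = 301_952
KNL_CORES = 678_912

# Assumptions (not stated in the article): cores per compute node.
HASWELL_CORES_PER_NODE = 32  # assumed: two 16-core Xeon sockets per node
KNL_CORES_PER_NODE = 68      # assumed: one 68-core Xeon Phi per node


def node_count(total_cores: int, cores_per_node: int) -> int:
    """Return the whole number of nodes, failing if the division is not exact."""
    nodes, remainder = divmod(total_cores, cores_per_node)
    if remainder:
        raise ValueError("core count is not a multiple of cores per node")
    return nodes


haswell_nodes = node_count(HASWELL_CORES, HASWELL_CORES_PER_NODE)
knl_nodes = node_count(KNL_CORES, KNL_CORES_PER_NODE)
total_nodes = haswell_nodes + knl_nodes
print(haswell_nodes, knl_nodes, total_nodes)  # 9436 9984 19420
```

Under these assumptions both divisions are exact, and the sum matches the 19,420 compute nodes listed in the specifications.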

Storage Tiers

There are five primary storage tiers: memory, burst buffer, parallel file system, campaign storage, and archive.[14]

Memory

The machine has 2 PiB of DDR4 DRAM as physical memory. Each processor also has DRAM built onto the tile, providing additional memory capacity. Data in this tier is highly transient: it typically resides for only a few seconds and is overwritten continuously.[15]

Burst Buffer

Cray supplied 300 XC40 DataWarp blades, each containing 2 burst buffer nodes and 4 SSDs. This tier holds a total of 3.78 PB of storage and can move data at up to 2 TB/s. Data typically resides in this tier for a few hours and is overwritten on roughly the same time frame.[16]
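The figures in this paragraph imply some derived quantities that are easy to work out. The sketch below reads the "4 SSDs" figure as per blade and assumes the capacity is spread evenly across drives; both are assumptions, and the variable names are illustrative.

```python
# Figures quoted in the text.
BLADES = 300
NODES_PER_BLADE = 2
SSDS_PER_BLADE = 4       # assumption: the "4 SSDs" figure is per blade
CAPACITY_PB = 3.78       # decimal petabytes
BANDWIDTH_TB_S = 2.0     # decimal terabytes per second, peak rate

nodes = BLADES * NODES_PER_BLADE
ssds = BLADES * SSDS_PER_BLADE

# Assumption: capacity is divided evenly across all SSDs.
tb_per_ssd = CAPACITY_PB * 1000 / ssds

# Time to write the whole tier once at the peak rate.
fill_seconds = CAPACITY_PB * 1000 / BANDWIDTH_TB_S

print(f"{nodes} nodes, {ssds} SSDs, {tb_per_ssd:.2f} TB per SSD")
print(f"full fill at peak rate: {fill_seconds / 60:.1f} minutes")
```

At the peak rate the whole tier could be rewritten in about half an hour, which is consistent with data residing there for hours rather than days.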

Parallel File System

Trinity uses a Sonexion-based Lustre file system with a total capacity of 78 PB. Throughput on this tier is about 1.8 TB/s (1.6 TiB/s). It is used to stage data in preparation for HPC operations. Data typically resides in this tier for several weeks.
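The article mixes decimal units (PB, TB) with binary units (PiB, TiB). As a minimal sketch of how the paired figures relate, the hypothetical helper below converts a value expressed in powers of 1000 bytes to the same quantity in powers of 1024 bytes.

```python
def decimal_to_binary(value: float, power: int) -> float:
    """Convert a value in units of 1000**power bytes to units of 1024**power bytes."""
    return value * 1000**power / 1024**power


capacity_pib = decimal_to_binary(78, 5)      # PB -> PiB (power 5 = peta/pebi)
throughput_tib = decimal_to_binary(1.8, 4)   # TB/s -> TiB/s (power 4 = tera/tebi)
print(f"{capacity_pib:.1f} PiB, {throughput_tib:.1f} TiB/s")  # 69.3 PiB, 1.6 TiB/s
```

The results agree with the 69 PiB and 1.6 TiB/s quoted alongside the decimal figures.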

Campaign Storage

The MarFS file system occupies the campaign storage tier and combines properties of the POSIX and object storage models. The capacity of this tier is growing at a rate of about 30 PB/year, with a current capacity of over 100 PB. In testing, LANL scientists were able to create 968 billion files in a single directory at a rate of 835 million file creations per second. This storage is designed to be more robust than typical object storage, while sacrificing some of the end-user functionality that a POSIX system provides. Throughput on this tier is between 100 and 300 GB/s. Data residence in this tier is longer term, typically lasting several months.
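The two test figures above imply how long the single-directory run lasted. This is a back-of-the-envelope sketch that assumes the quoted creation rate was sustained for the whole run; the article does not say whether the rate is a peak or an average.

```python
FILES_CREATED = 968e9        # 968 billion files, from the text
CREATES_PER_SECOND = 835e6   # 835 million creations per second, from the text

# Assumption: the rate is a sustained average over the whole test.
duration_s = FILES_CREATED / CREATES_PER_SECOND
print(f"{duration_s:.0f} s, about {duration_s / 60:.0f} minutes")  # 1159 s, about 19 minutes
```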

Key design goals

  • Transparency
  • Data protection
  • Recoverability
  • Ease of administration

MarFS is an open-source file system; its source code is available at https://github.com/mar-file-system/marfs

Archive

The final storage tier is the archive, an HPSS tape file system that holds approximately 100 PB of data.
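The five tiers described above can be summarized side by side. The sketch below only restates figures from this section in one structure; the `Tier` class and `TIERS` list are illustrative names, and `None` marks values the article does not give.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tier:
    name: str
    capacity: str
    throughput: Optional[str]  # None where the article gives no figure
    residence: Optional[str]   # typical data residence time, if stated


# Ordered from fastest and most transient to slowest and most durable.
TIERS = [
    Tier("Memory", "2 PiB", None, "seconds"),
    Tier("Burst Buffer", "3.78 PB", "up to 2 TB/s", "hours"),
    Tier("Parallel File System", "78 PB", "about 1.8 TB/s", "weeks"),
    Tier("Campaign Storage", "over 100 PB", "100-300 GB/s", "months"),
    Tier("Archive", "approximately 100 PB", None, None),
]

for tier in TIERS:
    print(f"{tier.name:22} {tier.capacity:22} "
          f"{tier.throughput or '-':16} {tier.residence or '-'}")
```

Read top to bottom, capacity and residence time grow while throughput falls, which is the trade-off the tiering is built around.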

 
Infographic on Trinity's file storage system.

References

  1. ^ "Cray Awarded $174 Million Supercomputer Contract From the National Nuclear Security Administration". Retrieved 2014-08-24.
  2. ^ Morgan, Timothy Prickett (1 October 2020). "With "Crossroads" Supercomputer, HPE Notches Another DOE Win". The Next Platform. Retrieved 5 November 2020.
  3. ^ "Trinity / NERSC-8 RFP". Archived from the original on 2018-11-26. Retrieved 2018-11-26.
  4. ^ http://investors.cray.com/phoenix.zhtml?c=98390&p=irol-newsArticle&ID=1946457&highlight=
  5. ^ "Trinity Supercomputer's Haswell and KNL Partitions Are Merged". 19 July 2017.
  6. ^ "Novermber [sic] 2015 | TOP500".
  7. ^ "LANL Adds Capacity to Trinity Supercomputer for Stockpile Stewardship". 24 July 2017.
  8. ^ "November 2016 | TOP500".
  9. ^ "Trinity Supercomputer's Haswell and KNL Partitions Are Merged". 19 July 2017.
  10. ^ "November 2018 | TOP500".
  11. ^ "NNSA supercomputers recognized worldwide for speed and performance". Energy.gov. Retrieved 2023-11-13.
  12. ^ "Technical Specifications".
  13. ^ "Trinity Supercomputer's Haswell and KNL Partitions Are Merged". 19 July 2017.
  14. ^ https://www.snia.org/sites/default/files/SDC/2018/presentations/General_Session/Grider_Gary_Storage_Lessons_from_HPC_A_Multi-Decadal_Struggle.pdf
  15. ^ https://www.snia.org/sites/default/files/SDC/2018/presentations/General_Session/Grider_Gary_Storage_Lessons_from_HPC_A_Multi-Decadal_Struggle.pdf
  16. ^ https://www.snia.org/sites/default/files/SDC/2018/presentations/General_Session/Grider_Gary_Storage_Lessons_from_HPC_A_Multi-Decadal_Struggle.pdf