This article was originally published by Stephanie Dascola on arc-ts.umich.edu on January 27, 2021.
On the cutting edge of research at U-M is the Advanced Genomics Core’s Illumina NovaSeq 6000 sequencing platform. The AGC is one of the first academic core facilities to optimize this exciting and powerful instrument, which is about the size of a large laser printer.
The Advanced Genomics Core (AGC), part of the Biomedical Research Core Facilities within the Medical School Office of Research, provides high-quality, low-cost next-generation sequencing analysis for research clients on a recharge basis.
One NovaSeq run can generate as much as 4TB of raw data. So how is the AGC able to generate, process, analyze, and transfer so much data for researchers? They have partnered with Advanced Research Computing – Technology Services (ARC-TS) to leverage the speed and power of the Great Lakes High-Performance Computing Cluster.
With Great Lakes, the AGC can process the data, store the output on two other ARC-TS services, Turbo Research Storage and Data Den Research Archive, and share it with clients using Globus File Transfer. All three services work together: Turbo offers the capacity and speed to match the computational performance of Great Lakes, Data Den provides an archive of raw data in case of catastrophic failure, and Globus has the performance needed to transfer big data.
“Thanks to Great Lakes, we were able to process dozens of large projects simultaneously, instead of being limited to just a couple at a time with our in-house system,” said Olivia Koues, Ph.D., AGC managing director.
“In calendar year 2020, the AGC delivered nearly a half petabyte of data to our research community. We rely on the speed of Turbo for storage, the robustness of Data Den for archiving, and the ease of Globus for big data file transfers. Working with ARC-TS has enabled incredible research such as making patients resilient to COVID-19. We are proudly working together to help patients.”
“Our services process more than 180,000GB of raw data per year for the AGC. That’s the same as streaming the three original Star Wars movies and the three prequels more than 6,000 times,” said Brock Palen, ARC-TS director. “We enjoy working with AGC to assist them into the next step of their big data journey.”
ARC-TS is a division of Information and Technology Services (ITS). The Advanced Genomics Core (AGC) is part of the Biomedical Research Core Facilities (BRCF) within the Medical School Office of Research.