High Performance Computing Facility (HPCF)
1. Overview (Vision, Goals, Services)
The NYULMC-HPC is a project under continuing development at the Center for Health Informatics and Bioinformatics (CHIBI) of NYULMC. The guiding vision is to enable step changes in the quality and range of research through advanced computing hardware, algorithms, and integrative services driven by best practices.
The goals of the NYULMC-HPC are threefold:
- Technical Infrastructure: to provide a state-of-the-art local facility that combines substantial computing and storage capabilities to enable projects that require significant computational resources.
- Consulting Services: to provide integrated HPC consulting services to NYU biomedical researchers. These services function within the integrated best practices informatics (BPIC) consulting model of the NYULMC Informatics Center. They are designed as a “one-stop shop” through which researchers receive grant preparation assistance, project design help, cost-effectiveness analysis of locally and externally applicable HPC solutions, and project execution services.
- Utilization of External Resources: The NYULMC-HPC is open to using the best solution for the individual needs of each project and recognizes that no single solution is optimal for all projects. To that end, we have user privileges on New York State-provided HPC resources, including the New York Center for Computational Sciences at Stony Brook and Brookhaven, the Computational Center for Nanotechnology Innovations at RPI, and the services provided by the NYS HPC Consortium (HPC2). In addition, when appropriate, we plan to redirect projects in part or in whole to DOE-sponsored leadership machines and to suitable commercially available solutions, including Cloud and Grid computing.
2. Planning Process and Phases of the NYULMC-HPC
Primary planning was initiated and executed by the NYULMC Informatics Center in 2008-9 and identified local research needs projected for the following 5 years. The NYULMC-HPC is being developed in the following phases:
- Phase I is designed to meet immediate next-generation sequencing assay needs as well as a relatively small number of genetic/genomic analyses and computational methods experiments.
- Phase II is designed to meet the explosively expanding needs for high-scale genomic assays and corresponding bioinformatics, the needs for extensive methods benchmarking for the best practices bioinformatics consultation service (which supports 100+ scientific projects/year), and all of the projects identified in the aforementioned user needs analysis.
- Phase III represents future expansion planning that will be needed in order to accommodate multi-institutional projects such as the New York Genome Center initiative and the NYU-Abu Dhabi research projects which are currently in planning or funding review stages.
The resources dedicated to the NYULMC-HPC reflect a philosophy of growing organically to meet actual demand, with attention to fiscal viability. The facility also benefits from a large array of shared Informatics resources, outlined below. Collectively, the dedicated and shared resources make the NYULMC-HPC a powerful new catalyst for advanced research projects.
A 1.5 PB (petabyte) scalable, high-bandwidth Isilon storage system. This system (deployed in November 2010) is critical for next-generation research, such as projects using high-throughput sequencing, RNAi screen assays, microarrays, proteomics, and other high-throughput assays.
Phase I (in operation throughout 2009-10):
- FPGA sequence alignment server
- Roche 454 analysis cluster
- 64-core SUN Blade Cluster for Illumina GA sequencing analysis and general purpose computing needs
Phase II (operation started in December 2010):
- 712 processing cores of the latest generation of Intel CPUs
- 5 Nvidia Tesla GPU units with 4000 GPU cores in total
- 1 very-high-memory server (512 GB of RAM)
- Most of the equipment is housed in the HCC C20 data room facility of NYULMC
- Staff are housed in the Informatics Center headquarters on the 7th floor of 227 30th Street (“Dry Lab Building”). Please see Contacts for more info.
- HPC architecture & management: 2 dedicated FTEs
- Storage, network, power/cooling, cybersecurity, and backup services: provided by the MCIT department through multiple shared FTEs
- Informatics faculty with operational and scientific roles: 5 faculty
- Informatics faculty with parallel algorithm development expertise for high-dimensional complex data analytics: 2 faculty
- Informatics scientific and administrative staff (shared): 2 administrative and 4 analyst/programmers