NYU Langone COVID-19 Data Challenge | NYU Langone Health

NYU Langone COVID-19 Data Challenge

NYU Langone Health’s Medical Center Information Technology (MCIT) and the Department of Population Health invite all clinicians, clinical researchers, data scientists, biostatisticians, and students to propose research questions that can be addressed using the COVID-19 Deidentified Dataset, or to devise novel data visualizations and data science techniques that yield insights from NYU Langone’s experience combating the coronavirus disease 2019 (COVID-19) pandemic.

Informational Session

View the Informational Webinar that was held on October 23, 2020, to learn about the challenge and about how to participate (Kerberos ID and password required).

COVID-19 Data Challenge Collaborators

MCIT DataCore
MCIT Data Architecture and Management
NYU Health Sciences Library
Vilcek Institute of Graduate Biomedical Sciences
NYU Langone’s Department of Population Health, Division of Biostatistics
NYU School of Global Health, Department of Biostatistics
Center for Healthcare Innovation and Delivery Science, Predictive Analytics Unit

COVID-19 Data Challenge Timeline

Informational Webinar
October 23, 2020

Introduction to Technology Webinar
November 3, 2020: 3:00PM–4:00PM
November 5, 2020: 11:00AM–12:00PM

Call for Participation Closed
November 8, 2020

Teams Announced
November 13, 2020

Working Webinars
November 16, 18, and 23, 2020: 1:00PM–5:00PM

Project Submission Deadline
December 2, 2020

Final Webinar
December 9, 2020

COVID-19 Data Challenge Technology

The data challenge uses the NYU Langone enterprise data lake, which is built on the Hadoop big data platform. The data can be queried with SQL through the Hue tool, and can also be accessed from analytic languages and tools such as Python, R, SAS, SPSS, and Tableau.

A Hadoop virtual desktop (Hadoop VDI) has been created for the challenge. It is preconfigured for Hadoop access, including Kerberos authentication and ODBC connectivity, and has Anaconda and RStudio installed. Python, R, and Stata developers should use this desktop. SAS and SPSS developers will need to access the data lake from a PC that has SAS or SPSS installed.
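
For Python users, the ODBC route described above might look roughly like the sketch below. It is only an illustration under stated assumptions: the DSN name `datalake`, the `encounters` table, and its columns are hypothetical and do not come from the challenge materials, and the use of `pyodbc` on the virtual desktop is assumed rather than confirmed. So that the example runs anywhere, an in-memory SQLite database stands in for the data lake. Both `sqlite3` and `pyodbc` follow the Python DB-API and accept `?` placeholders, so the same query helper would apply to either connection.

```python
import sqlite3
from contextlib import closing


def build_odbc_connection_string(dsn):
    """Build a DSN-based ODBC connection string.

    Assumption: the desktop already holds a Kerberos ticket and the DSN is
    preconfigured, so no user name or password is placed in the string.
    """
    return f"DSN={dsn};"


def fetch_rows(connection, sql, params=()):
    """Run a parameterized query on a DB-API 2.0 connection and return all rows."""
    with closing(connection.cursor()) as cursor:
        cursor.execute(sql, params)
        return [tuple(row) for row in cursor.fetchall()]


def demo():
    """Exercise fetch_rows against an in-memory SQLite stand-in.

    On the virtual desktop, one would instead open the connection with
    something like (assuming pyodbc is installed and the DSN exists):
        pyodbc.connect(build_odbc_connection_string("datalake"), autocommit=True)
    """
    conn = sqlite3.connect(":memory:")
    with closing(conn):
        # Hypothetical table and columns, invented purely for illustration.
        conn.execute("CREATE TABLE encounters (admit_month TEXT, discharged INTEGER)")
        conn.executemany(
            "INSERT INTO encounters VALUES (?, ?)",
            [("2020-03", 1), ("2020-03", 0), ("2020-04", 1), ("2020-04", 1)],
        )
        # Count discharged encounters per admission month.
        return fetch_rows(
            conn,
            "SELECT admit_month, COUNT(*) FROM encounters "
            "WHERE discharged = ? GROUP BY admit_month ORDER BY admit_month",
            (1,),
        )


if __name__ == "__main__":
    for month, count in demo():
        print(month, count)
```

Keeping the query helper independent of the driver means a query can be tried against local test data first and then pointed at the real connection once access to the virtual desktop is set up.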

Please join us for an Introduction to Technology webinar to learn how to access the data using the virtual desktop environments. The webinar will be offered twice: on November 3, 2020, from 3:00PM to 4:00PM, and on November 5, 2020, from 11:00AM to 12:00PM.

Additionally, we will be using WebEx Teams to host team-based collaborative resources.

Contact Us

Email us at coviddatachallenge@nyulangone.org with questions.