AWS Public Sector Blog

Solving medical mysteries in the AWS Cloud: Medical data-sharing innovation through the Undiagnosed Diseases Network

For many patients, the National Institutes of Health’s Undiagnosed Diseases Network (UDN) is their last hope for finding information. The UDN is a network of doctors and researchers who specialize in diagnosing the tens of millions of people who live with rare diseases for years, or even decades, without a name for their symptoms. Clinicians and researchers of the UDN work together to identify and treat these mysterious conditions.

It takes a medical village to discover and diagnose these rare diseases. The UDN is made up of a coordinating center, 12 clinical sites, a model organism screening center, a metabolomics core, a sequencing core, and a biorepository. For many years prior to the UDN, the experts at these sites were limited by antiquated data-sharing procedures. The UDN leadership realized that to scale up and serve as many patients as possible, they needed to transform how they process, store, and share medical data, which led the UDN to the Amazon Web Services (AWS) Cloud.

Creating a seamless flow of data between the clinic and the lab

The UDN has two primary goals: to provide answers for patients and families through their medical clinics and to learn more about rare diseases through research. In the medical system more broadly, databases are typically kept separate: one for clinical work and one for research. But such a system limits what both clinicians and researchers can learn. The experts in the UDN knew that combining the two datasets would let clinical and research data inform each other, and would help them spot patients with similar symptoms earlier so those patients can receive a diagnosis sooner. “From the beginning, we knew we wanted to have all this data in one place,” said Dr. Paul Avillach, assistant professor at Harvard Medical School, who participated in the migration to an AWS Cloud-based dual-model platform, which combines research data with a patient portal. “It was a unique opportunity because we had the funding, knowledge, and willingness of UDN to make this happen.”

Working with AWS, the UDN designed a custom platform that combines the research database with both a clinician-facing and a patient-facing portal. Today, doctors, patients, and researchers all log in to a single platform, and patient medical information is shared seamlessly between researchers and clinicians, all with the patient’s consent. Additionally, building in the AWS Cloud helps keep all health data shared on the platform secure and compliant, so patients can rest assured their information is safe.

“This platform is also a huge improvement on the patient side,” added Kimberly LeBlanc, genetic counselor and director of the UDN coordinating center. “Having the data together allows us to know more about our patients. So much of healthcare can benefit from bringing these data sets together. As clinicians, we can learn from our data and improve care as a result.”

Scaling up data-processing to find answers

After the success of their cloud-based platform, UDN leaders realized they could now leverage the AWS Cloud in another way—by scaling up their data-processing capabilities to analyze more genetic sequencing data, faster.

A pivotal part of the UDN’s approach is a data-processing technique called joint variant calling, in which genetic sequencing data is compared against hundreds of other samples to find similarities. When a new diagnosis is made, joint variant calling can help researchers determine if other patients in the UDN system might suffer from the same disease. “If these patients have common variants,” LeBlanc explains, “you can then make discoveries about undiagnosed conditions. Patients have been waiting years and years for this information. These aren’t easy cases. But with joint variant calling, we can compare these patients and find things in common.”
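To make the comparison idea concrete, here is a minimal Python sketch of looking for variants that more than one patient carries. It is not joint variant calling itself, which dedicated genomics tools perform on sequencing data, and it is not the UDN’s pipeline. The sample IDs, the variant coordinates, and the `shared_variants` helper are all hypothetical and exist only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each sample ID maps to the set of variants found in it.
# A variant is written as (chromosome, position, reference base, alternate base).
# These coordinates are made up and do not correspond to real findings.
calls = {
    "patient_A": {("chr1", 1000, "A", "G"), ("chr2", 2000, "C", "T")},
    "patient_B": {("chr3", 3000, "G", "A")},
    "patient_C": {("chr1", 1000, "A", "G"), ("chr4", 4000, "T", "C")},
}


def shared_variants(calls, min_samples=2):
    """Return variants carried by at least `min_samples` samples.

    The result maps each shared variant to a sorted list of the samples
    that carry it.
    """
    carriers = defaultdict(set)
    for sample, variants in calls.items():
        for variant in variants:
            carriers[variant].add(sample)
    return {
        variant: sorted(samples)
        for variant, samples in carriers.items()
        if len(samples) >= min_samples
    }


if __name__ == "__main__":
    for variant, samples in shared_variants(calls).items():
        print(variant, "is shared by", ", ".join(samples))
```

In this toy cohort, only the made-up `chr1` variant appears in more than one patient. A real analysis works at a very different scale and adds quality filtering and clinical interpretation, which is part of why processing hundreds of samples together benefits from scalable compute.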

In the past, these types of analyses ran on on-premises infrastructure, which was time-consuming and inefficient. The team used AWS Cloud-based tools for processing bioinformatics and genomics data to analyze more than 800 samples at once, which was impossible in their previous on-premises environment. “AWS enabled us to process this data in a timely and much more scalable fashion,” said Dr. Avillach. “Right now we have 800 samples, but in a few years, it could be 5,000 or even 10,000. Moving to the AWS Cloud opens up the opportunity to scale UDN’s services to another order of magnitude.”

Changing patients’ lives with cloud-based data-sharing

For years, six-year-old Elizabeth Nagorniak failed to meet developmental milestones, which baffled doctors and troubled her mother, Mari Hanada. After more than three years without answers, Hanada turned to the UDN team, who diagnosed Elizabeth with a variant of Smith-Kingsmore syndrome, a rare condition connected to a mutation of the MTOR gene. With a diagnosis in hand, Hanada was able to seek treatment for her daughter. “She’s getting new skills weekly now,” Hanada told reporters. “It used to be annually.”

Elizabeth is just one of the hundreds of patients diagnosed by the UDN. And now that the UDN is working with AWS, the number of patients they evaluate can grow—so even more people can find a name for their condition and a path toward healing. “We are working with AWS to create the infrastructure that enables us to do better research and connect it with better clinical care,” Dr. Avillach says. “This is a ground-breaking approach. It’s truly a model of how research should be done in the future.”

Learn more about how AWS can help researchers innovate in healthcare and beyond at the AWS for research and technical computing hub, or contact the AWS Research team directly.


Isaac Kohane, MD, PhD

Isaac Kohane, MD, PhD, is the inaugural Chair of the Department of Biomedical Informatics and the Marion V. Nelson Professor of Biomedical Informatics at Harvard Medical School. He develops and applies computational techniques to address disease at multiple scales, from whole healthcare systems as “living laboratories” to the functional genomics of neurodevelopment with a focus on autism. Kohane’s i2b2 project is currently deployed internationally to over 120 major academic health centers to drive discovery research in disease and pharmacovigilance (including providing evidence on drugs that ultimately contributed to a “boxed warning” by the FDA). Dr. Kohane has published several hundred papers in the medical literature and authored the widely used book Microarrays for an Integrative Genomics. He is a member of the Institute of Medicine and the American Society for Clinical Investigation.

Kimberly LeBlanc

Kimberly LeBlanc is a genetic counselor and the Director of the Undiagnosed Diseases Network (UDN) Coordinating Center in the Department of Biomedical Informatics at Harvard Medical School. As part of the UDN Coordinating Center, Kimberly supervises the direct interactions with participants and works with clinicians and investigators to develop and implement network-wide clinical and participant engagement protocols and research projects. She received her Master’s degree in Human Genetics and Genetic Counseling from the Stanford University School of Medicine.

Ankit Malhotra

Ankit Malhotra is the worldwide genomics lead on the Amazon Web Services (AWS) Public Sector healthcare team. At AWS, Ankit helps healthcare and biomedical research customers in the public sector integrate genomics into their workloads, helping them accelerate and innovate using the AWS Cloud. With cross-training in computer science, molecular biology, and genetics, he has over 10 years of experience as an NIH-funded computational genomic scientist.

Chris Noonan

Chris Noonan is a principal account manager at Amazon Web Services (AWS), and has been working with our enterprise healthcare and education customers since 2017. Chris is passionate about his work with the research community at Harvard Medical School (HMS) and their collaborators, and about partnering with them to use the power of the AWS Cloud to transform the way they enable research and to improve the quality of healthcare and the lives of their patients.

Christine Tsien Silvers, MD, PhD

Christine Tsien Silvers, MD, PhD, serves as Academic Medicine Business Development Executive at AWS. After training at the Massachusetts Institute of Technology, Harvard Medical School, Massachusetts General Hospital, and Brigham and Women’s Hospital, she served for ten years as Chief Medical Officer and led Clinical Informatics at two digital health startups. Board certified in both Emergency Medicine and Clinical Informatics, Chris is passionate about leveraging technology to improve health.

Heather Matson

Heather Matson is a senior business development manager on the Amazon Web Services (AWS) research team. Heather helps researchers leverage the cloud to optimize their research.

Paul Avillach, MD, PhD

Trained as an epidemiologist and biomedical computer scientist, Dr. Avillach focuses his investigations on translational bioinformatics. Dr. Avillach is passionate about combining clinical and genomics data across different scales and resolutions to enable new perspectives for essential biomedical questions. His research focuses on the development of novel methods and techniques for the integration of multiple heterogeneous clinical cohorts, electronic health record data, and multiple types of genomics data to encompass biological observations. Dr. Avillach architected and led the development of the PIC-SURE platform, an open source analytic framework capable of leveraging multiple sources of clinical and genomic data. It is now deployed in many institutions and projects, including the NIH FISMA NHLBI BioData Catalyst platform, which integrates clinical and whole genome data from 250K participants, and the Boston Children’s Hospital Biobank, with 2.9M patients. Dr. Avillach architected and led the initial development of Service Workbench on Amazon Web Services (AWS), the first open-source native cloud computing platform, which provides a modular and scalable solution for supplying computing environments to researchers.