AWS has set up a large data lake, built on S3 storage buckets, that holds information and resources on COVID-19 to help medical researchers fight the global pandemic. Making sense of many disparate data sets is critical for researchers looking for ways to battle the disease.

The AWS COVID-19 data lake became generally available on April 8, 2020, providing a repository of curated data sets about the coronavirus. The information includes case tracking data, hospital bed availability and research articles. Beyond providing a repository for data, AWS is connecting analysis and querying tools to it, including Amazon Athena for queries, Amazon QuickSight for visualization, AWS Data Exchange for subscribing to data sets and Amazon Kendra for exploring research articles.

AWS’ data lake efforts have been successful in the market for some straightforward reasons. AWS has more security certifications than any other vendor, and it can ingest, store and release many different data types, from structured and columnar data to unstructured data like photos, videos, text and audio.

For the COVID-19 data lake, AWS automatically curates the data and keeps it up to date so that it is ready for analysis through several analytics and machine learning engines. AWS would normally charge for the Athena queries and additional data services used alongside the data, but it is easing that cost for researchers through the AWS Diagnostic Development Initiative (DDI). Through that initiative, AWS is providing service credits and technical support for diagnostic research, and it is working with scientists and researchers to meet their evolving needs.

