NASA to launch 247 petabytes of data into AWS – but forgot about eye-watering cloudy egress costs before lift off

  • May 7, 2020

NASA needs 215 more petabytes of storage by the year 2025, and expects Amazon Web Services to provide the bulk of that capacity. However, the space agency didn’t realize this would cost it plenty in cloud egress charges. As in, it will have to pay as scientists download its data.
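To give a sense of how egress billing adds up, here's a back-of-envelope sketch. The tiered per-gigabyte rates below are illustrative assumptions modeled on AWS's public data-transfer-out price structure at the time, not NASA's actual negotiated pricing:

```python
# Illustrative AWS-style tiered egress pricing: (tier size in GB, $/GB).
# These rates are ASSUMED for the sake of the example, not actual pricing.
TIERS = [
    (10 * 1024, 0.09),      # first 10 TB per month
    (40 * 1024, 0.085),     # next 40 TB per month
    (100 * 1024, 0.07),     # next 100 TB per month
    (float("inf"), 0.05),   # everything beyond that
]

def egress_cost(gb_out: float) -> float:
    """Cost in USD to move gb_out gigabytes out of the cloud in one month."""
    cost, remaining = 0.0, gb_out
    for tier_size, rate in TIERS:
        used = min(remaining, tier_size)
        cost += used * rate
        remaining -= used
        if remaining <= 0:
            break
    return cost

# If scientists pull just 1 PB (1,048,576 GB) in a month:
print(f"${egress_cost(1024 * 1024):,.2f}")  # → $56,320.00
```

At these assumed rates, a single petabyte downloaded in a month runs well past $50,000 — and the charge recurs every time the data leaves the cloud, which is exactly the cost NASA's planning overlooked.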

That omission alone has left NASA’s cloud strategy pointing at the ground rather than at the heavens. The data in question will come from NASA’s Earth Science Data and Information System (ESDIS) program, which collects information from the many missions that observe our planet. NASA makes those readings available through the Earth Observing System Data and Information System (EOSDIS).

To store all the data and run EOSDIS, NASA operates a dozen Distributed Active Archive Centers (DAACs) that provide pleasing redundancy. But NASA is tired of managing all that infrastructure, so in 2019, it picked AWS to host it all, and started migrating its records to the Amazon cloud as part of a project dubbed Earthdata Cloud. The first cut-over from on-premises storage to the cloud was planned for Q1 2020, with more to follow.

The agency expects to transfer data off-premises for years to come.

Source: co.uk
