How Amazon is solving big-data challenges with data lakes

  • January 31, 2020

Back when Jeff Bezos filled orders in his garage and drove packages to the post office himself, crunching the numbers on costs, tracking inventory, and forecasting future demand was relatively simple. Fast-forward 25 years, and Amazon’s retail business has more than 175 fulfillment centers (FCs) worldwide, with over 250,000 full-time associates shipping millions of items per day. Amazon’s worldwide financial operations team has the enormous task of tracking all of that data (think petabytes).

At Amazon’s scale, a miscalculated metric, like cost per unit, or delayed data can have a huge impact (think millions of dollars). The team is constantly looking for ways to get more accurate data, faster.

That’s why, in 2019, they had an idea: build a data lake that can support one of the largest logistics networks on the planet. It would later become known internally as the Galaxy data lake. The Galaxy data lake was built that same year, and teams are now working on moving their data into it.

Source: allthingsdistributed.com
