How Amazon is solving big-data challenges with data lakes

How Amazon is solving big-data challenges with data lakes

  • January 31, 2020
Table of Contents

How Amazon is solving big-data challenges with data lakes

Back when Jeff Bezos filled orders in his garage and drove packages to the post office himself, crunching the numbers on costs, tracking inventory, and forecasting future demand was relatively simple. Fast-forward 25 years, Amazon’s retail business has more than 175 fulfillment centers (FC) worldwide with over 250,000 full-time associates shipping millions of items per day. Amazon’s worldwide financial operations team has the incredible task of tracking all of that data (think petabytes).

At Amazon’s scale, a miscalculated metric, like cost per unit, or delayed data can have a huge impact (think millions of dollars). The team is constantly looking for ways to get more accurate data, faster.

That’s why, in 2019, they had an idea: Build a data lake that can support one of the largest logistics networks on the planet. It would later become known internally as the Galaxy data lake. The Galaxy data lake was built in 2019 and now all the various teams are working on moving their data into it.

Source: allthingsdistributed.com

Tags :
Share :
comments powered by Disqus

Related Posts

Database Migration To Amazon Aurora

Database Migration To Amazon Aurora

In this blog post we’ll show you how we migrated a critical Postgres database with 18Tb of data from Amazon RDS (Relational Database Service) to Amazon Aurora, with minimal downtime. To do so, we’ll discuss our experience at Codacy.

Read More
A Technical Analysis of the Capital One Hack

A Technical Analysis of the Capital One Hack

The recent disclosure of yet another cloud security misconfiguration leading to the loss of sensitive personal information made the headlines this past week. This particular incident came with a bit more information from the indictment of the accused party, allowing us to piece together the revealed data and take an educated guess as to what may have transpired leading up to the loss of over 100 million credit card applications and 100 thousand social security numbers. At the root of the hack lies a common refrain: the misconfiguration of cloud infrastructure resources allowed an unauthorized user to elevate her privileges and compromise sensitive documents.

Read More
Toward a bastion-less world

Toward a bastion-less world

Using a bastion or jump server has been a common way to allow access to secure infrastructure in your virtual private cloud (VPC) and is integrated into several Quick Starts. Amazon Web Services (AWS) has recently released two new features that allow us to connect securely to private infrastructure without the need for a bastion host. This greatly improves your security and audit posture by centralizing access control and reducing inbound access.

Read More