Matplotlib—Making data visualization interesting

Data visualization is a key step in understanding a dataset and drawing inferences from it. While one can always inspect the data row by row and cell by cell, doing so is tedious and does not highlight the big picture. Visuals, on the other hand, define data in a form that is easy to […]

Python Data Visualization 2018: Why So Many Libraries?

This post is the first in a three-part series on the state of Python data visualization tools and the trends that emerged from SciPy 2018. By James A. Bednar. At a special session of SciPy 2018 in Austin, representatives of a wide range of open-source Python visualization tools shared their visions for the future of data visualization […]

An Introduction to Hashing in the Era of Machine Learning

In December 2017, researchers at Google and MIT published a provocative research paper about their efforts into “learned index structures”. The research is quite exciting, as the authors state in the abstract: Indeed, the results presented by the team of Google and MIT researchers include findings that could signal new competition for the most venerable […]

Altair: Declarative Visualization in Python

With Altair, you can spend more time understanding your data and its meaning. Altair’s API is simple, friendly, and consistent, and it is built on top of the powerful Vega-Lite visualization grammar. This elegant simplicity produces beautiful and effective visualizations with a minimal amount of code. Source: github

Exabytes in a Test Tube: The Case for DNA Data Storage

Our ability to sequence, synthesize, and edit DNA has advanced at a previously inconceivable speed. Far from being expensive and impractical, these DNA technologies are the most disruptive in all of biotechnology. It’s now possible to write custom DNA strands for pennies per base pair, at least for short strands. Two companies, GenScript Biotech Corp. […]

An Introduction to Hashing in the Era of Machine Learning

New research is an excellent opportunity to reexamine the fundamentals of a field, and it’s not often that something as fundamental (and well studied) as indexing experiences a breakthrough. This article serves as an introduction to hash tables, an abbreviated examination of what makes them fast and slow, and an intuitive view of the machine […]

15 Types of Regression you should know

Regression techniques are among the most popular statistical techniques used for predictive modeling and data mining tasks. On average, analytics professionals know only 2-3 types of regression that are commonly used in the real world, namely linear and logistic regression. But the fact is there are more than 10 types of regression algorithms designed […]

Berkeley offers its fastest-growing course – data science – online for free

The fastest-growing course in UC Berkeley’s history — Foundations of Data Science — is being offered free online this spring for the first time through the campus’s online education hub, edX. Data science is becoming important to more and more people because the world is increasingly data-driven — and not just science and tech but […]

Your Data Is Crucial to a Robotic Age. Shouldn’t You Be Paid for It?

The idea has been around for a bit. Jaron Lanier, the tech philosopher and virtual-reality pioneer who now works for Microsoft Research, proposed it in his 2013 book, “Who Owns the Future?,” as a needed corrective to an online economy mostly financed by advertisers’ covert manipulation of users’ consumer choices. Source: nytimes