An Introduction to Hashing in the Era of Machine Learning

  • May 14, 2018

In December 2017, researchers at Google and MIT published a provocative research paper on “learned index structures”. The research is quite exciting: the results presented by the team include findings that could signal new competition for the most venerable stalwarts in the world of indexing, the B-Tree and the Hash Map. The engineering community is ever abuzz about the future of machine learning; as such, the research paper has made its rounds on Hacker News, Reddit, and through the halls of engineering communities worldwide.

In response to the findings of the Google/MIT collaboration, Peter Bailis and a team of Stanford researchers went back to the basics and warned us not to throw out our algorithms book just yet. Bailis and his team at Stanford recreated the learned index strategy and were able to achieve similar results without any machine learning by using a classic hash table strategy called Cuckoo Hashing. So what’s all the fuss about?

Are hash maps and B-Trees destined to become aging hall-of-famers? Are machines about to rewrite the algorithms textbook? What would it mean for the computing world if machine learning strategies really are better than the general-purpose indexes we know and love?

Under what conditions will the learned indexes outperform the old standbys?

Source: bradfieldcs.com

