What are some open datasets for machine learning? We at Gengo decided to create the ultimate cheat sheet for high-quality datasets. These range from the vast (looking at you, Kaggle) to the highly specific (data for self-driving cars).

First, a couple of pointers to keep in mind when searching for datasets. According to Dataquest:

A dataset shouldn't be messy, because you don't want to spend a lot of time cleaning data.
A dataset shouldn't have too many rows or columns, so it's easy to work with.
The cleaner the data, the better; cleaning a large dataset can be very time consuming.
There should be an interesting question that can be answered with the data.
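As a rough illustration of the pointers above, here is a minimal sketch for sizing up a CSV file before committing to it. It uses only the Python standard library. The function name `summarize_csv`, the list of tokens treated as missing, and the toy sample data are all assumptions made for this example; they do not come from Dataquest or from any particular dataset.

```python
import csv
import io


def summarize_csv(text, missing_tokens=("", "NA", "N/A", "null")):
    """Return row count, column count, and the fraction of missing cells.

    `missing_tokens` is an assumed convention; real datasets mark
    missing values in many different ways.
    """
    reader = csv.reader(io.StringIO(text))
    header = next(reader)
    rows = [row for row in reader if row]  # skip completely blank lines

    total_cells = len(rows) * len(header)
    missing = 0
    for row in rows:
        # Pad short rows so absent trailing cells count as missing.
        padded = row + [""] * (len(header) - len(row))
        missing += sum(
            1 for cell in padded[: len(header)] if cell.strip() in missing_tokens
        )

    return {
        "rows": len(rows),
        "columns": len(header),
        "missing_fraction": missing / total_cells if total_cells else 0.0,
    }


# Toy data invented for this example: two of the nine cells are missing.
sample = "name,age,score\na,30,0.5\nb,,0.7\nc,25,NA\n"
summary = summarize_csv(sample)
print(summary)
```

A small row and column count and a low missing fraction suggest the dataset will be easy to work with. Whether it can answer an interesting question is still a judgment call that no script can make for you.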