An ML Showdown in Search of the Best Tool

Ever-burgeoning digital data combined with impressive research has led to rising interest in machine learning (ML), which in turn has powered a vibrant ecosystem of technologies, frameworks, and libraries in the space. Scikit-learn sees high adoption in the tech community, most probably because of its powerful Python interface, which allows models to be tuned across multiple parameters.
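To make the point about tuning models across multiple parameters concrete, here is a minimal sketch using scikit-learn's `GridSearchCV`. The dataset (the bundled iris data), the choice of `SVC` as the model, and the parameter grid are all assumptions picked for illustration; they do not come from the original comparison.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Small bundled dataset, used here only for illustration.
X, y = load_iris(return_X_y=True)

# Hypothetical grid: every combination of C and kernel is evaluated.
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# 5-fold cross-validation over all 3 x 2 = 6 parameter combinations.
search = GridSearchCV(SVC(), param_grid, cv=5)
search.fit(X, y)

best_params = search.best_params_  # combination with the best mean CV score
best_score = search.best_score_    # mean cross-validated accuracy of that combination
print(best_params, best_score)
```

The same pattern should carry over to other scikit-learn estimators, since they share the `fit` interface; only the estimator and the grid change.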

MLlib and H2O are worth considering when working with Spark. Spark ships with MLlib and has a higher-level wrapper called SparkML that supports the same functionality. In some cases, H2O offers a suitable solution and can be used alongside MLlib or added when needed.

Weka’s interface makes it the easiest to use, but it isn’t popular within the tech community. It is also worth noting that Weka offers additional data mining features. PyBrain’s development has been discontinued, apart from 10 small bug-fixing commits over the last two and a half years.

Source: thoughtworks.com