12 open source tools for natural language processing

  • April 8, 2019

It would be easy to argue that Natural Language Toolkit (NLTK) is the most full-featured tool of the ones I surveyed. It implements pretty much any component of NLP you would need, like classification, tokenization, stemming, tagging, parsing, and semantic reasoning. There is often more than one implementation of each, so you can choose the exact algorithm or methodology you’d like to use.

NLTK also supports many languages. However, it represents all data in the form of strings, which is fine for simple constructs but makes it hard to use some advanced functionality. The documentation is quite dense, but there is a lot of it, as well as a great book.
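As a rough illustration of both points, the sketch below runs three of NLTK’s stemmers over the same tokens and shows that the tokens and stems come back as plain Python strings. The sample sentence and the helper name compare_stemmers are assumptions made for this example, not part of the original survey; the tokenizer and stemmers were picked because they need no extra corpus downloads.

```python
from nltk.stem import LancasterStemmer, PorterStemmer, SnowballStemmer
from nltk.tokenize import wordpunct_tokenize


def compare_stemmers(text):
    """Tokenize text, then stem every token with three NLTK stemmers.

    compare_stemmers is a hypothetical helper written for this example.
    """
    # Regex-based tokenizer; returns a plain list of str.
    tokens = wordpunct_tokenize(text)

    # Three interchangeable implementations of the same NLP component.
    stemmers = {
        "porter": PorterStemmer(),
        "lancaster": LancasterStemmer(),
        "snowball": SnowballStemmer("english"),
    }
    stems = {
        name: [stemmer.stem(token) for token in tokens]
        for name, stemmer in stemmers.items()
    }
    return tokens, stems


if __name__ == "__main__":
    tokens, stems = compare_stemmers(
        "The runners were running quickly through the cities."
    )
    print(tokens)
    for name, result in stems.items():
        print(name, result)
```

Because everything is a string, any extra information about a token (its position, its tag, and so on) has to be carried alongside it in tuples or parallel lists, which is the limitation described above.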

The library is also a bit slow compared to other tools. Overall, this is a great toolkit for experimentation, exploration, and applications that need a particular combination of algorithms. spaCy is probably the main competitor to NLTK.

spaCy is faster than NLTK in most cases, but it only has a single implementation for each NLP component. It also represents everything as an object rather than a string, which simplifies the interface for building applications. This helps it integrate with many other frameworks and data science tools, so you can do more once you have a better understanding of your text data.
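To make the object-based design concrete, here is a minimal sketch that tokenizes a sentence with spaCy and reads attributes directly off each token. It assumes spaCy is installed and uses spacy.blank("en"), a tokenizer-only pipeline, so no trained model has to be downloaded; the helper name describe and the sample sentence are hypothetical and chosen only for illustration.

```python
import spacy
from spacy.tokens import Token

# Blank English pipeline: tokenizer only, no trained components.
nlp = spacy.blank("en")


def describe(doc):
    """Return one dict per token, built from the token object's attributes.

    describe is a hypothetical helper written for this example.
    """
    return [
        {
            "text": token.text,          # the token's string form
            "index": token.i,            # position of the token in the Doc
            "offset": token.idx,         # character offset in the original text
            "is_alpha": token.is_alpha,  # True if the token is alphabetic
            "is_punct": token.is_punct,  # True if the token is punctuation
        }
        for token in doc
    ]


if __name__ == "__main__":
    doc = nlp("Hello, world!")
    assert all(isinstance(token, Token) for token in doc)
    for row in describe(doc):
        print(row)
```

Each token is a Token object tied to its parent Doc, so positions and flags travel with it instead of being tracked in separate structures.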

However, spaCy doesn’t support as many languages as NLTK. It does have a simple interface with a simplified set of choices and great documentation, as well as multiple neural models for various components of language processing and analysis. Overall, this is a great tool for new applications that need to be performant in production and don’t require a specific algorithm.

Source: opensource.com
