Why we switched from Spacy to Flair to anonymize French legal cases

  • November 9, 2020
This article describes work we did in collaboration with the French administration (DINSIC) and a French supreme court (Cour de cassation) on two well-known Named Entity Recognition (NER) libraries, Spacy and Zalando Flair. Spacy's accuracy was too limited for our needs, and Flair was too slow. In the end, we optimized Flair to the point where inference time was divided by 10, making it fast enough to anonymize a large inventory of French case law.

The major ideas behind our approach are described below.

Source: towardsdatascience.com


Related Posts

AI Blueprints: Implementing content-based recommendations using Python

In this article, we’ll look at how you can implement a content-based recommendation system using Python and the scikit-learn library. Before diving in, it’s important to understand the different ways in which recommendation systems can recommend an item to users. Content-based: A content-based recommendation finds items similar to a given item by examining the item’s properties, such as its title or description, category, or dependencies on other items (for example, electronic toys require batteries).

12 open source tools for natural language processing

It would be easy to argue that Natural Language Toolkit (NLTK) is the most full-featured tool of the ones I surveyed. It implements pretty much any component of NLP you would need, like classification, tokenization, stemming, tagging, parsing, and semantic reasoning. And there’s often more than one implementation for each, so you can choose the exact algorithm or methodology you’d like to use.