Entity Resolution

Entity resolution (ER) is the task of disambiguating records that correspond to real-world entities, both across and within datasets. It has wide-ranging applications, particularly for public sector and federal datasets related to health, transportation, finance, law enforcement, and antiterrorism.

In layman's terms, a company may record the same client or user multiple times in its systems. Entity resolution aims to detect those repeated entities and remove the duplicates so that later analysis works on clean data.
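To make the idea concrete, here is a minimal sketch of duplicate detection using only the Python standard library. It is not the method used in this project: it swaps in plain string similarity (`difflib.SequenceMatcher`) for a trained model, and the field names (`name`, `email`), the sample records, and the 0.85 threshold are all assumptions chosen for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(record):
    # Hypothetical fields: join name and email into one lowercase string.
    return " ".join(str(record[f]).lower().strip() for f in ("name", "email"))


def similarity(a, b):
    # Ratio in [0, 1]; 1.0 means the normalized strings are identical.
    return SequenceMatcher(None, normalize(a), normalize(b)).ratio()


def resolve(records, threshold=0.85):
    """Group record indices whose pairwise similarity meets the threshold."""
    parent = list(range(len(records)))

    def find(i):
        # Union-find lookup with path halving.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    # Compare every pair; fine for small inputs, quadratic for large ones.
    for i, j in combinations(range(len(records)), 2):
        if similarity(records[i], records[j]) >= threshold:
            parent[find(j)] = find(i)

    clusters = {}
    for i in range(len(records)):
        clusters.setdefault(find(i), []).append(i)
    return sorted(clusters.values())


# Illustrative records only; not data from the project.
records = [
    {"name": "John Smith", "email": "john.smith@example.com"},
    {"name": "Jon Smith", "email": "john.smith@example.com"},
    {"name": "Alice Brown", "email": "alice.brown@example.org"},
]
clusters = resolve(records)  # the first two records land in one cluster
```

Comparing every pair of records grows quadratically with the dataset size, which is one reason practical tools reduce the number of comparisons and learn their matching rules from labeled examples rather than relying on a fixed threshold.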

Project Details

Initially, we tried the open-source JedAI software, but its compute time was too high. After some research, I found a good open-source library called dedupe: https://github.com/dedupeio/dedupe. The library uses machine learning methods and trains a model on a labeled training set. Analysis of the output showed that this method performed well, and it was considered one solution to the problem.

  • Date

    30 Jun, 2021
  • Categories

    Machine Learning
  • Client

    Dave Imrem