Duplicate detection consists in identifying multiple representations of a real-world object. Fast and correct duplicate detection on large amounts of data is essential in many applications, and allows avoiding errors such as sending multiple copies of a catalog to the same customer or making critica...