"Wouldn't it be nice if Google could tell the difference between my brother and that New York restaurant reviewer with the same name...?"
Customer data is replete with items that carry the same meaning but have different textual representations. Examples include foreign, abbreviated or incorrect spellings of the names of people and companies. Conversely, customer data also contains items that look similar as text but mean something completely different.
In addition, information ages quickly. What is correct today may be wrong tomorrow. Company and executive contact information decays at an average rate of 25% per year and thus requires regular updates if it is to remain current and accurate.
Powered by a combination of text mining, semantics, and statistics derived from our database of hundreds of millions of company and executive records, our Matching Engine can deliver higher match rates with better accuracy than alternative engines based simply on mathematical matching algorithms.
Digital Trowel’s Matching Engine provides the following features:
The refinement process includes twenty steps designed to progressively clean and standardize the raw data. Underlying these steps are state-of-the-art techniques to validate and standardize terms, enabling us to accurately identify companies and the people who work in them and to provide their names, addresses, titles, education, employment history and much more.
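As a rough illustration of what progressive cleaning can look like, here is a minimal Python sketch that chains three small standardization steps. It is not Digital Trowel's actual twenty-step process: the step functions, the `LEGAL_SUFFIXES` table and the `clean` helper are all hypothetical names chosen for this example.

```python
import re
import unicodedata

# Hypothetical suffix table; a real standardization table would be far larger.
LEGAL_SUFFIXES = {
    "inc": "Inc.", "incorporated": "Inc.",
    "corp": "Corp.", "corporation": "Corp.",
    "ltd": "Ltd.", "limited": "Ltd.",
}

def strip_accents(text: str) -> str:
    """Replace accented characters with their unaccented base letters."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def collapse_whitespace(text: str) -> str:
    """Trim the ends and reduce runs of whitespace to a single space."""
    return re.sub(r"\s+", " ", text).strip()

def standardize_suffix(text: str) -> str:
    """Rewrite a trailing legal suffix (e.g. 'Corporation') in one standard form."""
    words = text.split(" ")
    key = words[-1].lower().rstrip(".,")
    if key in LEGAL_SUFFIXES:
        words[-1] = LEGAL_SUFFIXES[key]
    return " ".join(words)

# Steps run in order, each one working on the output of the previous step.
STEPS = [strip_accents, collapse_whitespace, standardize_suffix]

def clean(raw: str) -> str:
    for step in STEPS:
        raw = step(raw)
    return raw
```

Calling `clean("  Société   Générale  Corporation ")` passes the raw string through each step in turn, so later steps can rely on the earlier ones having already removed accents and stray whitespace.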
Using a combination of proprietary similarity algorithms and matching rules, we are able to accurately group disparate data regarding the same entity and to eliminate duplication. As part of the process, we use vast nickname, alias, and synonym/antonym tables, term frequencies, partial geo-locations, and other statistics to enable us to standardize biographical references, corporate data and other specific nomenclature. In addition, when possible, we validate the data against standard and/or proprietary sources such as trademark databases. This thorough process enables us to achieve high accuracy rates and deliver complete company and executive data with minimal duplications.
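The general idea of combining a nickname table, a similarity measure and a matching rule can be sketched in a few lines of Python. This is an illustrative sketch only: the proprietary algorithms are not described here, so it substitutes the standard library's `difflib.SequenceMatcher`, and the `NICKNAMES` table, the 0.85 threshold and all function names are assumptions made for the example.

```python
from difflib import SequenceMatcher

# Hypothetical, tiny nickname table; a production table would be far larger.
NICKNAMES = {"bob": "robert", "rob": "robert", "bill": "william", "liz": "elizabeth"}

def normalize(text: str) -> str:
    """Lowercase the text and strip trailing punctuation from each token."""
    tokens = [t.strip(".,") for t in text.lower().split()]
    return " ".join(t for t in tokens if t)

def canonical_name(name: str) -> str:
    """Normalize a person's name and map known nicknames to a canonical form."""
    return " ".join(NICKNAMES.get(t, t) for t in normalize(name).split())

def similarity(a: str, b: str) -> float:
    """Similarity ratio in [0, 1] between two already-normalized strings."""
    return SequenceMatcher(None, a, b).ratio()

def is_match(rec_a: dict, rec_b: dict, threshold: float = 0.85) -> bool:
    """Illustrative rule: the person AND the company must both be similar."""
    name_score = similarity(canonical_name(rec_a["name"]), canonical_name(rec_b["name"]))
    company_score = similarity(normalize(rec_a["company"]), normalize(rec_b["company"]))
    return name_score >= threshold and company_score >= threshold

def group_records(records: list, threshold: float = 0.85) -> list:
    """Greedily place each record in the first group whose first record matches."""
    groups = []
    for rec in records:
        for group in groups:
            if is_match(rec, group[0], threshold):
                group.append(rec)
                break
        else:
            groups.append([rec])
    return groups
```

With this rule, "Bob Smith" and "Robert Smith" at the same company fall into one group, while a "Robert Smith" at an unrelated company stays separate, which is the kind of distinction the opening quote asks for. Requiring agreement on more than one field is a design choice that trades some recall for fewer false merges.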
The Matching Engine is performance-optimized, supporting very large databases (500+ million records), and can process billions of comparisons per day, all on commodity server hardware. The matching rules are fully customizable and the workload can be distributed across multiple machines.
The Matching Engine can attach a time factor to collected data, which improves the accuracy and relevance of its intelligence. By linking each piece of data to its original source, we can also rank sources by quality and write advanced merging rules that account for differences in the quality of online data.
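One simple way to picture a merging rule that weighs both time and source quality is to score each observation of a field and keep the highest-scoring value. The Python sketch below does this under stated assumptions: the `SOURCE_QUALITY` scores, the field layout and the function names are hypothetical, and the recency weight simply reuses the 25% average yearly decay figure quoted earlier as if it compounded each year. It is not the engine's actual rule set.

```python
from datetime import date

# Hypothetical quality scores in [0, 1]; real rankings would be derived from data.
SOURCE_QUALITY = {"company_website": 0.9, "press_release": 0.7, "web_forum": 0.3}

# Assumption: 25% of records go stale per year, so 75% remain valid each year.
YEARLY_SURVIVAL = 0.75

def score(observation: dict, today: date) -> float:
    """Source quality discounted by how long ago the value was observed."""
    age_years = (today - observation["seen"]).days / 365.0
    recency = YEARLY_SURVIVAL ** age_years
    quality = SOURCE_QUALITY.get(observation["source"], 0.1)  # unknown sources score low
    return quality * recency

def merge_field(observations: list, today: date) -> dict:
    """Pick the observation with the best combined quality and recency score."""
    return max(observations, key=lambda obs: score(obs, today))

# Usage: two conflicting job titles for the same executive.
today = date(2024, 1, 1)
title_observations = [
    {"value": "CFO", "source": "company_website", "seen": date(2023, 1, 1)},
    {"value": "VP Finance", "source": "web_forum", "seen": date(2023, 12, 1)},
]
best_title = merge_field(title_observations, today)
```

Because each value keeps a link to its source and observation date, a fresher value from a weak source can still lose to an older value from a strong source, and the balance shifts as the older value continues to age.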
“Digital Trowel is an early leader in ‘Semantic Master Data Management’ technologies and has already engaged industry powerhouses such as Acxiom, D&B, and Hoovers to market such technologies on the massive scale required.”
-- Aaron Zornes, Chief Research Officer at the MDM Institute and Conference Chairman at the MDM Summit