Well, I can keep on dreaming all I want, but the harsh reality is that data comes in as many variations as there are clients.
This is not surprising: data has its own personality, driven primarily by the characteristics of the company and the people managing it. (See a previous post on Data Personality for an explanation of why.)
If your passion is to create really great data for your clients, you must take the uniqueness of the data you’re looking at into account, then decide how best to transform and clean it to meet your objective.
But the majority of companies use automated data cleansing services, either through web services or by purchasing software.
So how does that work if every company’s data is unique? Well, it doesn’t really work that well. It’s the 80/20 rule: the automatic routines are “good enough” for 80% of people, and the results are 80% good.
Automatic data cleansing gives us OK results, and because it’s easy to use, we accept them.
If you want better than just OK results, then you need to look closely at the data.
We have performed many tests comparing results from the automatic routines of the bigger players in the data market with results from our own methodology: we submitted the original data, then submitted a file prepared using our techniques.
At first we didn’t expect much of a difference, but to our surprise the results were at least 20% better.
So why don’t automatic routines give you the same result? After all, they are supposed to clean data across all its variations to the best of their ability. There are many reasons; the key ones are:
- It’s not possible to cater for every variation; there are simply too many.
- The automatic routines tend to be conservative when cleansing, as they don’t want to change data incorrectly, propose invalid duplicates or append the wrong information. (Customers of automated cleansing solutions often take the results as the best possible, but closer inspection will always reveal an error rate; see the sketch after this list.)
- For those creating automatic cleansing services, it’s diminishing returns to solve the last 20%: 80% of the problem is solved, and the remaining 20% is too complex to resolve, that is, not cost effective to write or maintain a solution for.
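To make the conservatism point concrete, here is a purely hypothetical sketch: the company names, the synonym table and the 0.85 threshold are all invented for illustration, not taken from any client. A generic matcher tuned to avoid false merges walks straight past a duplicate that one client-specific rule would have caught.

```python
# Hypothetical illustration only: the names, the synonym table and the
# threshold below are invented. A generic matcher tuned to avoid false
# merges misses a duplicate that a client-specific rule catches easily.
from difflib import SequenceMatcher

def generic_match(a: str, b: str, threshold: float = 0.85) -> bool:
    """Conservative fuzzy match: only merges near-identical strings."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold

# Abbreviations this particular client habitually uses (learned by inspection).
CLIENT_SYNONYMS = {"intl": "international", "mfg": "manufacturing", "ltd": "limited"}

def client_aware_normalise(name: str) -> str:
    """Expand the client's known abbreviations before comparing names."""
    words = name.lower().replace(".", " ").replace(",", " ").split()
    return " ".join(CLIENT_SYNONYMS.get(w, w) for w in words)

a, b = "Acme Intl Mfg Ltd", "Acme International Manufacturing Limited"

print(generic_match(a, b))  # False: the generic routine plays it safe
print(generic_match(client_aware_normalise(a), client_aware_normalise(b)))  # True: caught after client-specific normalisation
```

The synonym rule itself is trivial; the point is that it only exists because someone looked at that client’s data first.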
The end result of this automation is that many accept 80%, that is, OK or ‘good enough’ results.
At Acuate we’ve decided NOT to accept 80%: we analyse the data for its uniqueness and fix it to a higher standard.
For many companies the convenience of automated routines is good enough, but many do not know what can be achieved.
So how do we achieve 20% more? (a) By thoroughly analysing the data for patterns, (b) transforming it into a format that an automatic system can work with reliably so it delivers its best results, (c) removing the errors the automatic process introduces and, lastly, (d) further improving results through post-pattern analysis.
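Purely as a sketch of how those four steps fit together (every function, rule and the fake_service stub below are invented placeholders, not our actual tooling or any vendor’s API), the flow looks something like this:

```python
# Hypothetical sketch of the four-step flow. profile_patterns, apply_rules,
# fix_known_errors and fake_service are all invented stand-ins: a real project
# substitutes its own analysis rules and whichever cleansing service it uses.
from collections import Counter
from typing import Callable

Record = dict[str, str]

# Abbreviations we might learn to expand by inspecting one client's data.
KNOWN_EXPANSIONS = {"st": "street", "rd": "road", "ltd": "limited"}

def profile_patterns(records: list[Record]) -> dict[str, str]:
    """(a) Analyse the data: keep only the abbreviations this client actually uses."""
    counts = Counter(w for r in records for w in r.get("address", "").lower().split())
    return {abbr: full for abbr, full in KNOWN_EXPANSIONS.items() if counts[abbr] >= 2}

def apply_rules(record: Record, rules: dict[str, str]) -> Record:
    """(b) Transform into a form the automated routine handles reliably."""
    words = record.get("address", "").lower().split()
    return {**record, "address": " ".join(rules.get(w, w) for w in words)}

def fix_known_errors(cleansed: Record, original: Record) -> Record:
    """(c) Undo a known failure mode: the service blanking addresses it can't verify."""
    if not cleansed.get("address") and original.get("address"):
        return {**cleansed, "address": original["address"]}
    return cleansed

def enhanced_cleanse(records: list[Record],
                     automated_cleanse: Callable[[list[Record]], list[Record]]) -> list[Record]:
    rules = profile_patterns(records)                                         # (a)
    prepared = [apply_rules(r, rules) for r in records]                       # (b)
    cleansed = automated_cleanse(prepared)                                    # automated step
    corrected = [fix_known_errors(c, o) for c, o in zip(cleansed, prepared)]  # (c)
    post_rules = profile_patterns(corrected)                                  # (d) re-profile the output
    return [apply_rules(r, post_rules) for r in corrected]                    # (d) second pass

# Stand-in for a third-party service: it blanks any address it can't parse.
def fake_service(recs: list[Record]) -> list[Record]:
    return [{**r, "address": "" if any(w in KNOWN_EXPANSIONS for w in r["address"].split())
             else r["address"]} for r in recs]

data = [{"name": "A Smith", "address": "12 high st"},
        {"name": "B Jones", "address": "9 mill rd"},
        {"name": "C Brown", "address": "4 king st"}]
print(enhanced_cleanse(data, fake_service))
```

The detail of each step changes from client to client; the structure (profile, prepare, run the automated routine, correct, re-profile) is the constant.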
Missing out on a further 20% improvement can make a big difference to:
- Leads Generated
- Orders Taken
- Mail Returned
- Email Conversion Rates
- Confidence of CRM Users
- And so on…
Watch our Video on How Poor Data Impacts Results (20% is a big difference).
Summary
If you want great results, the best your data can be, with few or no errors, no duplicates, and data appended to as many records as possible, then treat the data as unique and don’t rely solely on any automated system to give you the answer.
If you would like to see how your data can be improved further, or if you’re unsatisfied with your current level of data quality, then contact us on +44 (0) 844 800 8837 or fill in your details here: No Obligation Data Analysis.