data hygiene

(redirected from Dirty data)

The condition of data in a database. Clean data are error-free or have very few errors. Dirty data have errors, including misspelled or mispunctuated names and addresses, redundant data in several records, or simply erroneous data (incorrect amounts, names, etc.). See address cleansing.
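As an illustration of the "redundant data in several records" case, the following is a minimal sketch in Python. It assumes records are dictionaries with hypothetical `id`, `name`, and `address` fields, and it treats two records as redundant when their name and address match after lowercasing and stripping punctuation and extra spaces. Real address cleansing is considerably more involved; this only shows the general idea.

```python
import re
from collections import defaultdict

def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

def find_redundant(records):
    """Return lists of record ids whose normalized name and address match."""
    groups = defaultdict(list)
    for rec in records:
        key = (normalize(rec["name"]), normalize(rec["address"]))
        groups[key].append(rec["id"])
    # Only groups with more than one id represent redundant records.
    return [ids for ids in groups.values() if len(ids) > 1]

# Hypothetical sample data: records 1 and 2 differ only in case,
# spacing, and punctuation.
records = [
    {"id": 1, "name": "John Smith", "address": "12 Main St."},
    {"id": 2, "name": "john  smith", "address": "12 Main St"},
    {"id": 3, "name": "Jane Doe", "address": "4 Elm Rd."},
]
duplicates = find_redundant(records)
```

Exact matching on a normalized key is the simplest possible choice; it will not catch misspellings such as "Jon Smith", which would need some form of fuzzy matching.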
References in periodicals archive
Then, the data is cleansed, with the dirty data and cleansed data coexisting in the result set.
While the cleaning strategy research is about different cleaning methods based on different RFID physical dirty data types, in the meantime, it should aim to improve the efficiency of RFID data cleaning, with the premise of guaranteeing the cleaning effect.
Adoption of Hadoop is gathering pace, but companies are struggling to find personnel with the proper skillset to maximize its value and clean up dirty data, said Krishna Roy, Analyst at 451 Research.
Assess data conditions in the context of business processes to determine the size of the issue in terms of dirty data and its impact at each decision point or step in a business process.
"Take your dirty data, take whatever data you have," he suggested.
The Coalition signaled to the Task Force the dirty data issue could become a vector of litigation, appeal, and petitions for reconsideration related to the auction.
Dirty data centres are posing a risk to efficient operations and reducing the life of hardware, according to the Data Centre Alliance (DCA).
"Realistically, however, there is no way to completely avoid the problem of billing errors on the part of the utilities that can result in dirty data," says Lansdale, who already provides required consumption information on AvalonBay's New York City and Seattle communities.
Data quality is an area fraught with tough challenges - for instance, the actual damage of dirty data isn't always clearly visible.
Greenpeace said that along with Google and Facebook, Apple makes up part of 'North Carolina's dirty data triangle'.
If you think of it like a bell curve, if there is dirty data on the right of the bell curve, statistically speaking there would be another dirty data point to the left of the bell curve.
If the data entry program is not intelligent enough to trap these types of errors, the systems will insert dirty data into production data stores.
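The excerpt above does not say which errors it has in mind, so the following is only a sketch of what "trapping" errors at data entry could look like. The field names (`name`, `zip`, `amount`) and the rules applied to them are assumptions chosen for illustration, not part of any particular system.

```python
import math
import re

def validate_entry(entry):
    """Return a list of problems found in one data-entry record.

    An empty list means the record passed every check and may be stored.
    """
    errors = []

    # Hypothetical rule: a name must contain at least one non-space character.
    if not entry.get("name", "").strip():
        errors.append("name is required")

    # Hypothetical rule: a ZIP code is exactly five digits.
    if not re.fullmatch(r"\d{5}", entry.get("zip", "")):
        errors.append("zip must be five digits")

    # Hypothetical rule: an amount is a finite, non-negative number.
    try:
        amount = float(entry.get("amount", ""))
    except ValueError:
        errors.append("amount must be a number")
    else:
        if not math.isfinite(amount) or amount < 0:
            errors.append("amount must be a finite, non-negative number")

    return errors

good = validate_entry({"name": "Jane Doe", "zip": "10001", "amount": "19.99"})
bad = validate_entry({"name": " ", "zip": "1000A", "amount": "-5"})
```

A record is written to the production store only when the returned list is empty; otherwise the messages can be shown to the person entering the data.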