data hygiene

(redirected from clean data)

The condition of data in a database. Clean data are error-free or have very few errors. Dirty data contain errors, such as misspelled or mispunctuated names and addresses, redundant data repeated across several records, or simply erroneous values (incorrect amounts, names, etc.). See address cleansing.
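As a rough illustration of the spelling, punctuation and redundancy problems described above, the following Python sketch flags records that become identical once case, punctuation and spacing are normalized. The function names (normalize, find_redundant), the field names and the sample records are hypothetical and chosen only for this example; real address cleansing involves far more than this.

```python
import re
from collections import defaultdict

def normalize(text):
    """Lowercase, drop punctuation, and collapse runs of whitespace."""
    text = re.sub(r"[^\w\s]", "", text.lower())
    return " ".join(text.split())

def find_redundant(records, fields=("name", "address")):
    """Return groups of record indexes that match after normalization."""
    groups = defaultdict(list)
    for index, record in enumerate(records):
        key = tuple(normalize(record[field]) for field in fields)
        groups[key].append(index)
    # Only keys shared by more than one record indicate redundancy.
    return [indexes for indexes in groups.values() if len(indexes) > 1]

# Hypothetical sample data: the first two rows differ only in
# capitalization, punctuation and spacing.
records = [
    {"name": "Ann O'Neil", "address": "12 Main St."},
    {"name": "ann oneil", "address": "12  Main St"},
    {"name": "Bob Ray", "address": "4 Elm Rd."},
]

duplicates = find_redundant(records)
```

Here duplicates groups the first two records together. A check like this catches only surface-level variation; it would not detect a wrong amount or a name that is misspelled rather than differently punctuated.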
References in periodicals archive
Clean data is better data, which leads to better output.
When it comes to data, the biggest challenge we see is the availability of data and more specifically "clean data".
The majority of our growing crop of deep and machine learning businesses focus on finding rapidly deployable solutions, where they can tap into more readily available clean data lakes.
We recently announced a new Clean Data initiative that will reinforce the Real Estate Board of New
Data scientists evaluate and clean data, develop programme algorithms to automate data, or design rigorous data experiments.
"As electronic health records have flourished and the need for accurate, clean data has grown, organizations have started requiring that workers also know their way around databases," said Kim Murawski, senior system director of health information management for Harrisburg-based UPMC Pinnacle.
"We've shortened that down to about 20 minutes and get very clean data." The design of the sensors also takes the need of children into account.
With clear digital workflow and clean data, limitless possibilities can be opened with the current computer science technologies thanks to tools such as Artificial Intelligence, the Internet of Things, and 3D Printing."
Clean data means that your systems will run much more efficiently, and you'll also see a better ROI for every sales and marketing campaign you run.
Assessment is the gathering of clean data with which to make an informed decision.
Clean data -- which has been tagged or classified -- has more than doubled from 8 percent in 2016 to 19 percent, leading to a significant reduction in redundant, obsolete or trivial (ROT) data, which fell from 43 percent to 33 percent in recent years.
volume of clean data doubling from 2016 to make up 19% of data today, while ROT