Data is the lifeblood of modern business. It flows into and through every department, underpinning the decisions managers must take each day. Good decisions are derived from in-depth understanding; understanding is deduced from knowledge; knowledge is acquired through information; and accurate information is generated by high-quality data that is consistent, current and complete. It is therefore important that data be maintained in peak condition.
However, the large volume of data that flows into an organization every day means that inaccuracies are common. Much of this data is in disarray, which severely limits the utility that can be derived from it. This is where Data Quality comes into play. Data Quality is a measure of the accuracy, uniformity and uniqueness of the data present in a database. High-quality data should contain no abnormalities or deviations, and it must be unique, meaning that no item is duplicated. Data that meets these criteria is much more useful.
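As a rough illustration of how uniformity and uniqueness might be measured, the following minimal sketch scores a handful of records. The records, the field names and the two helper functions (`uniqueness`, `uniformity`) are hypothetical and chosen only for this example. Accuracy is left out because it generally requires a trusted reference to compare against.

```python
import re

# Hypothetical customer records; field names and values are made up.
records = [
    {"id": 1, "email": "ann@example.com", "joined": "2021-03-04"},
    {"id": 2, "email": "bob@example.com", "joined": "04/03/2021"},  # non-uniform date
    {"id": 3, "email": "ann@example.com", "joined": "2021-03-04"},  # duplicate email
    {"id": 4, "email": "cy@example.com", "joined": "2022-11-30"},
]

# Assumed target format for dates: YYYY-MM-DD.
ISO_DATE = re.compile(r"\d{4}-\d{2}-\d{2}")

def uniqueness(rows, key):
    """Ratio of distinct `key` values to total rows (1.0 means no duplicates)."""
    values = [row[key] for row in rows]
    return len(set(values)) / len(values)

def uniformity(rows, key, pattern):
    """Share of rows whose `key` value fully matches the expected format."""
    matches = sum(1 for row in rows if pattern.fullmatch(row[key]))
    return matches / len(rows)

email_uniqueness = uniqueness(records, "email")
date_uniformity = uniformity(records, "joined", ISO_DATE)
print(f"email uniqueness: {email_uniqueness:.2f}")
print(f"date uniformity:  {date_uniformity:.2f}")
```

Scores below 1.0 point to duplicates or inconsistently formatted values that would need cleaning before the data can be relied upon.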
As data flows through several channels, such as the different departments of a company, it is often difficult to obtain an overview of the current performance of the company as a whole. A manager may need to consult the individual database of each department before making a business decision. This approach is both time-consuming and resource-intensive. The solution to this problem lies in Data Integration techniques.
Data Integration deals with the unification of data from several sources. It helps keep abnormalities to a bare minimum and, at the same time, provides a consolidated view of the company regardless of the source of the data. Decisions regarding more than one department, or the company as a whole, can then be taken much more easily and accurately.
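The idea of a consolidated view can be sketched in a few lines. The example below assumes two hypothetical departmental sources, `sales` and `support`, which name the customer identifier differently. The `integrate` function and all figures are illustrative only, not a prescribed integration method.

```python
# Hypothetical per-department data; names and figures are illustrative only.
sales = [
    {"customer_id": 1, "revenue": 1200.0},
    {"customer_id": 2, "revenue": 450.0},
]
support = [
    {"cust": 1, "open_tickets": 3},
    {"cust": 3, "open_tickets": 1},
]

def integrate(sales_rows, support_rows):
    """Merge both sources into one record per customer id."""
    view = {}
    for row in sales_rows:
        # Map the sales department's key name onto the shared customer id.
        entry = view.setdefault(row["customer_id"], {"revenue": 0.0, "open_tickets": 0})
        entry["revenue"] += row["revenue"]
    for row in support_rows:
        # The support department calls the same identifier "cust".
        entry = view.setdefault(row["cust"], {"revenue": 0.0, "open_tickets": 0})
        entry["open_tickets"] += row["open_tickets"]
    return view

consolidated = integrate(sales, support)
for customer, summary in sorted(consolidated.items()):
    print(customer, summary)
```

Each customer appears exactly once in the result, with defaults filled in where a department holds no record, so a manager can read one view instead of querying each department separately.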