Normalization is the process of organizing data in a database efficiently. The process itself is not complicated, but it can drastically improve the performance of a DBMS, especially when the initial design is poor. Normalization has two main goals: (1) to eliminate redundant data, such as the same data being stored in more than one place, and (2) to ensure that data dependencies make sense, i.e., that only related data is stored in a given table. Both goals are worth pursuing, as they reduce the amount of storage space used and preserve the logical consistency of the stored data. Logical consistency means avoiding anomalies such as insertion anomalies, update anomalies, and deletion anomalies.
Figure 1 (adapted from Aslam Memon's CS232 Database Lecture). Normalization Process
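To make the anomalies concrete, the following minimal SQL sketch (the table and column names are hypothetical, not taken from the text) shows a denormalized table and the anomalies it invites:

    -- Hypothetical denormalized table: the customer's city is repeated
    -- on every order row for that customer (redundant data).
    CREATE TABLE orders (
        order_id      INT PRIMARY KEY,
        customer_name VARCHAR(100),
        customer_city VARCHAR(100),  -- duplicated for every order by the same customer
        product       VARCHAR(100)
    );

    -- Update anomaly: changing one customer's city means updating every
    -- matching row; missing one leaves the data logically inconsistent.
    UPDATE orders SET customer_city = 'Boston' WHERE customer_name = 'Alice';

    -- Deletion anomaly: deleting a customer's last order also erases the
    -- only record of that customer's city.
    -- Insertion anomaly: a new customer cannot be recorded at all until
    -- they place an order.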
NORMAL FORMS
The database community has developed a series of guidelines, known as normal forms, that should be followed in database design. It is important to stress that these forms are guidelines and nothing more: there are cases where business rules or performance concerns justify a design that does not conform to the normal forms but yields better overall performance.
. First normal form (1NF): all multi-valued attributes (repeating groups) have been removed, so that there is a single value, which may be null, at the intersection of each row and column of a table. Celko [1] observed that the structured query language SQL requires a table to be in first normal form. The exceptions are vendor-specific extensions, such as the array data type, which are not compliant with the standard.
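As a minimal sketch of the 1NF rule (the table and column names are assumptions for illustration), a multi-valued phone attribute can be split out into its own table so that each row/column intersection holds exactly one value:

    -- Not in 1NF: "phones" packs several values into one column,
    -- e.g. a comma-separated list (a repeating group).
    -- CREATE TABLE employee (emp_id INT PRIMARY KEY, name VARCHAR(100), phones VARCHAR(200));

    -- In 1NF: the repeating group moves to a separate table, one phone per row.
    CREATE TABLE employee (
        emp_id INT PRIMARY KEY,
        name   VARCHAR(100)
    );

    CREATE TABLE employee_phone (
        emp_id INT REFERENCES employee(emp_id),
        phone  VARCHAR(30),
        PRIMARY KEY (emp_id, phone)
    );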