Home >>DBMS Tutorial >DBMS Database Normalization

DBMS Database Normalization

DBMS Normalization

Functional Dependency

Functional dependency (FD) in a relation is a set of constraints between two attributes. Functional dependency means that if two tuples have the same values for the attributes A1, A2, ..., An, then they must have the same values for the attributes B1 , B2, ..., Bn.

An arrow sign (→) is defined by functional dependency, i.e., X-Y, where X functionally decides Y. Attributes on the left-hand side decide the attribute values on the right-hand side.

Armstrong's Axioms

If F is a set of functional dependencies, then the closure of F, referred to as F+, is the logically implied set of all functional dependencies. The Axioms of Armstrong are a set of rules that generate the closure of functional dependencies when implemented repeatedly.

  • Reflexive rule-If alpha is a set of attributes and beta is alpha's subset, then alpha is beta.
  • Augmentation rule −If a → b holds and y is set as an attribute, then ay → holds as well. That is, adding dependency attributes does not alter specific dependencies.
  • Transitivity rule −Same as algebra's transitive law, if a → b holds and b → c holds, then a c also holds. A → b is referred to as a functional determinant of b.

Trivial Functional Dependency

  • Trivial −If a functional dependency (FD) X → Y holds, where Y is a subset of X, then it is called a trivial FD. Trivial FDs always hold.
  • Non-trivial − If an FD X → Y holds, where Y is not a subset of X, then it is called a non-trivial FD.
  • Completely non-trivial − If an FD X → Y holds, where x intersect Y = Φ, it is said to be a completely non-trivial FD.

Normalization

It which contain anomalies, which are like a bad dream for any database administrator, if a database design is not perfect. It is next to impossible to handle a database of anomalies.

  • Update anomalies:: If data objects are scattered and are not properly linked to each other, it may lead to odd circumstances. For instance, if we try to update one data item with its copies scattered over many places, a few instances are updated correctly, while a few others are left with old values. These instances leave the database in a state that is inconsistent.
  • Deletion anomalies: We tried to delete a record, but due to unawareness, parts of it were left undeleted, the data is also saved somewhere else.
  • Insert anomalies: We have attempted to insert information into a record that does not exist at all.

Normalization is a method for removing all these anomalies and bringing a consistent condition to the database.

First Normal Form

In the description of relations (tables) itself, the First Normal Form is defined. This rule states that there must be atomic domains for all attributes in a relation. The values are indivisible units in the atomic domain.

DBMS image

We re-arrange the relation (table) as below, to convert it to First Normal Form.

DBMS image

Each attribute must contain, from its predefined domain, a single value only.

Second Normal Form

We need to consider the following before we hear about the second normal form;

Each non-prime attribute should be entirely functionally dependent on the prime key attribute if we adopt the second normal type. That is, if X→ A holds, then no proper subset Y of X should exist, for which Y → A also holds true.

  • Prime attribute-An attribute is known as a prime attribute, which is part of the candidate-key.
  • Non-prime attribute- An attribute is said to be a non-prime attribute that is not part of the prime key.

DBMS image

We see here that the prime key characteristics are Stu-ID and Proj-ID in the Student-Project relations. Non-key attributes, i.e., depending on the rule, Stu Name and Proj Name must be individually dependent on both and not on any of the prime key attributes. We find, however, that Stu-Name can be identified by Stu-ID and Proj-Name can be independently identified by Proj-ID. This is called partial dependency, which, in the Second Normal Form, is not allowed.

DBMS image

As shown in the above picture, we broke the link in two. So, no partial dependency exists.

Third Normal Form

In order for a relation to be in the Third Normal Form, it must be in the Second Normal Form and must fulfil the following

  • No non-prime attribute depends transitively on the attribute 's prime key.
  • X → A, then either − for every non-trivial functional dependency, X is either a superkey or,
  • The A attribute is prime.

DBMS image

We find that the key and only prime key attribute is Stu-ID in the above Student-Detail relation. We find that both Stu-ID and Zip itself can identify the city. Neither Zip is a superkey, nor a prime attribute is City. Additionally, Stu-ID → Zip → City, so transitive dependence exists.

We split the relation into two relation, as follows, to put this relation into a third normal form.

DBMS image

Boyce-Codd Normal Form

The Boyce-Codd Normal Form (BCNF) is a strict expansion of the Third Normal Form. BCNF declares that

X → A, X must be a super-key for every non-trivial functional dependency.

In the above image, in the Student Information relation, Stu ID is the super-key and Zip is the super-key in the ZipCodes relations. So,

Stu_ID → Stu_Name, Zip

and

Zip → City

Which confirms that both the relations are in BCNF.