"Hi, I'm SuperNorm, here to guide you through the Normalisation Process"


Relaxation Corner | Site Map | FAQ's

 
Home  >  Theory  >  Database Concepts  >  Database Types  >

Search This Site

| Home |  Database Concepts  Database Types | 1 NF | 2 NF | 3 NF 4-7 NF |

Theory

Exercises

Web Links

Downloads

Discussion

Contact Us

Glossary

Now we move on to the main purpose of this web site: the Normalisation Process

Allow Norm to guide you through a case study of a typical care hire business, following the methods and rules of Normalisation, to transform a 'flat file' into a linked data structure that can be implemented in a relational database.

Scenario: Norm's Car Rental Co.

Norm's car hire company has two main departments - one dealing with car maintenance and the other dealing with the actual car rentals. Initially we are going to focus on the relatively simple rental department.

(See 1NF page for the complete Normalisation of the much more complex Maintenance dept.)

From data analysis, the following list of RENTAL data has been generated:

RENTALS (
  HIRENO* [the unique key used to access this particular rental in the database]
  RENTAL COST

[the total cost for this rental]

  HIRE DATE [the date the car was rented]
  RETURN DATE [the date the car was returned]
  REGNO [the car registration number]
  MILES OUT [the car mileage on rental]
  MILES IN [the car mileage on return]
  EXCESS MILEAGE [the number of excess miles, if any]
  MILEAGE ALLOWANCE [the miles per day allowance, before excess charges apply]
  CUSTOMER NAME [the customer name]
  CUSTOMER ADDRESS [the customer's address]

 

CUSTOMER D-O-B ) [the customer's date of birth]
  
The Normal Forms:

Normal Forms are the series of logical steps applied to database data and tables.

There are three main levels of "Normal Form":

First Normal Form: 1NF
Second Normal Form: 2NF
Third Normal Form: 3NF

1st, 2nd & 3rd Normal Forms are sufficient for virtually all data, but there are more - 4th, 5th, 6th & 7th Normal Forms are complex, rarely used or never needed & beyond the scope of this demonstration.

First Normal Form (1NF):

The aim of First Normal Form is to remove repeating groups of data.
 
Repeating groups can be defined as: within a database table there may be many occurrences of this data item.
i.e. a customer has only one name, this is obviously not a repeating group. But, a car can have many rentals, all of which need to be stored with the car data (its rental history). The purpose of removing repeating groups is to eliminate the storing of the same data many times, thus reducing data duplication and increasing database efficiency.

Luckily, as this is a simple data set, there are no repeating groups, so the RENTALS data is already in 1NF.

Second Normal Form (2NF):

The aim of Second Normal Form is to remove non-key attributes that only depend on part of the key, to form a new table.

Non-key attributes are data items that do not form part or all of a databse table access key. For an attribute to depend on part of a key, the key must be a multi-part key.

As there is only one key (HIRENO) the data is already in 2NF.

 

Rental Dept Data Items Image

So far the data from the analysis stage, has ‘naturally’ complied with the first two Normal Form rules.

Third Normal Form (3NF):
The aim of Third Normal Form is to remove non-key attributes that are dependant on other non-key attributes.

3NF Rule 1: If the Data is in 2NF, then remove NON-KEY attributes from the master table, that are dependant on other NON-KEY attributes.

So which attributes, from RENTALS are dependant on other attributes? 

Clearly, CUSTOMER ADDRESS and CUSTOMER D-O-B are dependent on the CUSTOMER NAME; without knowing the CUSTOMER NAME, Norm cannot find the CUSTOMER ADDRESS or the CUSTOMER D-O-B in his records. In order to hire a car, customers must give their home address and be old enough to drive.

So if Norm uses CUSTOMER NAME as an index to his records, he can then find any address or d-o-b:

Using this idea, highlights one of the drawbacks of a relational database, in order to UNIQUELY identify a database record, duplication must be eliminated. What if two customers have the same name?

How many John Joneses or Anne Browns are there in this country?

3NF Rule 2: Form a new detail table, giving it a key.

To overcome this problem, it would be sensible to CREATE a NEW ATTRIBUTE, which uniquely identifies each individual customer. Theoretically, a string of numbers is the most efficient way of uniquely indexing relational database records. Hence, we will ‘invent’ a new PRIMARY KEY attribute (CUSTNO) and use it as the linking attribute to a new table.

3NF Rule 3: Retain a copy of this key, in the originating table to act as a link, or relationship.

Crucially, this new key can only be used as a link to other tables – if EVERY TABLE contains a COPY of this key. The copy of a PRIMARY KEY from one table is known as the FOREIGN KEY where it occurs in another table.        

        Normalised Data image          

 

A Final Check: Apply some 'Rules of Thumb' for Normalisation


1. The Table or Entity with the most keys is the Many part of a M:1 relationship - this is true above - the RENTAL table   has 2 keys HIRENO & CUSTNO and the CUSTOMER table has just one key; CUSTNO.

2. When data is taken from a table; transfer a from the Primary key with it - this has been done - as the RENTAL table contains a copy of the CUSTOMER table primary key; CUSTNO.

After this gentle introduction to the Normalisation, now it is the time to have a go at a 'proper' set of data.

<<<Back to Database Types

Next to First Normal Form>>>