Database Normalization for Developers

Practice should always be based upon a sound knowledge of theory.

                                                                              ~ Leonardo da Vinci

Rosina Bignall

Blog: http://rosinabignall.com

Twitter: @rosinabignall

 

 

JoindIn: https://joind.in/13249

 

Slides: http://goo.gl/fGnlsq

 

 

Assistance Dog Zaakir

http://tamarr.org

Database Normalization

The process of efficiently organizing data in the database

Goals of Normalization

  • Arranging data in logical groupings
  • Eliminating redundant data
  • Ensuring data dependencies make sense
  • Access and manipulate the data quickly and efficiently

Who needs it

  • Online Transactional Processing

Data vs Information

Business Rules

Relationships

Accuracy and Performance

Poor Normalization causes problems such as

  • Excessive Disk I/O
  • Poor Performance
  • Inaccurate Data
  • Incorrect or Missing Data

Definitions

Heading

A set of attributes

Definition

Definition

Tuple

A set of ordered pairs <A,v>, one pair for each attribute in the heading

Definition

Row

Essentially the implementation of a tuple

Definition

Relation

An ordered pair <H,h>, where h is the set of tuples all having heading H

Definition

Table

Essentially the implementation of a relation

Definition

Functional Dependency (FD)

An expression of the form X->Y where X and Y are subsets of the attributes then whenever two tuples have the same value for X they also have the same value for Y

Functional Dependencies

{ CITY } -> { STATUS}

First Normal Form (1NF)

All relations are by definition in 1NF

1NF simply means that each tuple in the relation contains exactly one value for each attribute

Second Normal Form (2NF)

A relation is in second normal form (2NF) if and only if for every nontrivial FD X->Y at least one of the following is true: (a) X is a superkey; (b) Y is a subkey; (c)X is not a subkey.

No repeating groups

Third Normal Form (3NF)

A relation is in third normal form (3NF) if and only if for every nontrivial FD X->Y either (a) X is a superkey; (b) Y is a subkey.

Every attribute must depend on the primary key

Boyce/Codd Normal Form (BCNF)

Every attribute must depend on the key, the whole key and nothing but the key

A relation is in Boyce/Codd normal form (BCNF) if and only if, for every nontrivial FD X->Y, X is a superkey.

Invoice Example

Procedure

  • Determine Attributes
  • Determine Functional Dependencies
  • For every set of functional dependencies with the same left side (X) add a relation with heading containing the left side (X) and all the right sides (Y).

Attributes

  • Date
  • Inv_no
  • Cust_no
  • Cust_Address
  • Item_Id
  • Description
  • Unit_price
  • Quantity
  • Amount
  • Total

Functional Dependencies (FDs)

  • Inv_no -> Date
  • Inv_no -> Cust_no
  • Inv_no -> Total
  • Cust_no -> Cust_Address
  • Item_Id -> Description
  • Item_Id -> Unit_price
  • Inv_no, Item_Id -> Quantity
  • Inv_no, Item_Id -> Amount

Procedure

  • Determine Attributes
  • Determine Functional Dependencies
  • For every set of functional dependencies with the same left side (X) add a relation with heading containing the left side (X) and all the right sides (Y).

Functional Dependencies (FDs)

  • Inv_no -> Date
  • Inv_no -> Cust_no
  • Inv_no -> Total
  • Cust_no -> Cust_Address
  • Item_Id -> Description
  • Item_Id -> Unit_price
  • Inv_no, Item_Id -> Quantity
  • Inv_no, Item_Id -> Amount

Invoice Relation

  • Inv_no -> Date
  • Inv_no -> Cust_no
  • Inv_no -> Total

Functional Dependencies (FDs)

  • Inv_no -> Date
  • Inv_no -> Cust_no
  • Inv_no -> Total
  • Cust_no -> Cust_Address
  • Item_Id -> Description
  • Item_Id -> Unit_price
  • Inv_no, Item_Id -> Quantity
  • Inv_no, Item_Id -> Amount

Customer Relation

  • Cust_no -> Cust_Address

Functional Dependencies (FDs)

  • Inv_no -> Date
  • Inv_no -> Cust_no
  • Inv_no -> Total
  • Cust_no -> Cust_Address
  • Item_Id -> Description
  • Item_Id -> Unit_price
  • Inv_no, Item_Id -> Quantity
  • Inv_no, Item_Id -> Amount

Item Relation

  • Item_Id -> Description
  • Item_Id -> Unit_price

Functional Dependencies (FDs)

  • Inv_no -> Date
  • Inv_no -> Cust_no
  • Inv_no -> Total
  • Cust_no -> Cust_Address
  • Item_Id -> Description
  • Item_Id -> Unit_price
  • Inv_no, Item_Id -> Quantity
  • Inv_no, Item_Id -> Amount

Invoice Detail Relation

  • Inv_no, Item_Id -> Quantity
  • Inv_no, Item_Id -> Amount

Resulting Relations

Another way

Invoice

  • Inv_no
  • Date
  • Cust_no
  • Cust_Address
  • Item_Id
  • Description
  • Unit_Price
  • Quantity
  • Amount
  • Total

1st Normal Form

Invoice

  • Inv_no (k)
  • Cust_no
  • Cust_Address
  • Date
  • Total

Invoice_Detail

  • Inv_no (k)
  • Item_Id (k)
  • Description
  • Unit_price
  • Quantity
  • Amount

2nd Normal Form

Invoice

  • Inv_no (k)
  • Cust_no
  • Cust_Address
  • Date
  • Total

Invoice_Detail

  • Inv_no (k)
  • Item_Id (k)
  • Quantity
  • Amount

3rd Normal Form

Item

  • Item_Id (k)
  • Description
  • Unit_price

Invoice

  • Inv_no (k)
  • Cust_no
  • Date
  • Total

Invoice_Detail

  • Inv_no (k)
  • Item_Id (k)
  • Quantity
  • Amount

Boyce/Codd Normal Form

Item

  • Item_Id (k)
  • Description
  • Unit_price

Customer

  • Cust_no (k)
  • Cust_Address

Resulting Relations

Should you Normalize?

Resources

  • Database Design & Relational Theory: Normal Forms & All That Jazz, C.J. Date
  • Database In Depth: Relational Theory for Practitioners, C.J. Date
  • Database Design for Mere Mortals: A Hands-On Guide to Relational Database Design, Michael J. Hernandez
  • An introduction to database systems, 7th ed

Rosina Bignall

Blog: http://rosinabignall.com

Twitter: @rosinabignall

 

Slides: http://goo.gl/fGnlsq

 

Feedback please!

https://joind.in/13249

Database Normalization for Developers

By Rosina Bignall

Database Normalization for Developers

A talk for Confoo 2015

  • 1,961