ZZEN9313
Webinar 2
Relational Databases
ZZEN9313 20H4
Author: Hayden Smith
Relational databases are essentially just structured tables that are related to one another.
ZZEN9313 20H4
Author: Hayden Smith
LicenseNumber | Name | Expiry |
---|---|---|
15063558 | Harry Pogson | 2020-12-20 |
22531555 | Selina Chua | 2021-06-07 |
13541235 | Yu Hou | 2020-07-07 |
Row/Record/Entry
Column/Field
Unique Column
Table/Entity/Relation
CREATE TABLE Citizens {
LicenseNumber TEXT PRIMARY_KEY,
Name TEXT,
Expiry DATE
}
Schema
Relational Databases
An important part of the most basic relational table is that columns/fields must be atomic. Atomic means its a single value with a single, known type.
The definition of atomic is not always a technical one, sometimes it's a semantic one. E.G. is "Mr Hayden Smith" atomic?
ZZEN9313 20H4
Author: Hayden Smith
Linking tables
ZZEN9313 20H4
Author: Hayden Smith
E.G. What happens if "Peter" wants to update his name to "Paul"?
Something we don't like about this table: Redundancy
Redundancy means that when we mutate a database (i.e. insert a row, update a row, or delete a row) we may have to update it in multiple places.
Linking tables
ZZEN9313 20H4
Author: Hayden Smith
We can make this less redundant by normalising the database.
Now we have removed redundancy, however, these tables are not linked properly yet - so we need to link them.
Linking tables
ZZEN9313 20H4
Author: Hayden Smith
Now we need to link these tables together somehow
Primary Key
Foreign Key
Adding a primary key to the table
Normal Forms
In this course we will explore different levels of normalised databases:
- First normal form
- Second normal form
- Third normal form
ZZEN9313 20H4
Author: Hayden Smith
BCNF, 4NF, and 5NF can be explored further in independent learning
1NF: First Normal Form
- All values are atomic
- There are no duplicate rows (i.e. you should probably include a foreign key)
ZZEN9313 20H4
Author: Hayden Smith
In this case, we've made all phone numbers atomic. We've also removed all multiple values (though an explicit primary key isn't defined)
source: https://www.youtube.com/watch?v=ABwD8IYByfk
2NF: Second Normal Form
- Is in 1NF
- Has no partial dependencies
- Partial dependency: When a non-key attribute depends on part of the primary key
ZZEN9313 20H4
Author: Hayden Smith
In this table, { Employee Id, Department Id } make up the primary key. But Office Location (a non-key attribute) depends on Department ID only
source: https://www.youtube.com/watch?v=ABwD8IYByfk
3NF: Third Normal Form
- Is in 2NF
- Has no transitive dependencies
- Transitive dependency: Indirect relationship between columns in a table that causes a functional dependency
ZZEN9313 20H4
Author: Hayden Smith
Since { MembershipID => Full Name } and { Full Name => Salutation }, we have a transitive dependency and need to separate out
source: https://www.guru99.com/database-normalization.html
Data Warehouses
- Operational databases are optimised for OLTP (Online Transactional Processing)
- Data warehouse databases are optimised for OLAP (Online Analytical Processing)
The typical process is that at some intermittent cadence (e.g. daily), data is copied from the operational database into the data warehouse.
Analysis is done on the data warehouse "offline" then. Since this database is offline and often entails processing huge amounts of data, there are two things to consider:
- Normalisation comes at a cost for large queries
- Since it's offline, there are limited insert/update/deletions, so the risk of anomalies is low
This tends to motivate us to use de-normalised
structures for data warehouses
ZZEN9313 20H4
Author: Hayden Smith
Denormalisation
Denormalisation is essentially the reverse process of normalisation: Reducing the number of tables and increasing the amount of redundancy.
For data warehouses, a common method of denormalisation is to produce star-schemas. Star schemas contain a fact table and dimension tables that essentially mean you never have to do table joins that are more than 1 separated
ZZEN9313 20H4
Author: Hayden Smith
Denormalisation
ZZEN9313 20H4
Author: Hayden Smith
Operational Database
Data Warehouse Database
Assessment 2
ZZEN9313 20H4
Author: Hayden Smith
ZZEN9313 Week 2
By cs6771
ZZEN9313 Week 2
- 537