Relational Databases
ZZEN9313 20H4
Author: Hayden Smith
Relational databases are essentially just structured tables that are related to one another.
ZZEN9313 20H4
Author: Hayden Smith
LicenseNumber | Name | Expiry |
---|---|---|
15063558 | Harry Pogson | 2020-12-20 |
22531555 | Selina Chua | 2021-06-07 |
13541235 | Yu Hou | 2020-07-07 |
Row/Record/Entry
Column/Field
Unique Column
Table/Entity/Relation
CREATE TABLE Citizens {
LicenseNumber TEXT PRIMARY_KEY,
Name TEXT,
Expiry DATE
}
Schema
An important part of the most basic relational table is that columns/fields must be atomic. Atomic means its a single value with a single, known type.
The definition of atomic is not always a technical one, sometimes it's a semantic one. E.G. is "Mr Hayden Smith" atomic?
ZZEN9313 20H4
Author: Hayden Smith
ZZEN9313 20H4
Author: Hayden Smith
E.G. What happens if "Peter" wants to update his name to "Paul"?
Something we don't like about this table: Redundancy
Redundancy means that when we mutate a database (i.e. insert a row, update a row, or delete a row) we may have to update it in multiple places.
ZZEN9313 20H4
Author: Hayden Smith
We can make this less redundant by normalising the database.
Now we have removed redundancy, however, these tables are not linked properly yet - so we need to link them.
ZZEN9313 20H4
Author: Hayden Smith
Now we need to link these tables together somehow
Primary Key
Foreign Key
Adding a primary key to the table
In this course we will explore different levels of normalised databases:
ZZEN9313 20H4
Author: Hayden Smith
BCNF, 4NF, and 5NF can be explored further in independent learning
ZZEN9313 20H4
Author: Hayden Smith
In this case, we've made all phone numbers atomic. We've also removed all multiple values (though an explicit primary key isn't defined)
source: https://www.youtube.com/watch?v=ABwD8IYByfk
ZZEN9313 20H4
Author: Hayden Smith
In this table, { Employee Id, Department Id } make up the primary key. But Office Location (a non-key attribute) depends on Department ID only
source: https://www.youtube.com/watch?v=ABwD8IYByfk
ZZEN9313 20H4
Author: Hayden Smith
Since { MembershipID => Full Name } and { Full Name => Salutation }, we have a transitive dependency and need to separate out
source: https://www.guru99.com/database-normalization.html
The typical process is that at some intermittent cadence (e.g. daily), data is copied from the operational database into the data warehouse.
Analysis is done on the data warehouse "offline" then. Since this database is offline and often entails processing huge amounts of data, there are two things to consider:
This tends to motivate us to use de-normalised
structures for data warehouses
ZZEN9313 20H4
Author: Hayden Smith
Denormalisation is essentially the reverse process of normalisation: Reducing the number of tables and increasing the amount of redundancy.
For data warehouses, a common method of denormalisation is to produce star-schemas. Star schemas contain a fact table and dimension tables that essentially mean you never have to do table joins that are more than 1 separated
ZZEN9313 20H4
Author: Hayden Smith
ZZEN9313 20H4
Author: Hayden Smith
Operational Database
Data Warehouse Database
ZZEN9313 20H4
Author: Hayden Smith