Abdullah Fathi
CREATE DATABASE dvdrental;
C:\>cd C:\Program Files\PostgreSQL\11\bin
pg_restore -U postgres -d dvdrental C:\dvdrental\dvdrental.tar
C:\>cd C:\Program Files\PostgreSQL\11\bin
Is a container of other objects such as tables, views, function, and indexes.
Used to store the data.
A special feature of PostgreSQL is table inheritence. A table (child table) can inherit from another table (parent table), so when you query the data from the child table, the data from the parent table is also showing up
Is a logical container of tables and other objects inside a database. Schema is a part of the ANSI-SQL standard.
Tablespace is a location on disk where PostgreSQL stores data files containing database objects e.g. indexes and tables.
Virtual table that is used to simplify complex queries and to apply security for a set of records
Is a block reusable SQL code that returns a scalar value of a list record.
Operators are symbolic functions. PostgreSQL allows you to define custom operators
Enable you to convert one data type into another data type
Used to manage auto-increment columns that defined in a table as a serial column
Wrap other objects including types, casts, indexes, functions, etc. into a single unit.
SELECT
column_1,
column_2,
...
FROM
table_name;
SELECT
first_name
FROM
customer;
SELECT
first_name,
last_name,
email
FROM
customer;
SELECT
*
FROM
customer;
SELECT
first_name || ' ' || last_name AS full_name,
email
FROM
customer;
We used concatenation operator ||
SELECT 5 * 3 AS result;
We skip the FROM clause because the statement does not refer to any table
SELECT
column_1,
column_2
FROM
tbl_name
ORDER BY
column_1 ASC,
column_2 DESC;
1. Query sorts customers by the first name in ascending order:
SELECT
first_name,
last_name
FROM
customer
ORDER BY
first_name ASC;
2. Query sorts customers by the last name in descending order:
SELECT
first_name,
last_name
FROM
customer
ORDER BY
last_name DESC;
3. Query sorts customers by the first name in ascending order and then last name in descending order:
SELECT
first_name,
last_name
FROM
customer
ORDER BY
first_name ASC,
last_name DESC;
SELECT
DISTINCT column_1
FROM
table_name;
If you specify multiple columns, the DISTINCT clause will evaluate the duplicate based on the combination of values of these columns
SELECT
DISTINCT column_1, column_2
FROM
table_name;
DISTINCT ON (expression)
to keep the "first" row of each group of duplicates:
SELECT
DISTINCT ON (column_1) column_alias,
column_2
FROM
table_name
ORDER BY
column_1,
column_2;
CREATE TABLE t1 (
id serial NOT NULL PRIMARY KEY,
bcolor VARCHAR,
fcolor VARCHAR
);
INSERT INTO t1 (bcolor, fcolor)
VALUES
('red', 'red'),
('red', 'red'),
('red', NULL),
(NULL, 'red'),
('red', 'green'),
('red', 'blue'),
('green', 'red'),
('green', 'blue'),
('green', 'green'),
('blue', 'red'),
('blue', 'green'),
('blue', 'blue');
SELECT
id,
bcolor,
fcolor
FROM
t1;
SELECT
DISTINCT bcolor
FROM
t1
ORDER BY
bcolor;
SELECT
DISTINCT bcolor,
fcolor
FROM
t1
ORDER BY
bcolor,
fcolor;
SELECT
DISTINCT ON
(bcolor) bcolor,
fcolor
FROM
t1
ORDER BY
bcolor,
fcolor;
SELECT select_list
FROM table_name
WHERE condition;
Operator | Description |
---|---|
= | Equal |
> | Greater than |
< | Less than |
>= | Greater than or equal |
<= | Less than or equal |
<> or != | Not equal |
AND | Logical operator AND |
OR | Logical operator OR |
SELECT
last_name,
first_name
FROM
customer
WHERE
first_name = 'Jamie';
SELECT
last_name,
first_name
FROM
customer
WHERE
first_name = 'Jamie'
AND last_name = 'Rice';
SELECT
first_name,
last_name
FROM
customer
WHERE
last_name = 'Rodriguez' OR
first_name = 'Adam';
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name IN ('Ann','Anne','Annie');
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name LIKE 'Ann%'
The % is called a wildcard that matches any string. The 'Ann%' pattern matches any string that starts with 'Ann'
SELECT
first_name,
LENGTH(first_name) name_length
FROM
customer
WHERE
first_name LIKE 'A%' AND
LENGTH(first_name) BETWEEN 3 AND 5
ORDER BY
name_length;
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name LIKE 'Bra%' AND
last_name <> 'Motley';
SELECT
*
FROM
table_name
LIMIT n;
The statement returns n rows generated by the query. If n is zero, the query returns an empty set.
SELECT
*
FROM
table
LIMIT n OFFSET m;
Skip rows before returning n rows
SELECT
film_id,
title,
release_year
FROM
film
ORDER BY
film_id
LIMIT 5;
SELECT
film_id,
title,
release_year
FROM
film
ORDER BY
film_id
LIMIT 4 OFFSET 3;
SELECT
film_id,
title,
rental_rate
FROM
film
ORDER BY
rental_rate DESC
LIMIT 10;
value IN (value1,value2,...)
The expression returns true if the value matches any value in the list
value IN (SELECT value FROM tbl_name);
The statement inside the parentheses is called a subquery, which is a query nested inside another query
SELECT
customer_id,
rental_id,
return_date
FROM
rental
WHERE
customer_id IN (1, 2)
ORDER BY
return_date DESC;
SELECT
customer_id,
rental_id,
return_date
FROM
rental
WHERE
customer_id NOT IN (1, 2);
SELECT
first_name,
last_name
FROM
customer
WHERE
customer_id IN (
SELECT
customer_id
FROM
rental
WHERE
CAST (return_date AS DATE) = '2005-05-27'
);
value BETWEEN low AND high;
If the value is greater than or equal to the low value and less than or equal to the high value, the expression returns true, otherwise, it returns false
value NOT BETWEEN low AND high;
If you want to check if a value is out of a range, you combine the NOT operator with the BETWEEN operator
SELECT
customer_id,
payment_id,
amount
FROM
payment
WHERE
amount BETWEEN 8
AND 9;
SELECT
customer_id,
payment_id,
amount
FROM
payment
WHERE
amount NOT BETWEEN 8
AND 9;
SELECT
customer_id,
payment_id,
amount,
payment_date
FROM
payment
WHERE
payment_date BETWEEN '2007-02-07'
AND '2007-02-15';
string LIKE pattern
The expression returns true if the string matches the pattern, otherwise it returns false
string NOT LIKE pattern
The expression returns true if LIKE returns true and vice versa
SELECT
'foo' LIKE 'foo', -- true
'foo' LIKE 'f%', -- true
'foo' LIKE '_o_', -- true
'bar' LIKE 'b_'; -- false, because it find matches string begin with letter b and followed by any single character
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name LIKE '%er%'
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name LIKE '_her%';
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name NOT LIKE 'Jen%';
SELECT
first_name,
last_name
FROM
customer
WHERE
first_name ILIKE 'BAR%';
PostgreSQL also provides some operators that act like the LIKE, NOT LIKE, ILIKE and NOT ILIKE operator as shown below:
CREATE TABLE contacts(
id INT GENERATED BY DEFAULT AS IDENTITY,
first_name VARCHAR(50) NOT NULL,
last_name VARCHAR(50) NOT NULL,
email VARCHAR(255) NOT NULL,
phone VARCHAR(15),
PRIMARY KEY (id)
);
INSERT INTO contacts(first_name, last_name, email, phone)
VALUES
('John','Doe','john.doe@example.com',NULL),
('Lily','Bush','lily.bush@example.com','(408-234-2764)');
SELECT
id,
first_name,
last_name,
email,
phone
FROM
contacts
WHERE
phone = NULL;
The statement returns no row because the expression phone = NULL always return false. Even though there is NULL in the phone column, the expression NULL = NULL returns false. This is because NULL is not equal to any value even itself
value IS NULL
The expression returns true if the value is NULL or false if it is not
SELECT
id,
first_name,
last_name,
email,
phone
FROM
contacts
WHERE
phone IS NULL;
SELECT
id,
first_name,
last_name,
email,
phone
FROM
contacts
WHERE
phone IS NOT NULL;
SELECT column_name AS alias_name
FROM table;
the column_name is assigned an alias as alias_name. The AS keyword is optional so you can skip it
SELECT
first_name || ' ' || last_name AS full_name
FROM
customer
ORDER BY
full_name;
SELECT
column_list
FROM
table_name AS alias_name;
SELECT
A.pka,
A.c1,
B.pkb,
B.c2
FROM
A
INNER JOIN B ON A .pka = B.fka;
The expression returns true if the value is NULL or false if it is not
Each customer may have zero or many payments. Each payment belongs to one and only one customer. The customer_id field establishes the link between two tables
SELECT
customer.customer_id,
first_name,
last_name,
email,
amount,
payment_date
FROM
customer
INNER JOIN payment ON payment.customer_id = customer.customer_id
#WHERE
#customer.customer_id = 2;
ORDER BY
customer.customer_id;
SELECT
customer.customer_id,
customer.first_name customer_first_name,
customer.last_name customer_last_name,
customer.email,
staff.first_name staff_first_name,
staff.last_name staff_last_name,
amount,
payment_date
FROM
customer
INNER JOIN payment ON payment.customer_id = customer.customer_id
INNER JOIN staff ON payment.staff_id = staff.staff_id;
SELECT
A.pka,
A.c1,
B.pkb,
B.c2
FROM
A
LEFT JOIN B ON A .pka = B.fka;
Each row in the film table may have zero or many rows in the inventory table. Each row in the inventory table has one and only one row in the film table
SELECT
film.film_id,
film.title,
inventory_id
FROM
film
LEFT JOIN inventory ON inventory.film_id = film.film_id;
SELECT
film.film_id,
film.title,
inventory_id
FROM
film
LEFT JOIN inventory ON inventory.film_id = film.film_id
WHERE
inventory.film_id IS NULL;
SELECT column_list
FROM A a1
INNER JOIN A b1 ON join_predicate;
table A is joined to itself using the INNER JOIN clause. you can also use the LEFT JOIN or RIGHT JOIN clause
SELECT
f1.title,
f2.title,
f1. length
FROM
film f1
INNER JOIN film f2 ON f1.film_id <> f2.film_id
AND f1. length = f2. length;
SELECT * FROM A
FULL [OUTER] JOIN B on A.id = B.id;
The OUTER keyword is optional
CREATE TABLE
IF NOT EXISTS departments (
department_id serial PRIMARY KEY,
department_name VARCHAR (255) NOT NULL
);
CREATE TABLE
IF NOT EXISTS employees (
employee_id serial PRIMARY KEY,
employee_name VARCHAR (255),
department_id INTEGER
);
Each department has zero or many employees and each employee belongs to zero or one department
INSERT INTO departments (department_name)
VALUES
('Sales'),
('Marketing'),
('HR'),
('IT'),
('Production');
INSERT INTO employees (
employee_name,
department_id
)
VALUES
('Bette Nicholson', 1),
('Christian Gable', 1),
('Joe Swank', 2),
('Fred Costner', 3),
('Sandra Kilmer', 4),
('Julia Mcqueen', NULL);
# SELECT * FROM departments;
department_id | department_name
---------------+-----------------
1 | Sales
2 | Marketing
3 | HR
4 | IT
5 | Production
(5 rows)
# SELECT * FROM employees;
employee_id | employee_name | department_id
-------------+-----------------+---------------
1 | Bette Nicholson | 1
2 | Christian Gable | 1
3 | Joe Swank | 2
4 | Fred Costner | 3
5 | Sandra Kilmer | 4
6 | Julia Mcqueen |
(6 rows)
SELECT
employee_name,
department_name
FROM
employees e
FULL OUTER JOIN departments d ON d.department_id = e.department_id;
5. result set includes every employee who belongs to a department and every department which have an employee. In addition, it includes every employee who does not belong to a department and every department that doesn’t have an employee
employee_name | department_name
-----------------+-----------------
Bette Nicholson | Sales
Christian Gable | Sales
Joe Swank | Marketing
Fred Costner | HR
Sandra Kilmer | IT
Julia Mcqueen | NULL
NULL | Production
(7 rows)
6. Find the department that does have any employee
SELECT
employee_name,
department_name
FROM
employees e
FULL OUTER JOIN departments d ON d.department_id = e.department_id
WHERE
employee_name IS NULL;
employee_name | department_name
---------------+-----------------
NULL | Production
(1 row)
6. Find the employee who does not belong to any department
SELECT
employee_name,
department_name
FROM
employees e
FULL OUTER JOIN departments d ON d.department_id = e.department_id
WHERE
department_name IS NULL;
employee_name | department_name
---------------+-----------------
Julia Mcqueen | NULL
(1 row)
CROSS JOIN clause allows you to produce the Cartesian Product of rows in two or more tables
Different from the other JOIN operators such as LEFT JOIN or INNER JOIN, the CROSS JOIN does not have any matching condition in the join clause
SELECT *
FROM T1
CROSS JOIN T2;
is also equivalent to:
SELECT *
FROM T1, T2;
Text
SELECT *
FROM T1
INNER JOIN T2 ON TRUE;
Use the INNER JOIN clause with the condition evaluates to true to perform the cross join
We will perform the CROSS JOIN of two tables T1 and T2. For every row from T1 and T2 i.e., a cartesian product, the result set will contain a row that consists of all columns in the T1 table followed by all columns in the T2 table. If T1 has N rows, T2 has M rows, the result set will have N x M rows
CREATE TABLE T1 (label CHAR(1) PRIMARY KEY);
CREATE TABLE T2 (score INT PRIMARY KEY);
INSERT INTO T1 (label)
VALUES
('A'),
('B');
INSERT INTO T2 (score)
VALUES
(1),
(2),
(3);
SELECT
*
FROM
T1
CROSS JOIN T2;
label | score
-------+-------
A | 1
B | 1
A | 2
B | 2
A | 3
B | 3
(6 rows)
3. Illustration Result of the CROSS JOIN operator when we join the T1 table with the T2
GROUP BY clause divides the rows returned from the SELECT statement into groups.
For each group, you can apply an aggregate function e.g., SUM() to calculate the sum of items or COUNT() to get the number of items in the groups
SELECT column_1, aggregate_function(column_2)
FROM tbl_name
GROUP BY column_1;
SELECT
customer_id
FROM
payment
GROUP BY
customer_id;
In this case, the GROUP BY acts like the DISTINCT clause that removes the duplicate rows from the result set
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id;
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id
ORDER BY
SUM (amount) DESC;
Count the number of transactions each staff has been processing and group the payments table based on staff id and use the COUNT() function to get the number of transactions
SELECT
staff_id,
COUNT (payment_id)
FROM
payment
GROUP BY
staff_id;
HAVING clause often use in conjunction with the GROUP BY clause to filter group rows that do not satisfy a specified condition.
HAVING clause sets the condition for group rows created by the GROUP BY clause after the GROUP BY clause applies while the WHERE clause sets the condition for individual rows before GROUP BY clause applies.
This is the main difference between the HAVING and WHERE clauses
SELECT
column_1,
aggregate_function (column_2)
FROM
tbl_name
GROUP BY
column_1
HAVING
condition;
Gets the total amount of each customer by using the GROUP BY clause and selects the only customer who has been spending more than 200
SELECT
customer_id,
SUM (amount)
FROM
payment
GROUP BY
customer_id
HAVING
SUM (amount) > 200;
SELECT
store_id,
COUNT (customer_id)
FROM
customer
GROUP BY
store_id
SELECT
store_id,
COUNT (customer_id)
FROM
customer
GROUP BY
store_id
HAVING
COUNT (customer_id) > 300;
SELECT
column_1,
column_2
FROM
tbl_name_1
UNION
SELECT
column_1,
column_2
FROM
tbl_name_2;
The following are rules applied to the queries:
Let’s take a look at the following tables:
sales2007q1 data:
sales2007q2 data:
SELECT *
FROM
sales2007q1
UNION
SELECT *
FROM
sales2007q2;
There are five rows in the combined result set because the UNION operator removes one duplicate row
SELECT *
FROM
sales2007q1
UNION ALL
SELECT *
FROM
sales2007q2
ORDER BY
name ASC,
amount DESC;
JOIN combine columns
UNION combine rows
CREATE TABLE sales (
brand VARCHAR NOT NULL,
segment VARCHAR NOT NULL,
quantity INT NOT NULL,
PRIMARY KEY (brand, segment)
);
INSERT INTO sales (brand, segment, quantity)
VALUES
('ABC', 'Premium', 100),
('ABC', 'Basic', 200),
('XYZ', 'Premium', 100),
('XYZ', 'Basic', 300);
The sales table stores the number of products sold by brand and segment
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
brand,
segment;
SELECT
brand,
SUM (quantity)
FROM
sales
GROUP BY
brand;
SELECT
segment,
SUM (quantity)
FROM
sales
GROUP BY
segment;
SELECT
SUM (quantity)
FROM
sales;
6. UNION ALL requires all result sets to have the same number of columns with compatible data types, you need to adjust the queries by adding NULL to the selection list of each
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
brand,
segment
UNION ALL
SELECT
brand,
NULL,
SUM (quantity)
FROM
sales
GROUP BY
brand
UNION ALL
SELECT
NULL,
segment,
SUM (quantity)
FROM
sales
GROUP BY
segment
UNION ALL
SELECT
NULL,
NULL,
SUM (quantity)
FROM
sales;
SELECT
c1,
c2,
aggregate_function(c3)
FROM
table_name
GROUP BY
GROUPING SETS (
(c1, c2),
(c1),
(c2),
()
);
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
GROUPING SETS (
(brand, segment),
(brand),
(segment),
()
);
GROUPING function accepts a name of a column and returns bit 0 if the column is the member of the current grouping set and 1 otherwise
SELECT
GROUPING(brand) grouping_brand,
GROUPING(segment) grouping_segement,
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
GROUPING SETS (
(brand, segment),
(brand),
(segment),
()
)
ORDER BY
brand,
segment;
SELECT
c1,
c2,
c3,
aggregate (c4)
FROM
table_name
GROUP BY
CUBE (c1, c2, c3);
Specify the columns (dimensions or dimension columns) which you want to analyze and aggregation function expressions
CUBE(c1,c2,c3)
GROUPING SETS (
(c1,c2,c3),
(c1,c2),
(c1,c3),
(c2,c3),
(c1),
(c2),
(c3),
()
)
In general, if the number of columns specified in the CUBE is n, then you will have 2^n combinations
PostgreSQL allows you to perform a partial cube to reduce the number of aggregates calculated. The following shows the syntax:
SELECT
c1,
c2,
c3,
aggregate (c4)
FROM
table_name
GROUP BY
c1,
CUBE (c1, c2);
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
CUBE (brand, segment)
ORDER BY
brand,
segment;
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
brand,
CUBE (segment)
ORDER BY
brand,
segment;
ROLLUP is a subclause of the GROUP BY clause that offers a shorthand for defining multiple grouping sets. A grouping set is a set of columns to which you want to group
Different from the CUBE subclause, ROLLUP does not generate all possible grouping sets based on the specified columns. It just makes a subset of those.
The ROLLUP assumes a hierarchy among the input columns and generates all grouping sets that make sense considering the hierarchy. This is the reason why ROLLUP is often used to generate the subtotals and the grand total for reports
For example, the CUBE (c1,c2,c3) makes all eight possible grouping sets:
(c1, c2, c3)
(c1, c2)
(c2, c3)
(c1,c3)
(c1)
(c2)
(c3)
()
However, the ROLLUP(c1,c2,c3) generates only four grouping sets, assuming the hierarchy c1 > c2 > c3 as follows:
(c1, c2, c3)
(c1, c2)
(c1)
()
SELECT
c1,
c2,
c3,
aggregate(c4)
FROM
table_name
GROUP BY
ROLLUP (c1, c2, c3);
Specify the columns (dimensions or dimension columns) which you want to analyze and aggregation function expressions
SELECT
c1,
c2,
c3,
aggregate(c4)
FROM
table_name
GROUP BY
c1,
ROLLUP (c2, c3);
1. Use the ROLLUP subclause to find the number of products sold by brand (subtotal) and by all brands and segments (total)
SELECT
brand,
segment,
SUM (quantity)
FROM
sales
GROUP BY
ROLLUP (brand, segment)
ORDER BY
brand,
segment;
SELECT
segment,
brand,
SUM (quantity)
FROM
sales
GROUP BY
segment,
ROLLUP (brand)
ORDER BY
segment,
brand;
SELECT
EXTRACT (YEAR FROM rental_date) y,
EXTRACT (MONTH FROM rental_date) M,
EXTRACT (DAY FROM rental_date) d,
COUNT (rental_id)
FROM
rental
GROUP BY
ROLLUP (
EXTRACT (YEAR FROM rental_date),
EXTRACT (MONTH FROM rental_date),
EXTRACT (DAY FROM rental_date)
);
Suppose we want to find the films whose rental rate is higher than the average rental rate. We can do it in two steps:
SELECT
AVG (rental_rate)
FROM
film;
The average rental rate is 2.98
SELECT
film_id,
title,
rental_rate
FROM
film
WHERE
rental_rate > 2.98;
To construct a subquery, we put the second query in brackets and use it in the WHERE clause as an expression
SELECT
film_id,
title,
rental_rate
FROM
film
WHERE
rental_rate > (
SELECT
AVG (rental_rate)
FROM
film
);
The query inside the brackets is called a subquery or an inner query. The query that contains the subquery is known as an outer query
PostgreSQL executes the query that contains a subquery in the following sequence:
SELECT
inventory.film_id
FROM
rental
INNER JOIN inventory ON inventory.inventory_id = rental.inventory_id
WHERE
return_date BETWEEN '2005-05-29'
AND '2005-05-30';
It returns multiple rows so we can use this query as a subquery in the WHERE clause of a query
SELECT
film_id,
title
FROM
film
WHERE
film_id IN (
SELECT
inventory.film_id
FROM
rental
INNER JOIN inventory ON inventory.inventory_id = rental.inventory_id
WHERE
return_date BETWEEN '2005-05-29'
AND '2005-05-30'
);
A subquery can be an input of the EXISTS operator. If the subquery returns any row, the EXISTS operator returns true. If the subquery returns no row, the result of EXISTS operator is false
The EXISTS operator only cares about the number of rows returned from the subquery, not the content of the rows, therefore, the common coding convention of EXISTS operator is as follows:
EXISTS (SELECT 1 FROM tbl WHERE condition);
SELECT
first_name,
last_name
FROM
customer
WHERE
EXISTS (
SELECT
1
FROM
payment
WHERE
payment.customer_id = customer.customer_id
);
The query works like an inner join on the customer_id column. However, it returns at most one row for each row in the customer table even though there are some corresponding rows in the payment table
expresion operator ANY(subquery)
SELECT
MAX( length )
FROM
film
INNER JOIN film_category
USING(film_id)
GROUP BY
category_id;
2. Use the previous query as a subquery in the following statement that finds the films whose lengths are greater than or equal to the maximum length of any film category
SELECT title
FROM film
WHERE length >= ANY(
SELECT MAX( length )
FROM film
INNER JOIN film_category USING(film_id)
GROUP BY category_id );
The = ANY is equivalent to IN operator
SELECT
title,
category_id
FROM
film
INNER JOIN film_category
USING(film_id)
WHERE
category_id = ANY(
SELECT
category_id
FROM
category
WHERE
NAME = 'Action'
OR NAME = 'Drama'
);
SELECT
title,
category_id
FROM
film
INNER JOIN film_category
USING(film_id)
WHERE
category_id IN(
SELECT
category_id
FROM
category
WHERE
NAME = 'Action'
OR NAME = 'Drama'
);
x <> ANY (a,b,c)
x <> a OR <> b OR x <> c
comparison_operator ALL (subquery)
With the assumption that the subquery returns some rows, the ALL operator works as follows:
SELECT
ROUND(AVG(length), 2) avg_length
FROM
film
GROUP BY
rating
ORDER BY
avg_length DESC;
2. Find all films whose lengths are greater than the list of the average lengths above by using the ALL and greater than operator (>)
SELECT
film_id,
title,
length
FROM
film
WHERE
length > ALL (
SELECT
ROUND(AVG (length),2)
FROM
film
GROUP BY
rating
)
ORDER BY
length;
EXISTS (subquery)
The EXISTS accepts an argument which is a subquery.
If the subquery returns at least one row, the result of EXISTS is true. In case the subquery returns no row, the result is of EXISTS is false.
EXISTS is often used with the correlated subquery.
The result of EXISTS depends on whether any row returned by the subquery, and not on the content of the rows. Therefore, the columns that appear on the SELECT clause of the subquery are not important.
SELECT
column_1
FROM
table_1
WHERE
EXISTS( SELECT
1
FROM
table_2
WHERE
column_2 = table_1.column_1);
Note that if the subquery returns NULL, the result of EXISTS is true.
SELECT first_name,
last_name
FROM customer c
WHERE EXISTS
(SELECT 1
FROM payment p
WHERE p.customer_id = c.customer_id
AND amount > 11 )
ORDER BY first_name,
last_name;
In this example, for each customer in the customer table, the subquery checks the payment table to find if that customer made at least one payment (p.customer_id = c.customer_id) and the amount is greater than 11 ( amount > 11)
SELECT first_name,
last_name
FROM customer c
WHERE NOT EXISTS
(SELECT 1
FROM payment p
WHERE p.customer_id = c.customer_id
AND amount > 11 )
ORDER BY first_name,
last_name;
NOT EXISTS is opposite to EXISTS, meaning that if the subquery returns no row, NOT EXISTS returns true. If the subquery returns any rows, NOT EXISTS returns false
Find customers who have not made any payment that greater than 11
SELECT
first_name,
last_name
FROM
customer
WHERE
EXISTS( SELECT NULL )
ORDER BY
first_name,
last_name;
The subquery returned NULL, therefore, the query returned all rows from the customer table
If the subquery returns NULL, EXISTS returns true
INSERT INTO table(column1, column2, …)
VALUES
(value1, value2, …);
To add multiple rows into a table at a time, use the following syntax:
INSERT INTO table (column1, column2, …)
VALUES
(value1, value2, …),
(value1, value2, …) ,...;
To insert data that comes from another table, use the INSERT INTO SELECT statement as follows:
INSERT INTO table(column1,column2,...)
SELECT column1,column2,...
FROM another_table
WHERE condition;
CREATE TABLE link (
ID serial PRIMARY KEY,
url VARCHAR (255) NOT NULL,
name VARCHAR (255) NOT NULL,
description VARCHAR (255),
rel VARCHAR (50)
);
INSERT INTO link (url, name)
VALUES
('https://slides.com/fathi-ch3k','PostgreSQL Training');
To insert character data, you must enclose it in single quotes (‘) for example 'PostgreSQL Training'. For the numeric data type, you don’t need to do so, just use plain numbers such as 1, 2, 3.
If you omit any column that accepts the NULL value in the INSERT statement, the column will take its default value. In case the default value is not set for the column, the column will take the NULL value.
PostgreSQL provides a value for the serial column automatically so you do not and should not insert a value into the serial column.
If you want to insert a string that contains a single quote character such as O'Reilly Media, you have to use a single quote (‘) escape character as shown in the following query:
INSERT INTO link (url, name)
VALUES
('http://www.oreilly.com','O''Reilly Media');
INSERT INTO link (url, name)
VALUES
('http://www.google.com','Google'),
('http://www.yahoo.com','Yahoo'),
('http://www.bing.com','Bing');
ALTER TABLE link ADD COLUMN last_update DATE;
ALTER TABLE link ALTER COLUMN last_update
SET DEFAULT CURRENT_DATE;
add a new column named last_update into the link table and set its default value to CURRENT_DATE
inserts a new row with specified date into the link table. The date format is YYYY-MM-DD
INSERT INTO link (url, name, last_update)
VALUES
('http://www.facebook.com','Facebook','2013-06-01');
INSERT INTO link (url, name, last_update)
VALUES
('https://www.fotia.com.my','Fotia',DEFAULT);
You can also use the DEFAULT keyword to set the default value for the date column or any column that has a default value.
CREATE TABLE link_tmp (LIKE link);
1. Create new table
2. Insert rows from the link table whose values of the date column are not NULL
INSERT INTO link_tmp
SELECT
*
FROM
link
WHERE
last_update IS NOT NULL;
INSERT INTO link (url, NAME, last_update)
VALUES('https://www.mampu.gov.my','MAMPU',DEFAULT)
RETURNING id;
Get the last insert id from the table after inserting a new row, use the RETURNING clause in the INSERT statement
UPDATE table
SET column1 = value1,
column2 = value2 ,...
WHERE
condition;
Determine which rows you want to update in the condition of the WHERE clause. If you omit the WHERE clause, all the rows in the table are updated
Use the link table created in the INSERT chapter
Update table partially example
UPDATE link
SET last_update = DEFAULT
WHERE
last_update IS NULL;
UPDATE link
SET rel = 'nofollow';
UPDATE link
SET description = name;
UPDATE link_tmp
SET rel = link.rel,
description = link.description,
last_update = link.last_update
FROM
link
WHERE
link_tmp.id = link.id;
updates values that come from the link table for the columns in the link_tmp table
This kind of UPDATE statement sometimes referred to as UPDATE JOIN or UPDATE INNER JOIN because two or more tables are involved in the UPDATE statement. The join condition is specified in the WHERE clause
UPDATE link
SET description = 'Learn PostgreSQL fast and easy',
rel = 'follow'
WHERE
ID = 1
RETURNING id,
description,
rel;
Updates the row with id 1 in the link table and returns the updated entries
UPDATE A
SET A.c1 = expresion
FROM B
WHERE A.c2 = B.c2;
To join to another table in the UPDATE statement, you specify the joined table in the FROM clause and provide the join condition in the WHERE clause. The FROM clause must appear immediately after the SET clause.
This form of the UPDATE statement updates column value c1 in the table A if each row in the table A and B have a matching value in the column c2
CREATE TABLE product_segment (
ID SERIAL PRIMARY KEY,
segment VARCHAR NOT NULL,
discount NUMERIC (4, 2)
);
INSERT INTO product_segment (segment, discount)
VALUES
('Grand Luxury', 0.05),
('Luxury', 0.06),
('Mass', 0.1);
CREATE TABLE product(
id serial primary key,
name varchar not null,
price numeric(10,2),
net_price numeric(10,2),
segment_id int not null,
foreign key(segment_id) references product_segment(id)
);
INSERT INTO product (name, price, segment_id)
VALUES ('diam', 804.89, 1),
('vestibulum aliquet', 228.55, 3),
('lacinia erat', 366.45, 2),
('scelerisque quam turpis', 145.33, 3),
('justo lacinia', 551.77, 2),
('ultrices mattis odio', 261.58, 3),
('hendrerit', 519.62, 2),
('in hac habitasse', 843.31, 1),
('orci eget orci', 254.18, 3),
('pellentesque', 427.78, 2),
('sit amet nunc', 936.29, 1),
('sed vestibulum', 910.34, 1),
('turpis eget', 208.33, 3),
('cursus vestibulum', 985.45, 1),
('orci nullam', 841.26, 1),
('est quam pharetra', 896.38, 1),
('posuere', 575.74, 2),
('ligula', 530.64, 2),
('convallis', 892.43, 1),
('nulla elit ac', 161.71, 3);
UPDATE product
SET net_price = price - price * discount
FROM
product_segment
WHERE
product.segment_id = product_segment.id;
SELECT
*
FROM
product;
A common table expression is a temporary result set which you can reference within another SQL statement including SELECT, INSERT, UPDATE or DELETE
Common Table Expressions are temporary in the sense that they only exist during the execution of the query
WITH cte_name (column_list) AS (
CTE_query_definition
)
statement;
WITH cte_film AS (
SELECT
film_id,
title,
(CASE
WHEN length < 30 THEN 'Short'
WHEN length >= 30 AND length < 90 THEN 'Medium'
WHEN length > 90 THEN 'Long'
END) length
FROM
film
)
SELECT
film_id,
title,
length
FROM
cte_film
WHERE
length = 'Long'
ORDER BY
title;
CASE
WHEN condition_1 THEN result_1
WHEN condition_2 THEN result_2
[WHEN ...]
[ELSE result_n]
END
We want to assign a price segment to a film with the following logic:
Mass if the rental rate is 0.99
Economic if the rental rate is 1.99
Luxury if the rental rate is 4.99
Count the number of films that belong to the Mass, Economic, and Luxury price segments
SELECT
SUM (
CASE
WHEN rental_rate = 0.99 THEN
1
ELSE
0
END
) AS "Mass",
SUM (
CASE
WHEN rental_rate = 2.99 THEN
1
ELSE
0
END
) AS "Economic",
SUM (
CASE
WHEN rental_rate = 4.99 THEN
1
ELSE
0
END
) AS "Luxury"
FROM
film;
NULLIF(argument_1,argument_2);
NULLIF function returns a null value if argument_1 equals to argument_2, otherwise it returns argument_1
SELECT
NULLIF (1, 1); -- return NULL
SELECT
NULLIF (1, 0); -- return 1
SELECT
NULLIF ('A', 'B'); -- return A
CREATE TABLE members (
ID serial PRIMARY KEY,
first_name VARCHAR (50) NOT NULL,
last_name VARCHAR (50) NOT NULL,
gender SMALLINT NOT NULL -- 1: male, 2 female
);
INSERT INTO members (
first_name,
last_name,
gender
)
VALUES
('John', 'Doe', 1),
('David', 'Dave', 1),
('Bush', 'Lily', 2);
SELECT
(SUM (
CASE
WHEN gender = 1 THEN
1
ELSE
0
END
) / SUM (
CASE
WHEN gender = 2 THEN
1
ELSE
0
END
) ) * 100 AS "Male/Female ratio"
FROM
members;
DELETE
FROM
members
WHERE
gender = 2;
Execute the query to calculate the male/female ratio again, we got the following error message:
[Err] ERROR: division by zero
SELECT
(
SUM (
CASE
WHEN gender = 1 THEN
1
ELSE
0
END
) / NULLIF (
SUM (
CASE
WHEN gender = 2 THEN
1
ELSE
0
END
),
0
)
) * 100 AS "Male/Female ratio"
FROM
members;
This is because, the number of female is zero. To prevent this division by zero error, we use the NULLIF function
CAST ( expression AS target_type );
expression::type
SELECT
'100'::INTEGER,
'01-OCT-2015'::DATE;
Example:
SELECT
CAST ('100' AS INTEGER);
If the expression cannot be converted to the target type, PostgreSQL will raise an error
SELECT
CAST ('10C' AS INTEGER);
[Err] ERROR: invalid input syntax for integer: "10C"
LINE 2: CAST ('10C' AS INTEGER);
SELECT
CAST ('2015-01-01' AS DATE),
CAST ('01-OCT-2015' AS DATE);
If the expression cannot be converted to the target type, PostgreSQL will raise an error
SELECT
CAST ('10C' AS INTEGER);
[Err] ERROR: invalid input syntax for integer: "10C"
LINE 2: CAST ('10C' AS INTEGER);
SELECT
CAST ('10.2' AS DOUBLE);
You need to use DOUBLE PRECISION instead of DOUBLE
SELECT
CAST ('10.2' AS DOUBLE PRECISION);
[Err] ERROR: type "double" does not exist
LINE 2: CAST ('10.2' AS DOUBLE)
SELECT
CAST('true' AS BOOLEAN),
CAST('false' as BOOLEAN),
CAST('T' as BOOLEAN),
CAST('F' as BOOLEAN);
SELECT '2019-06-15 14:30:20'::timestamp;
CREATE TABLE ratings (
ID serial PRIMARY KEY,
rating VARCHAR (1) NOT NULL
);
Create a ratings table
Insert some sample data into the ratings table
INSERT INTO ratings (rating)
VALUES
('A'),
('B'),
('C');
INSERT INTO ratings (rating)
VALUES
(1),
(2),
(3);
Because the requirements change, we use the same ratings table to store ratings as number e.g., 1, 2, 3 instead of A, B, and C
SELECT
*
FROM
ratings;
So the ratings table stores mixed values including numeric and string
SELECT
id,
CASE
WHEN rating~E'^\\d+$' THEN
CAST (rating AS INTEGER)
ELSE
0
END as rating
FROM
ratings;
Convert all values in the rating column into integers, all other A, B, C ratings will be displayed as zero
There are no secrets to success. It is the result of preparation, hard work, and learning from failure. - Colin Powell