PHC7065 CRITICAL SKILLS IN DATA MANIPULATION FOR POPULATION SCIENCE

Many-to-Many Relationships in SQL
Hui Hu Ph.D.
Department of Epidemiology
College of Public Health and Health Professions & College of Medicine
February 10, 2020
Many-to-Many Relationships
 
Lab: SQL Part 3
Many-to-Many Relationships
Basic JOIN
1
2
3
C
A
B
INNER JOIN
=
2
3
A
B
1
2
3
C
A
B
LEFT JOIN
=
2
3
A
B
1
Basic JOIN
1
2
3
C
A
B
RIGHT JOIN
=
2
3
A
B
1
2
3
C
A
B
<<FULL>> JOIN
=
2
3
A
B
1
C
C
CROSS JOIN
| Day | 
|---|
| Jan 01 | 
| Jan 02 | 
| ... | 
| Jan 30 | 
| Jan 31 | 
| Individual | 
|---|
| Ind 1 | 
| Ind 2 | 
| Ind 3 | 
CROSS JOIN
=
| Day | Individual | 
|---|---|
| Jan 01 | Ind 1 | 
| Jan 01 | Ind 2 | 
| Jan 01 | Ind 3 | 
| Jan 02 | Ind 1 | 
| Jan 02 | Ind 2 | 
| Jan 02 | Ind 3 | 
| ... | ... | 
| Jan 31 | Ind 1 | 
| Jan 31 | Ind 2 | 
| Jan 31 | Ind 3 | 
Theta JOIN
| Day | 
|---|
| Jan 01 | 
| Jan 02 | 
| ... | 
| Jan 30 | 
| Jan 31 | 
| Individual | Int_day | 
|---|---|
| Ind 1 | Jan 10 | 
| Ind 2 | Jan 11 | 
| Ind 3 | Jan 12 | 
JOIN
=
| Day | Individual | 
|---|---|
| Jan 10 | Ind 1 | 
| Jan 11 | Ind 1 | 
| Jan 11 | Ind 2 | 
| Jan 12 | Ind 1 | 
| Jan 12 | Ind 2 | 
| Jan 12 | Ind 3 | 
| ... | ... | 
| Jan 31 | Ind 1 | 
| Jan 31 | Ind 2 | 
| Jan 31 | Ind 3 | 
ON Day>=Int_day
UNION
| Day | 
|---|
| Jan 01 | 
| Jan 02 | 
| ... | 
| Jan 30 | 
| Jan 31 | 
| Day | 
|---|
| Feb 1 | 
| Feb 2 | 
| Feb 3 | 
| Jan 01 | 
UNION
=
| Day | 
|---|
| Jan 1 | 
| Jan 2 | 
| Jan 3 | 
| ... | 
| Jan 31 | 
| Feb 1 | 
| Feb 2 | 
| Feb 3 | 
UNION ALL
| Day | 
|---|
| Jan 01 | 
| Jan 02 | 
| ... | 
| Jan 30 | 
| Jan 31 | 
| Day | 
|---|
| Feb 1 | 
| Feb 2 | 
| Feb 3 | 
| Jan 01 | 
UNION ALL
=
| Day | 
|---|
| Jan 1 | 
| Jan 2 | 
| Jan 3 | 
| ... | 
| Jan 31 | 
| Feb 1 | 
| Feb 2 | 
| Feb 3 | 
| Jan 01 | 
Some Useful String Functions
LEFT(string, number of characters)RIGHT(string, number of characters)LENGTH(string)TRIM(leading, 'A' FROM string)TRIM(trailing, 'A' FROM string)TRIM(both, 'A' FROM string)POSITION('A' IN string)STRPOS(string,'A')SUBSTR(string, starting position, number of characters)CONCAT(string1,string2)Useful Functions for Time Varaibles
SELECT date,
       (SUBSTR(date, 7, 4) || '-' || LEFT(date, 2) ||
        '-' || SUBSTR(date, 4, 2))::DATE AS cleaned_date
  FROM tab'02/03/2017'->2017-02-03
SELECT cleaned_date,
       EXTRACT(YEAR   FROM cleaned_date) AS year,
       EXTRACT(MONTH  FROM cleaned_date) AS month,
       EXTRACT(DAY    FROM cleaned_date) AS day,
       EXTRACT(HOUR   FROM cleaned_date) AS hour,
       EXTRACT(MINUTE FROM cleaned_date) AS minute,
       EXTRACT(SECOND FROM cleaned_date) AS second,
       EXTRACT(DECADE FROM cleaned_date) AS decade,
       EXTRACT(DOW    FROM cleaned_date) AS day_of_week
  FROM tabSELECT cleaned_date,
       DATE_TRUNC(YEAR   , cleaned_date) AS year,
       DATE_TRUNC(MONTH  , cleaned_date) AS month,
       DATE_TRUNC(WEEK   , cleaned_date) AS week,
       DATE_TRUNC(DAY    , cleaned_date) AS day,
       DATE_TRUNC(HOUR   , cleaned_date) AS hour,
       DATE_TRUNC(MINUTE , cleaned_date) AS minute,
       DATE_TRUNC(SECOND , cleaned_date) AS second,
       DATE_TRUNC(DECADE , cleaned_date) AS decade
  FROM tabCOALESCE
SELECT incidnt_num,
       descript,
       COALESCE(descript, 'No Description')
  FROM tutorial.sf_crime_incidents_cleandate
 ORDER BY descript DESCReplace NULL with other value
Subqueries
SELECT customerNumber,
       checkNumber,
       amount
FROM payments
WHERE amount = (
                 SELECT MAX(amount) 
                 FROM payments
                );SELECT
 MAX(items),
 MIN(items),
 FLOOR(AVG(items))
FROM
 (
 SELECT
 orderNumber,
 COUNT(orderNumber) AS items
 FROM
 orderdetails
 GROUP BY
 orderNumber
 ) AS lineitems;SQL Part 3
git pull
PHC7065-Spring2020-Lecture5
By Hui Hu
PHC7065-Spring2020-Lecture5
Slides for Lecture 5, Spring 2020, PHC7065 Critical Skills in Data Manipulation for Population Science
- 934
 
   
   
  