Amazon Redshift Serverless - Hands On Demo

Create a data warehouse with Amazon Redshift Serverless

 

  1. Open the Amazon Redshift console.
  2. Choose Try Amazon Redshift Serverless.
  3. Under Configuration, choose Use default settings.
  4. Amazon Redshift Serverless creates a default namespace with a default workgroup associated with this namespace.
  5. Keep all the remaining values as default.
  6. Choose Save configuration.

Load Sample Data

 

  1. Choose Query data.
  2. When you invoke query editor v2 from the Amazon Redshift Serverless console, a new browser tab opens with the query editor.
  3. The query editor v2 connects from your client machine to the Amazon Redshift Serverless environment.
  4. Choose Connect in the "Serverless: default-workgroup" dialog.
  5. When connecting to a new workgroup for the first time within query editor v2, you must select the type of authentication to use to connect to the workgroup. For this guide, leave Federated user selected, and choose Create connection.
  6. Under the Amazon Redshift Serverless default workgroup, expand the sample_data_dev database. There are three sample schemas corresponding to three sample datasets that you can load into the Amazon Redshift Serverless database. Choose the sample dataset that you want to load, and choose Open sample notebooks.
  7. When loading data for the first time, query editor v2 will prompt you to create a sample database. Choose Create.
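Once the sample notebook has loaded, you can also query the sample tables directly from a SQL cell. A quick sanity check, assuming you chose the tickit sample dataset (the schema and table names below come from that dataset and will differ if you loaded a different one):

```sql
-- Sanity check: count the rows in the tickit sample sales table.
-- Assumes the tickit sample dataset was loaded into sample_data_dev.
SELECT count(*) FROM sample_data_dev.tickit.sales;
```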

Run Sample Queries

 

  1. Once Amazon Redshift Serverless finishes loading the sample data, all of the sample queries are loaded in the editor. You can choose Run all to run all of the queries from the sample notebooks.
  2. You can also export the results as a JSON or CSV file or view the results in a chart.

Create the default IAM role

 

  1. From the Redshift Serverless dashboard, choose Namespace configuration in the navigation menu, and then choose Security and encryption. Then, choose Manage IAM roles.
  2. Expand the Manage IAM roles menu, and choose Create IAM role.
  3. For S3 bucket access, choose Any S3 bucket, and then choose Create IAM role as default.
  4. Choose Save changes. 

Create Table in Redshift Serverless Database

 

  1. In query editor v2, choose Add, then choose Notebook to create a new SQL notebook.
  2. Switch to the dev database.
  3. Create tables. Copy and run the following create table statements to create tables in the dev database.
create table users(
userid integer not null distkey sortkey,
username char(8),
firstname varchar(30),
lastname varchar(30),
city varchar(30),
state char(2),
email varchar(100),
phone char(14),
likesports boolean,
liketheatre boolean,
likeconcerts boolean,
likejazz boolean,
likeclassical boolean,
likeopera boolean,
likerock boolean,
likevegas boolean,
likebroadway boolean,
likemusicals boolean);

create table event(
eventid integer not null distkey,
venueid smallint not null,
catid smallint not null,
dateid smallint not null sortkey,
eventname varchar(200),
starttime timestamp);

create table sales(
salesid integer not null,
listid integer not null distkey,
sellerid integer not null,
buyerid integer not null,
eventid integer not null,
dateid smallint not null sortkey,
qtysold smallint not null,
pricepaid decimal(8,2),
commission decimal(8,2),
saletime timestamp);
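After running the statements, you can confirm that all three tables exist before loading any data. A minimal check using Redshift's pg_table_def catalog view (this assumes the tables were created in the default public schema):

```sql
-- List the newly created tables in the public schema.
SELECT DISTINCT tablename
FROM pg_table_def
WHERE schemaname = 'public'
  AND tablename IN ('users', 'event', 'sales');
```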

Copy Data from S3 Public Buckets to Redshift Tables

 

  1. In the query editor v2, create a new SQL cell in your notebook.
  2. Copy and run the following commands. The IAM_ROLE default clause uses the default IAM role you created earlier to authorize the load.
COPY users 
FROM 's3://redshift-downloads/tickit/allusers_pipe.txt' 
DELIMITER '|' 
TIMEFORMAT 'YYYY-MM-DD HH:MI:SS'
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;

COPY event
FROM 's3://redshift-downloads/tickit/allevents_pipe.txt' 
DELIMITER '|' 
TIMEFORMAT 'YYYY-MM-DD HH:MI:SS'
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;

COPY sales
FROM 's3://redshift-downloads/tickit/sales_tab.txt' 
DELIMITER '\t' 
TIMEFORMAT 'MM/DD/YYYY HH:MI:SS'
IGNOREHEADER 1 
REGION 'us-east-1'
IAM_ROLE default;
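Once the COPY commands complete, it is worth verifying that each table actually received rows. A simple row-count check (all three counts should be non-zero; if a COPY fails, the sys_load_error_detail system view can help diagnose the error):

```sql
-- Verify that each COPY command loaded rows into its target table.
SELECT 'users' AS tablename, count(*) AS row_count FROM users
UNION ALL
SELECT 'event', count(*) FROM event
UNION ALL
SELECT 'sales', count(*) FROM sales;
```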

Run Sample Queries on the Loaded Data

 

  1. After loading data, create another SQL cell in your notebook and try some example queries. 
-- Find top 10 buyers by quantity.
SELECT firstname, lastname, total_quantity 
FROM   (SELECT buyerid, sum(qtysold) total_quantity
        FROM  sales
        GROUP BY buyerid
        ORDER BY total_quantity desc limit 10) Q, users
WHERE Q.buyerid = userid
ORDER BY Q.total_quantity desc;

-- Find events in the 99.9 percentile in terms of all time gross sales.
SELECT eventname, total_price 
FROM  (SELECT eventid, total_price, ntile(1000) over(order by total_price desc) as percentile 
       FROM (SELECT eventid, sum(pricepaid) total_price
             FROM   sales
             GROUP BY eventid)) Q, event E
       WHERE Q.eventid = E.eventid
       AND percentile = 1
ORDER BY total_price desc;
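When you are done experimenting, you can optionally drop the demo tables to keep the dev database clean:

```sql
-- Optional cleanup: remove the demo tables from the dev database.
DROP TABLE IF EXISTS sales;
DROP TABLE IF EXISTS event;
DROP TABLE IF EXISTS users;
```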

Thanks for watching!

By Deepak Dubey