Linux Training for SAS Developers

Hello.

I am a geek.

I use SAS & Linux all day.

I build software for it.

I rarely teach.

 

(Wow aren't you lucky)

We are here

 

 

ssid

[removed]

password

[removed]

 

The Plan


Day 1

Course (mostly)

standard linux 101

 

Day 2

Workshop (mostly)

objective driven

Checklist:

 

Comms on Slack #linuxtraining

[removed]

Shell access to [removed].boemskats.com

[removed].boemskats.com

SAS Studio access to [removed].boemskats.com

http://[removed].boemskats.com/SASStudio/

asciinema as a learning / self documentation tool

Most places teach this as a 3 to 5 day course, without the SAS bit. 

 

So it's cut down quite a lot to the relevant bits.

 

For now anyway.

 

Focused on trying & doing. Interrupt me if I gloss over something that I obviously shouldn't. 

Disclaimer:

 

Apparently kids in Yr3 do this. So we can too.

Come Up With Your Own
Success Criteria

What do you want to be able to do by the end of this?

Concrete tasks. We'll refer back to these.

Getting Started

 

Connectivity

The Terminal

The Shell

The Unix 'Philosophy'

Looking for Help

Connecting to Linux

ssh

Stands for Secure Shell

Protocol runs on TCP Port 22

Encrypted Connection

Normally use PuTTY

or ssh between machines

 

scp

Stands for Secure Copy

Runs on SSH protocol

You'll use WinSCP

or scp between machines

 

 

http

Because your work VPNs have crappy firewalls, we've set up forwarding on 

https://[removed].boemskats.com

The Terminal

ssh

Opens encrypted connection

Prompts for a login/password

once successful, will start a shell

Try:

Log on to [removed].boemskats.com from the web

Log on to [removed].boemskats.com from another shell (demo)

Log on to [removed].boemskats.com from putty

 

accepts keystrokes and sends them

receives output and displays it 

(everyone should be logged on)

The Shell

Linux Command Line

Where commands are invoked

Type a command, hit Enter

cmd.exe is analogous

Try typing some commands.

date, whoami, who, pwd, uname, uptime

(exit)

It is a User Environment

Intended to be an Interactive Workspace

Has some nice things 

The Nice Things I mentioned

 

Some features to keep in mind as we go on:

tab completion (files and commands)

history, up/down arrow

^change^tothis

Keyboard shortcuts (ctrl-R, ctrl-W, ctrl-H, ctrl-C)

Command substitution

Comments

The UNIX 'Philosophy'

Multiple Users

Everyone needs an account

Everyone needs a login

Everyone has their own files and config

Multiple Components

Each performs a task

Can be combined

Interchangeable and independent

Minimalist

Seeking and Getting Help

 

Built In

man command

command --help

Google

StackOverflow

nixCraft

Wikipedia

use keywords

avoid why/what/how 

Basic Commands & Streams

 

Commands

Syntax

Parameters / Arguments

Input and Output

Pipes and Redirects

 

Run some commands

go on

date

(return the date)

echo

(return the input)

hostname

(returns hostname)

passwd

(changes password)

clear

(clears terminal)

history

(return the shell history)

Commands & Syntax

Most commands accept Parameters

Also called Arguments.

Some commands require them.

 

UNIX is Case Sensitive

Most commands are lowercase.

 

Add some parameters

date -I

(return the ISO date)

man date

(not like that)

hostname -s

(returns short hostname)

uptime -p

(uptime in 'pretty format')

echo {con,pre}{sent,fer}{s,ed}

(just because bash brace expansion is cool)

Try these too.

Command Input and Output

stdin

(standard input)

stderr

(standard error)

stdout

(standard output)

This is how all Linux commands work. The concept is carried over from the days of remote terminals (just a screen and a keyboard).

 

Programs read data from their standard input file. Default stdin is keyboard, but can be redirected.

 

Standard output & error go to terminal, but can also be redirected.

Pipes

Again fundamental to Unix Philosophy, all programs do this. 

(even sas -stdio)

 

stdout of one program feeds stdin of the next (chaining).

 

Command Syntax uses the vertical bar:

|

 

Chaining commands using Pipes

who | sort -r

(shows who is logged on, then sorts the lines in reverse order)

echo Alright World? | rev

(passes Alright World? as stdin to rev which reverses it)

ps aux | grep sas

(lists all running processes and then filters on sas)
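A minimal sketch of the same chaining idea that does not depend on who is logged on or what is running. The sample words are made up for illustration; each command only ever sees the stdout of the one before it.

```shell
# Three lines go into sort; head keeps only the first line of the sorted result.
first=$(printf 'pear\napple\nfig\n' | sort | head -n 1)
echo "$first"

# grep -c counts matching lines instead of printing them.
count=$(printf 'pear\napple\nfig\n' | grep -c p)
echo "$count"
```

Swap any stage for another command and the rest of the pipeline neither knows nor cares.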

Stream Redirection

stdout and stderr streams can also be redirected to a file.

 

This is done with the > operator (greater than). On its own it redirects stdout; use 2> to redirect stderr.

 

echo Hello! > myfile.txt

The contents of a file can be redirected to the stdin stream of a program.

 

This is done with the < operator (less than).

 

rev < myfile.txt

 

They can be used at the same time:

rev < myfile.txt > reversedFile.txt
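To see that stdout and stderr really are separate streams, here is a small sketch. The function name talk and the file names are hypothetical, and tr stands in for rev simply because it is installed almost everywhere.

```shell
# Hypothetical helper that writes one line to each output stream.
talk() {
  echo "to stdout"
  echo "to stderr" >&2
}

workdir=$(mktemp -d)   # scratch directory, so nothing real is touched

# '>' catches stdout, '2>' catches stderr.
talk > "$workdir/out.txt" 2> "$workdir/err.txt"

# '<' feeds a file to a command's stdin.
out=$(tr 'a-z' 'A-Z' < "$workdir/out.txt")
err=$(cat "$workdir/err.txt")

echo "$out"
echo "$err"
```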

Appending to files

 

Redirecting to a file with the > operator always starts from an empty file: an existing file is overwritten, a missing one is created.

 

To append the output of a command to the end of a file while keeping previous contents, use >>.

 

rev < myfile.txt >> reversedFile.txt

 

This will also create the output file if it doesn't already exist.
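A quick sketch of the difference between > and >>, using a throwaway file (the name notes.txt is just an example):

```shell
workdir=$(mktemp -d)
f="$workdir/notes.txt"

echo "first"  > "$f"    # creates the file
echo "second" > "$f"    # '>' again: the old contents are replaced
echo "third" >> "$f"    # '>>': added to the end

lines=$(wc -l < "$f")
cat "$f"
```

Only "second" and "third" survive, because the second > threw "first" away.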

Exercise?

Create a file by redirecting to it.

 

Append some echoed text to it.

 

Use pipes or redirects to read the contents of that file and write it to a second file

 

 

Ready for SAS Studio at this point.

Files and Directories

 

Directory Structure

Relative and Absolute Paths

Navigating the Filesystem

Creating Directories

Permissions

Changing Ownership

Users

Home Directories

 

Directory Structure

 

Same concept as Windows - a hierarchical collection of files and/or other directories, with some differences:

 

  • Linux uses forward slash (/) instead of backslash (\)
  • No drive letters, just one root directory, /
  • Paths are case sensitive
  • Files can be more than just files
    (symbolic links, devices, special kernel 'files')

 

Absolute and Relative Paths

 

 

Again similar to Windows, an absolute path is the full path to a file or directory. For example, /usr/share/doc always points to the shared documentation directory.

 

A relative path refers to a file from the perspective of the current directory. Simple example:

 

cd /usr navigates to /usr using its absolute path

cd share/doc then navigates to /usr/share/doc using relative path   

Special Paths

Your shell has a Current Directory, where you are currently working. pwd returns your current directory to stdout.

Commands like ls also use the current directory as default input. 

The special relative paths . and .. refer to current directory and its parent directory respectively (same as in Windows).

The special path ~ always refers to your home directory. 

Try this:

The cd command, by default, will take you to your home directory. cd and cd ~ will both navigate to your home directory.

Type cd . or cd ./ and you will stay where you are. From your home directory, type cd .. and you will navigate to /home. Check where you are with pwd.
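The same moves, sketched in a scratch directory so it runs anywhere. The usr/share/doc tree here is a stand-in built under a temporary directory, not the real /usr.

```shell
workdir=$(mktemp -d)
mkdir -p "$workdir/usr/share/doc"

cd "$workdir/usr"      # absolute path
cd share/doc           # relative to the current directory
here=$(pwd)

cd ..                  # up one level
parent=$(pwd)

cd .                   # '.' is the current directory: no movement
same=$(pwd)

echo "$here"
echo "$parent"
```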

Directories & Navigation

Useful arguments / one-liners:

cd - returns to your previous directory

ls -la lists (a)ll files in (l)ong format

mkdir -p creates a full directory path, including all (p)arents

rm -r deletes non-empty directories (r)ecursively

cd

(changes directory)

ls

(lists directory contents)

mkdir

(creates directory)

rmdir

(removes directory)
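A short sketch of how these fit together, assuming a throwaway directory tree (the projects/sas/macros names are invented):

```shell
workdir=$(mktemp -d)
cd "$workdir"

mkdir -p projects/sas/macros     # creates all three levels in one go
ls -la projects/sas              # long listing, including hidden entries

# rmdir only removes empty directories, so this attempt is refused.
if rmdir projects 2>/dev/null; then
  rmdir_result="removed"
else
  rmdir_result="refused"
fi

rm -r projects                   # removes the directory and everything in it
```

rmdir refusing is a feature: it is the safe choice when you believe a directory is already empty.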

Overview of 'Standard' Linux Directory Structure

The 'Filesystem Hierarchy Standard' is a real standard that's been around for ages, maintained by the Linux Foundation.

(just open the wallpaper, it's better)

Mini Exercise

Look around, navigate to and list some of the directories we talked about.

 

Make some directories in your home directory. Make some recursively.

 

Use tree to visualise them.

 

Then clean them up. You don't need them.

Filesystem Permissions

ls -l returns a listing of files in this format:
 

drwxrwxr-x.  2 nik  nik       4096 Mar  5 13:28 adirectory
-rw-rw-r--.  1 nik  nik        187 Aug 29  2015 someotherfile.txt
-rwxrw-r--.  1 nik  sas     213835 Jul 13  2015 athirdfile.txt
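-rwxrw-r--

Reading left to right: the first character is the file type (d for a directory, - for a regular file), followed by three read / write / execute triplets, one each for the owner, the group, and everyone else.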

File and Directory Permissions

For a file:

Read
Permission to read the data stored in the file

Write
Permission to write new data to the file, to truncate the file, or to overwrite existing data

Execute
Permission to attempt to execute the contents of the file as a program

For a directory:

Read
Permission to get a listing of the directory

Write
Permission to create, delete, or rename files (or subdirectories) within the directory

Execute
Permission to change to the directory, or use the directory as an intermediate part of a path to a file

Changing Ownership and Permissions

chmod

(changes permissions of a file or dir)

chown

(changes the owner of a file or dir)

chgrp

(changes the group owner of a file or dir)

Permissions Notation

Symbolic Notation

r = read

w = write

x = execute

Numeric Notation

read = 4

write = 2

execute = 1

rwx is 4+2+1=7
rw- is 4+2+0=6
r-- is 4+0+0=4
r-x is 4+0+1=5
(etc)

Notation can also be Numeric.
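The arithmetic can be sketched as a tiny shell function. triplet_to_digit is a made-up helper for illustration, not a standard command.

```shell
# Convert one rwx triplet (for example "r-x") into its numeric digit.
triplet_to_digit() {
  digit=0
  case $1 in r??) digit=$((digit + 4)) ;; esac   # read
  case $1 in ?w?) digit=$((digit + 2)) ;; esac   # write
  case $1 in ??x) digit=$((digit + 1)) ;; esac   # execute
  echo "$digit"
}

owner=$(triplet_to_digit rwx)
group=$(triplet_to_digit r-x)
other=$(triplet_to_digit r--)
mode="$owner$group$other"
echo "$mode"
```

One digit per triplet, in owner / group / everyone-else order, gives the three-digit mode.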

chmod

a file's permissions can only be changed by its owner (or root)

chmod is flexible, but can seem complex

chmod a+x myprogram

adds (+) e(x)ecute permissions for (a)ll users on file myprogram

 [ugoa][+=-][rwxX]

u is owner

g is group

o is other

a is all users

= sets perms

+ adds to existing

- removes from existing

 

rwx as above

X only sets x for directories, or files already executable

chmod

Numeric masks can also be used to set (=) permissions

chmod 755 myprogram

(refer back to notation)

recursive changes:

chmod -R a=rx,u=rwx myprogram

Special directory permission: sticky bit

chmod +t mydirectory

makes it so that only a file's owner can delete it from a shared sticky dir
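A sketch that applies the symbolic and numeric forms and reads the result back. It assumes GNU stat (stat -c '%a' prints the numeric mode), which is what you normally find on Linux; the file and directory names are examples.

```shell
workdir=$(mktemp -d)
f="$workdir/myprogram"
touch "$f"

chmod 644 "$f"                 # start from rw-r--r--
chmod a+x "$f"                 # add execute for everyone
after_plus=$(stat -c '%a' "$f")

chmod u=rwx,g=rx,o= "$f"       # set each class explicitly
after_set=$(stat -c '%a' "$f")

mkdir "$workdir/shared"
chmod 777 "$workdir/shared"
chmod +t "$workdir/shared"     # sticky bit shows up as a leading 1
sticky=$(stat -c '%a' "$workdir/shared")

echo "$after_plus $after_set $sticky"
```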

chown

Only root can change ownership of a file

chown nik myprogram

changes myprogram so it is owned by nik

chgrp

Only the owner (or root) can change the group of a file, and the owner can only pick a group they are a member of

chgrp sas myprogram

changes myprogram so it is owned by group sas

Users

(briefly, just because we haven't really covered it)

Everyone has to be a user, and have a username.

Users can belong to groups.

Users have home directories.

Home Directories

Owned by Users and Private (700 or d rwx --- ---)

Again, go there with cd and refer to it with ~ for relative paths.

exercise - make some public files

that everyone can read but only you can edit

make a directory where people can add their own files but can't delete those of others

try to delete someone else's files in your directory

Do some stuff

Working with Files

 

What are files

Common file operations

Opening / Inspecting

Searching within files

Executables and Executing

 

Filesystem Objects

(bit of a recap)

a file is a place to store data - a (possibly empty) sequence of bytes

a directory is a collection of files or other directories

together they're organised into the filesystem

each file or directory can be referenced with an absolute path (starting with /) or with a relative path (from some current directory)

Extensions

Similar to Windows - it is common to put a filename extension, beginning with a dot (.) on the end of the filename. 

However here they are informational and a user convention, treated no differently by the OS. A few programs use them. 

Filename completion

Remember - tap Tab to complete a filename; tap it twice to list the candidates when more than one matches.

Using Wildcards

Use wildcards to refer to multiple files when specifying them

use * to match any part of a filename

ls *.txt

returns
myfile1.txt  myfile12.txt myfile3.txt

use ? to match a single character in a filename

ls myfile?.txt

returns
myfile1.txt  myfile3.txt

Wildcards are handled by the shell and the actual filenames are passed to the program without it knowing.
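To make "handled by the shell" concrete, here is a sketch with a hypothetical function count_args that just reports how many arguments it was given. It never sees the wildcard, only the expanded names.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch myfile1.txt myfile12.txt myfile3.txt notes.log

# Hypothetical helper: prints the number of arguments it received.
count_args() { echo "$#"; }

star_count=$(count_args *.txt)            # shell expands to every .txt file
question_count=$(count_args myfile?.txt)  # '?' matches exactly one character
quoted_count=$(count_args '*.txt')        # quoted: passed through literally

echo "$star_count $question_count $quoted_count"
```

Quoting switches expansion off, which is handy when a program wants the pattern itself.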

File Operations

cp

(copies files)

mv

(moves, or 'renames' files)

rm

(deletes files)

grep

(searches the contents of a file
or list of files)

file

(identifies file types)

rmdir

(deletes directories)

cp

cp [options] sourcefile destinationfile

(copies source file to destination file)

cp [options] file1 file2 file3 destinationdir/

(copies source files 1 to 3 to the destination directory)

Common Options:

-f forces overwriting of destination files without prompting

-i interactively prompts before overwriting each file

-r or -R copies directories and contents recursively

 

mv

renames files or directories, or moves them to a different location  

mv mycode.sas myoldcode.sas

(example of a 'rename' like operation)

Common Options:

-f forces overwriting of destination files without prompting

-i interactively prompts before overwriting target

mv *.sas ~/myPrograms/

(example of how to move multiple files using a wildcard to a directory in your home dir using relative home path)

rm

Deletes, or removes, the specified files. There is no undo or recycle bin here. You must have write permission on the directory to remove a file from it.

rm -rf ~/hateThisDirectorySoMuch

Permanently removes that directory you really don't like any more. And everything in it. Without asking.

Common Options:

-f forces deletion of write-protected files without prompting

-i interactively prompts before deleting each file

-r recursively deletes files and directories

rm -i myProgram_v*.sas

prompts you before deleting each matching file, which is useful when not all of them should go
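A sketch of cp, mv and rm working together in a scratch directory. The file names echo the slides but are otherwise arbitrary.

```shell
workdir=$(mktemp -d)
cd "$workdir"
echo 'data _null_; run;' > mycode.sas

cp mycode.sas backup.sas          # copy: both files now exist
mv mycode.sas myoldcode.sas       # rename: the old name is gone

mkdir myPrograms
mv *.sas myPrograms/              # move every .sas file into the directory
moved=$(ls myPrograms | wc -l)

rm -r myPrograms                  # no recycle bin: this is permanent
echo "$moved"
```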

grep

searches the contents of a file for a string

-i case insensitive

-r recursive

-l prints just the names of matching files

-n prefixes each match with its line number

-v inverts the test (prints lines that do not match)

grep drjim /etc/passwd

grep '^drjim' /etc/passwd

^ anchors the pattern to the beginning of a line

'drjim$' anchors it to the end of a line

 

Regular expressions
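A sketch of the anchors and options against a small sample file. The file contents are invented to look like /etc/passwd entries; the real file is not touched.

```shell
workdir=$(mktemp -d)
f="$workdir/passwd.sample"
cat > "$f" <<'EOF'
drjim:x:1001:1001::/home/drjim:/bin/bash
nik:x:1002:1002::/home/nik:/bin/bash
backup:x:1003:1003::/home/backup-drjim:/sbin/nologin
EOF

anywhere=$(grep -c 'drjim' "$f")        # matches anywhere on the line
at_start=$(grep -c '^drjim' "$f")       # only at the beginning
at_end=$(grep -c 'bash$' "$f")          # only at the end
not_matching=$(grep -vc 'drjim' "$f")   # lines that do NOT match
line_no=$(grep -n '^nik' "$f" | cut -d: -f1)

echo "$anywhere $at_start $at_end $not_matching $line_no"
```

The unanchored search also picks up the backup line, which is exactly the sort of surprise the anchors are for.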

file

simple utility, tries to guess the file type

try it with SAS Foundation executable

file /pub/sas/SASFoundation/9.4/sas

[drjim@apps ~]$ file dirs
dirs: empty
[drjim@apps ~]$ file things
things: ASCII text
[drjim@apps ~]$

more or less it

 

Exercises

 

Do some actual relevant stuff with SAS now. 

 

Access to playpen?

Editing, Executing and Variables

 

How to run an executable

Editing files and scripts (executables)

Command Substitution

Environment Variables

Loops and Shell Programming

Editing files

nano [file]

vi [file]

nano exists

vim is better

emacs can be ignored

(let's just have a look at both)

Running an executable

call it by its direct path

/pub/sas/SASFoundation/9.4/sas -nodms

if its directory is in the $PATH environment variable (same idea as on Windows)

sas -nodms

If there are conflicts between the two, check which one the shell picks:

which sas

Must have execute bit obviously
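The same three ideas, sketched with a hypothetical script called hello_training standing in for the sas executable. command -v is the shell's built-in answer to the same question which answers.

```shell
workdir=$(mktemp -d)
mkdir "$workdir/bin"

# A made-up script standing in for a real executable.
cat > "$workdir/bin/hello_training" <<'EOF'
#!/bin/sh
echo "hello from the script"
EOF
chmod u+x "$workdir/bin/hello_training"   # without the execute bit it will not run

by_path=$("$workdir/bin/hello_training")  # 1. call it by its full path

PATH="$PATH:$workdir/bin"                 # 2. add its directory to PATH...
by_name=$(hello_training)                 #    ...and call it by name

found=$(command -v hello_training)        # 3. ask where the shell found it
echo "$found"
```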

 

Environment Variables

similar to windows

 

Show variables which are set:

env

Use a variable:

$VAR or ${VAR}

$SASHOME/sas

Set a variable:

VAR=123 local (this shell only)

export VAR=123 global (this shell and every process it starts)

Append to a variable:

PATH=$PATH:/somewhere/else

Have a look at your

~/.bashrc

(the 'autoexec')

Spend some time on this. SAS uses them a lot.
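A sketch of what exporting actually changes. The TRAINING_ variable names are invented; sh -c starts a child process so we can ask it what it sees.

```shell
TRAINING_LOCAL=123            # set in this shell only
export TRAINING_GLOBAL=456    # also handed to child processes

# Single quotes stop this shell from expanding the variables itself;
# ${VAR:-unset} prints "unset" when the child has no such variable.
child_sees_local=$(sh -c 'echo "${TRAINING_LOCAL:-unset}"')
child_sees_global=$(sh -c 'echo "${TRAINING_GLOBAL:-unset}"')

echo "local:  $child_sees_local"
echo "global: $child_sees_global"
```

This is why settings meant for SAS need export: SAS runs as a child of your shell.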

Command Substitution

Same as how you would use a variable in a command line.

old way: ping `hostname`

better way: ping $(hostname)

what's different:

ping $(hostname)

ping ${HOSTNAME}
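On the difference: $(hostname) runs the hostname program and uses its output, while ${HOSTNAME} just reads a variable that the shell happened to set earlier. A small sketch of command substitution itself, using uname and date because they are available practically everywhere (the report_ file name is only an example):

```shell
old=`uname -s`                 # old way: backticks
new=$(uname -s)                # better way: $( )

# $( ) nests without any escaping, which backticks cannot do cleanly.
nested=$(basename "$(dirname /usr/share/doc)")

# The output can be dropped straight into a longer string.
stamp="report_$(date +%Y).log"

echo "$old $new $nested $stamp"
```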

Loops & Conditionals

The idea of bash programming

if [ -f /etc/bashrc ]; then
  . /etc/bashrc
fi

Testing for file existence

for file in *.txt; do
  mv -v "$file" "$file.old"
done

Looping
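Putting the test and the loop together, as a sketch in a scratch directory. One of the made-up file names contains a space on purpose, to show why the variable is quoted.

```shell
workdir=$(mktemp -d)
cd "$workdir"
touch a.txt b.txt "two words.txt"

for file in *.txt; do
  # -f is true only for regular files; the quotes keep "two words.txt" in one piece.
  if [ -f "$file" ]; then
    mv "$file" "$file.old"
  fi
done

renamed=$(ls | grep -c '\.old$')
echo "$renamed"
```

Without the quotes, mv would be handed "two" and "words.txt" as separate arguments and fail on that file.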

Processes

 

What is a process

Listing processes

Forking, Foreground, Background, Pausing

Signalling (killing)(!)

What is a Process

A running program is considered a process

'Lives' as it executes, 'dies' as it terminates

Each living process has a Process ID (pid)

Each process has a user id (uid) and group id (gid).

These dictate a user's permissions on that process and that process's permissions on the filesystem.

What is a Process

Processes have a parent process, and a parent process ID (ppid).

The kernel starts a process called init with a pid of 1. Every other process is a descendant of it, and the resulting branching structure is called the process tree.

Each process has its own working directory, initially inheriting that of its parent process

Each process has its own environment, complete with environment variables, initially inheriting that of its parent process
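A sketch of "initially inheriting": the child starts with a copy of the parent's directory and environment, and anything it changes stays with the child. TRAINING_COLOUR is a made-up variable.

```shell
workdir=$(mktemp -d)
cd "$workdir"
export TRAINING_COLOUR=blue     # hypothetical variable, just for this demo

echo "this shell is PID $$, started by PID $PPID"

# The child starts out with our directory and environment...
child_dir=$(sh -c 'pwd')
child_colour=$(sh -c 'echo "$TRAINING_COLOUR"')

# ...but whatever it changes stays in the child.
sh -c 'cd /; TRAINING_COLOUR=red'
parent_dir=$(pwd)
parent_colour=$TRAINING_COLOUR

echo "$child_dir $child_colour $parent_colour"
```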

Process Operations

ps

(lists running processes; ps aux for all of them)

htop

(top but way better)

top

(full screen, updates constantly)

pstree

(return the process tree)

esm

(the one and only)

pidof

(gets pid(s) of a given executable)

Process Management

(signalling)

kill

(sends signal to
specified process(es))

pkill

(sends signal to
process(es) matching a pattern)

killall

(sends signal to all processes matching the specified command)

this is
obviously an

important thing

Common Signals

 

2: SIGINT

Interrupt - stop. Sent by the kernel when Ctrl-C is pressed in a terminal.

15: SIGTERM

Terminate - Please stop. Be graceful about it.

9: SIGKILL

DIE! DIE NOW!

No you can't clean up after yourself.

20: SIGTSTP

Temporary Stop - Please stop temporarily. Sent with Ctrl-Z.

Signalling a process

 

kill -15 4234 12395

Sends SIGTERM to processes 4234 and 12395

killall -9 -u sassrv

Sends SIGKILL to all processes owned by user sassrv

pkill -15 sas

Sends SIGTERM to all instances of sas program 

kill -TSTP 123

Sends SIGTSTP to 123, suspending it
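A safe way to practise: signal a process you started yourself. This sketch uses sleep as the victim and the signal name rather than the number, since names are the same on every system.

```shell
sleep 60 &                 # a long-running process to practise on
pid=$!                     # $! holds the PID of the last background process

kill -TERM "$pid"          # same signal as kill -15: please stop

# wait collects the exit status of the process we just signalled.
status=0
wait "$pid" || status=$?

echo "exit status: $status"
```

A shell normally reports a process ended by a signal as 128 plus the signal number, so SIGTERM shows up as 143.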

Background, Foreground, Pause

runMe &

Forks runMe to background and returns job number

fg

Brings a background job to the foreground, given its job number. Also resumes a suspended job.

bg

Resumes a suspended job in the background

jobs

Lists all backgrounded jobs
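fg, bg and jobs are really for interactive use, so they are best tried by hand. The part that also works inside a script is & plus wait, sketched here with a made-up one-second job:

```shell
workdir=$(mktemp -d)

# Start a job in the background; the shell carries on immediately.
( sleep 1; echo "done" > "$workdir/result.txt" ) &
job_pid=$!

echo "job $job_pid is running, the shell is free to do other things"

wait "$job_pid"            # block until the background job finishes
after=$(cat "$workdir/result.txt")
echo "$after"
```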

Scheduling a process to run later

 

 

at command

 

no access to cron probably

 

You'll probably use LSF.

Advanced Pipes and special files and stuff

 

If you get this far you've gone way too fast.

 

stdin, stdout, stderr

compressing files

filtering output

piping through ssh