Hello.
I am a geek.
I use SAS & Linux all day.
I build software for it.
I rarely teach.
(Wow aren't you lucky)
ssid
password
Day 1
standard linux 101
Day 2
objective driven
[removed]
Most places teach this as a 3 to 5 day course, without the SAS bit.
So it's cut down quite a lot to the relevant bits.
For now anyway.
Focused on trying & doing. Interrupt me if I gloss over something that I obviously shouldn't.
Apparently kids in Yr3 do this. So we can too.
What do you want to be able to do by the end of this?
Concrete tasks. We'll refer back to these.
Connectivity
The Terminal
The Shell
The Unix 'Philosophy'
Looking for Help
Connecting to Linux
SSH stands for Secure Shell
Protocol runs on TCP Port 22
Encrypted Connection
Normally use PuTTY
or ssh between machines
SCP stands for Secure Copy
Runs on SSH protocol
You'll use WinSCP
or scp between machines
Because your work VPNs have crappy firewalls, we've set up forwarding on
The Terminal
Opens encrypted connection
Prompts for a login/password
once successful, will start a shell
Try:
Log on to [removed].boemskats.com from the web
Log on to [removed].boemskats.com from another shell (demo)
Log on to [removed].boemskats.com from putty
accepts keystrokes and sends them
receives output and displays it
The Shell
Where commands are invoked
Type a command, hit Enter
cmd.exe is analogous
Try typing some commands.
date, whoami, who, pwd, uname, uptime
(exit)
Intended to be an Interactive Workspace
Has some nice things
The Nice Things I mentioned
Some features to keep in mind as we go on:
tab completion (files and commands)
history, up/down arrow
^change^tothis
Keyboard shortcuts (ctrl-R, ctrl-W, ctrl-H, ctrl-C)
Command substitution
Comments
The UNIX 'Philosophy'
Everyone needs an account
Everyone needs a login
Everyone has their own files and config
Each performs a task
Can be combined
Interchangeable and independent
Minimalist
Seeking and Getting Help
man command
command --help
StackOverflow
nixCraft
Wikipedia
use keywords
avoid why/what/how
Commands
Syntax
Parameters / Arguments
Input and Output
Pipes and Redirects
go on
date (returns the date)
echo (returns the input)
hostname (returns hostname)
passwd (changes password)
clear (clears terminal)
history (returns the shell history)
Commands & Syntax
Also called Arguments.
Some commands require them.
Most commands are lowercase.
(return the ISO date)
(not like that)
(returns short hostname)
(uptime in 'pretty format')
(just because bash brace expansion is cool)
Try these too.
(standard input)
(standard error)
(standard output)
How all Linux commands work: a concept carried over from the days of remote terminals (just a screen and a keyboard).
Programs read data from their standard input file. Default stdin is keyboard, but can be redirected.
Standard output & error go to terminal, but can also be redirected.
Again fundamental to Unix Philosophy, all programs do this.
(even sas -stdio)
stdout of one program feeds stdin of the next (chaining).
Command Syntax uses the vertical bar:
|
(show who is logged on and then sorts lines)
(passes Alright World? as stdin to rev which reverses it)
(lists all running processes and then filters on sas)
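A minimal sketch of the three pipelines described above. The process list is simulated with printf (the names in it are made up) so the example behaves the same on any machine; on a real system you would pipe the output of a process-listing command instead.

```shell
# Show who is logged on, sorted by line.
# (Prints nothing on a machine with no interactive logins.)
who | sort

# Pass a string as stdin to rev, which reverses each line.
reversed=$(echo 'Alright World?' | rev)
echo "$reversed"

# Filter a simulated (hypothetical) list of process names on "sas".
matches=$(printf 'bash\nsas\nsshd\nsasv9\n' | grep sas)
echo "$matches"
```

Each command only knows about its own stdin and stdout; the shell does the plumbing between them.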
stdout and stderr streams can also be redirected to a file.
This is done with the > operator (greater than) for stdout, and with 2> for stderr.
echo Hello! > myfile.txt
The contents of a file can be redirected to the stdin stream of a program.
This is done with the < operator (less than).
rev < myfile.txt
They can be used at the same time:
rev < myfile.txt > reversedFile.txt
Redirecting to a file using the > operator always starts from an empty file: it creates the file if needed and overwrites any existing contents.
To append the output of a command to the end of a file while keeping previous contents, use >>.
rev < myfile.txt >> reversedFiles.txt
This will also create the output file if it doesn't already exist.
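The redirects above, put together as a rough sketch. It works in a scratch directory created with mktemp so nothing real is touched; the path /no/such/path is deliberately nonexistent, used only to produce an error message.

```shell
workdir=$(mktemp -d)                     # scratch directory for the demo
cd "$workdir" || exit 1

echo 'Hello!' > myfile.txt               # create (or overwrite) the file
rev < myfile.txt > reversedFile.txt      # stdin from a file, stdout to a file
rev < myfile.txt >> reversedFile.txt     # append a second copy
echo 'overwritten' > myfile.txt          # > replaces the previous contents

ls /no/such/path 2> errors.txt           # 2> captures stderr rather than stdout

cat reversedFile.txt
cat myfile.txt
```

After this runs, reversedFile.txt holds two lines while myfile.txt holds only the last thing written to it.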
Create a file by redirecting to it.
Append some echoed text to it.
Use pipes or redirects to read the contents of that file and write it to a second file
Ready for SAS Studio at this point.
Directory Structure
Relative and Absolute Paths
Navigating the Filesystem
Creating Directories
Permissions
Changing Ownership
Users
Home Directories
Same concept as Windows - a hierarchical collection of files and/or other directories, with some differences:
Again similar to Windows, an absolute path is the full path to a file or directory. For example, /usr/share/doc always points to the shared documentation directory.
A relative path refers to a file from the perspective of the current directory. Simple example:
cd /usr navigates to /usr using its absolute path
cd share/doc then navigates to /usr/share/doc using relative path
Your shell has a Current Directory, where you are currently working. pwd returns your current directory to stdout.
Commands like ls also use the current directory as default input.
The special relative paths . and .. refer to current directory and its parent directory respectively (same as in Windows).
The special path ~ always refers to your home directory.
The cd command with no arguments takes you to your home directory, so cd and cd ~ both navigate there.
Type cd . or cd ./ and you will stay where you are. From your home directory, type cd .. and you will navigate to its parent, typically /home. Check where you are with pwd.
Useful arguments / one-liners:
cd - returns to your previous directory
ls -la lists (a)ll files in (l)ong format
mkdir -p creates a full directory path, including all (p)arents
rm -r deletes non-empty directories (r)ecursively
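A short sketch of these navigation commands in sequence. It uses a scratch directory from mktemp in place of a real home directory, and the directory names (projects, sas, logs) are made up for illustration.

```shell
base=$(mktemp -d)                    # scratch area; stands in for a home directory
mkdir -p "$base/projects/sas/logs"   # -p creates all missing parents in one go

cd "$base/projects" || exit 1        # absolute path
cd sas/logs || exit 1                # relative path from the current directory
here=$(basename "$(pwd)")
echo "now in: $here"

cd ..                                # up to the parent directory
parent=$(basename "$(pwd)")
cd - > /dev/null                     # back to the previous directory (cd - prints it)
back=$(basename "$(pwd)")
echo "parent: $parent, back in: $back"

cd "$base" || exit 1
ls -la projects                      # long listing, including hidden entries
rm -r projects                       # removes the directory and everything in it
```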
cd (changes directory)
ls (lists directory contents)
mkdir (creates directory)
rmdir (removes directory)
The 'Filesystem Hierarchy Standard' is a real standard that's been around for ages, maintained by the Linux Foundation.
(just open the wallpaper, it's better)
Look around, navigate to and list some of the directories we talked about.
Make some directories in your home directory. Make some recursively.
Use tree to visualise them.
Then clean them up. You don't need them.
ls -l returns a listing of files in this format:
drwxrwxr-x. 2 nik nik   4096 Mar  5 13:28 adirectory
-rw-rw-r--. 1 nik nik    187 Aug 29  2015 someotherfile.txt
-rwxrw-r--. 1 nik sas 213835 Jul 13  2015 athirdfile.txt
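-rwxrw-r--
first character: file type (d for a directory, - for a regular file)
then three groups of three: owner, group, everyone else
in each group: r is read, w is write, x is execute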
On a file:
Read: Permission to read the data stored in the file
Write: Permission to write new data to the file, to truncate the file, or to overwrite existing data
Execute: Permission to attempt to execute the contents of the file as a program
On a directory:
Read: Permission to get a listing of the directory
Write: Permission to create, delete, or rename files (or subdirectories) within the directory
Execute: Permission to change to the directory, or use the directory as an intermediate part of a path to a file
chmod (changes permissions of a file or dir)
chown (changes the owner of a file or dir)
chgrp (changes the group owner of a file or dir)
Symbolic Notation / Numeric Notation
r (read) is 4
w (write) is 2
x (execute) is 1
rwx is 4+2+1=7
rw- is 4+2+0=6
r-- is 4+0+0=4
r-x is 4+0+1=5
(etc)
Notation can also be Numeric.
a file's permissions can only be changed by its owner (or root)
chmod is flexible, but can seem complex
chmod a+x myprogram adds (+) e(x)ecute permissions for (a)ll users on file myprogram
u is owner
g is group
o is other
a is all users
= sets perms
+ adds to existing
- removes from existing
rwx as above
X only sets x for directories, or files already executable
Numeric masks can also be used to set (=) permissions
(refer back to notation)
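A rough sketch of symbolic and numeric chmod side by side, run on an empty file in a scratch directory. The first ten characters of ls -l are captured after each change so the effect is visible; myprogram here is just an empty placeholder file.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch myprogram

chmod 644 myprogram                       # rw- r-- r--  (6, 4, 4)
chmod a+x myprogram                       # add execute for all: now equivalent to 755
mode_plus=$(ls -l myprogram | cut -c1-10)

chmod u=rw,g=r,o= myprogram               # set exactly: equivalent to 640
mode_set=$(ls -l myprogram | cut -c1-10)

chmod 750 myprogram                       # rwx r-x ---
mode_num=$(ls -l myprogram | cut -c1-10)

echo "$mode_plus  $mode_set  $mode_num"
```

Note the difference between + (which keeps whatever was already set) and = or a numeric mask (which replace the permissions outright).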
recursive changes:
Special directory permission: sticky bit
makes it so that only a file's owner (or the directory's owner, or root) can delete it from a shared sticky dir
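A small sketch of setting the sticky bit with a numeric mask, assuming a scratch directory. The directory name shared is made up; the leading 1 in 1777 is the sticky bit, and it shows up as a trailing t in the listing.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

mkdir shared
chmod 1777 shared                          # 1 = sticky bit, 777 = open to everyone
sticky_mode=$(ls -ld shared | cut -c1-10)
echo "$sticky_mode"                        # the final character is t, not x
```

This is the same arrangement /tmp normally uses: everyone can create files, but not remove each other's.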
Only root can change ownership of a file
chown nik myprogram changes myprogram so it is owned by nik
Only owner can change group membership of a file, and only if they are a member of that group
chgrp sas myprogram changes myprogram so it is owned by group sas
(briefly, just because we haven't really covered it)
Everyone has to be a user, and have a username.
Users can belong to groups.
Users have home directories.
Owned by Users and Private (700 or d rwx --- ---)
Again, go there with cd and refer to it with ~ for relative paths.
exercise - make some public files
that everyone can read but only you can edit
make a directory where people can add their own files but can't delete those of others
try to delete someone else's files in your directory
What are files
Common file operations
Opening / Inspecting
Searching within files
Executables and Executing
(bit of a recap)
a file is a place to store data - a (possibly empty) sequence of bytes
a directory is a collection of files or other directories
together they're organised into the filesystem
each file or directory can be referenced with an absolute path (starting with /) or with a relative path (from some current directory)
Similar to Windows - it is common to put a filename extension, beginning with a dot (.) on the end of the filename.
However here they are informational and a user convention, treated no differently by the OS. A few programs use them.
Remember - press Tab to complete a filename; tap it twice to list the possible completions.
Use wildcards to refer to multiple files when specifying them
use * to match any part of a filename
ls *.txt
returns
myfile1.txt myfile12.txt myfile3.txt
use ? to match a single character in a filename
ls myfile?.txt
returns
myfile1.txt myfile3.txt
Wildcards are handled by the shell and the actual filenames are passed to the program without it knowing.
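To see that the shell, not the program, expands wildcards, here is a minimal sketch using the filenames from the listing above plus one made-up non-matching file (notes.log). It uses set -- to capture exactly what the shell hands over as arguments.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch myfile1.txt myfile12.txt myfile3.txt notes.log

set -- *.txt                  # the shell expands the pattern before anything runs
star_count=$#
echo "*.txt matched $star_count files: $*"

set -- myfile?.txt            # ? matches exactly one character
question_matches="$*"
echo "myfile?.txt matched: $question_matches"

# Quoting stops the expansion, so the program receives the literal pattern.
literal=$(echo '*.txt')
echo "quoted: $literal"
```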
cp (copies files)
mv (moves, or 'renames' files)
rm (deletes files)
grep (searches the contents of a file or list of files)
file (identifies file types)
rmdir (deletes directories)
(copies source file to destination file)
(copies source files 1 to 3 to destination directory)
-f forces overwriting of destination files without prompting
-i interactively prompts before overwriting each file
-r or -R copies directories and contents recursively
renames files or directories, or moves them to a different location
(example of a 'rename' like operation)
-f forces overwriting of destination files without prompting
-i interactively prompts before overwriting target
(example of how to move multiple files using a wildcard to a directory in your home dir using relative home path)
Deletes, or removes, the specified files. There is no undo or recycle bin here. You must have write permission on the directory to remove a file.
Permanently removes that directory you really don't like any more. And everything in it. Without asking.
-f forces deletion of write-protected files without prompting
-i interactively prompts before deleting each file
-r recursively deletes files and directories
(-i with a wildcard) prompts you for each file, for when not all matching files are to be deleted
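A quick sketch of cp, mv and rm working together, run in a scratch directory. All file and directory names here are made up for the demonstration.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
echo 'some data' > source.txt
mkdir archive

cp source.txt copy.txt            # copy a file to a new name
cp source.txt copy.txt archive/   # copy several files into a directory
cp -r archive archive2            # -r copies a directory and its contents

mv copy.txt renamed.txt           # a 'rename' is just a move within the same directory
mv archive2 archive/nested        # move a directory inside another one

rm renamed.txt                    # gone for good: no recycle bin
rm -r archive                     # removes the directory and everything in it

remaining=$(ls)
echo "left over: $remaining"
```

Adding -i to any of the cp, mv or rm calls would make them prompt before overwriting or deleting, which is a sensible habit while learning.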
searching contents of a file for a string
-i case insensitive
-r recursive
-l prints just names
-n numbers the output
-v reverses test (not matching)
grep '^drjim' /etc/passwd
'^drjim' matches the pattern at the beginning of a line
'drjim$' matches the pattern at the end of a line
Regular expressions
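A sketch of the grep options and anchors above, run against a made-up file in the style of /etc/passwd rather than the real one. The entries (including the archive user) are invented purely so each pattern has something to match.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1

# Illustrative data only; not a real passwd file.
cat > users.txt <<'EOF'
drjim:x:1001:1001::/home/drjim:/bin/bash
nik:x:1002:1002::/home/nik:/bin/bash
archive:x:1003:1003::/home/old/drjim
EOF

starts=$(grep -c '^drjim' users.txt)     # count lines beginning with drjim
ends=$(grep -n 'drjim$' users.txt)       # lines ending with drjim, with line numbers
others=$(grep -v 'drjim' users.txt)      # -v: lines that do NOT match
upper=$(grep -ic 'NIK' users.txt)        # -i: ignore case

echo "starts: $starts"
echo "ends:   $ends"
echo "others: $others"
echo "upper:  $upper"
```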
simple utility, tries to guess the file type
try it with SAS Foundation executable
file /pub/sas/SASFoundation/9.4/sas
[drjim@apps ~]$ file dirs
dirs: empty
[drjim@apps ~]$ file things
things: ASCII text
[drjim@apps ~]$
Do some actual relevant stuff with SAS now.
Access to playpen?
How to run an executable
Editing files and scripts (executables)
Command Substitution
Environment Variables
Recursion and Shell Programming
nano [file]
vi [file]
nano exists
vim is better
emacs can be ignored
(let's just have a look at both)
call it by its direct path
/pub/sas/SASFoundation/9.4/sas -nodms
if it is in the $PATH environment variable (win too)
sas -nodms
If there are conflicts between the two
which sas
Must have execute bit obviously
similar to windows
Show variables which are set:
env
Use a variable:
$VAR or ${VAR}
$SASHOME/sas
Set a variable:
VAR=123 local (this shell only)
export VAR=123 global (passed on to child processes)
Append to a variable:
PATH=$PATH:/somewhere/else
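A minimal sketch of the difference between setting and exporting a variable. DEMO_VAR is a made-up name; a child shell (sh -c) is used to show what a program started from this shell would actually see.

```shell
unset DEMO_VAR                                     # start from a clean slate
DEMO_VAR=123                                       # set, but local to this shell
child_before=$(sh -c 'echo "${DEMO_VAR:-unset}"')  # a child process cannot see it
export DEMO_VAR                                    # now part of the environment
child_after=$(sh -c 'echo "${DEMO_VAR:-unset}"')   # children inherit it

echo "before export: $child_before, after export: $child_after"

# Appending to PATH; /somewhere/else is the placeholder path from above.
PATH=$PATH:/somewhere/else
last_entry=${PATH##*:}
echo "last PATH entry: $last_entry"
```

This is why settings meant for SAS or other programs need export: without it they never leave the shell that set them.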
Have a look at your
~/.bashrc
(the 'autoexec')
Spend some time on this. SAS uses them a lot.
Same as how you would use a variable in a command line.
old way: ping `hostname`
better way: ping $(hostname)
what's different:
ping $(hostname)
ping ${HOSTNAME}
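A short sketch of the distinction: $(...) runs a command and substitutes its output, while ${...} only expands a variable. It uses uname and tr rather than ping and hostname so that it runs without a network; GREETING is a made-up variable.

```shell
# $(...) runs a command and substitutes whatever it prints.
kernel=$(uname -s)
echo "Kernel name: $kernel"

# Substitutions nest cleanly, which backticks only manage with escaping.
current=$(basename "$(pwd)")
echo "Current directory name: $current"

# ${...} expands a variable; no command is run.
GREETING='hello'
combined="${GREETING}-$(echo "$GREETING" | tr 'a-z' 'A-Z')"
echo "$combined"
```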
The idea of bash programming
if [ -f /etc/bashrc ]; then . /etc/bashrc; fi
Testing for file existence
for file in *.txt; do mv -v "$file" "$file.old"; done
Looping
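The two constructs above, written out over several lines as a rough sketch. The filenames are made up, and one deliberately contains a space to show why "$file" is quoted.

```shell
workdir=$(mktemp -d)
cd "$workdir" || exit 1
touch 'report.txt' 'my notes.txt' 'data.csv'

# Testing for file existence before acting on it.
if [ -f report.txt ]; then
    found=yes
else
    found=no
fi
echo "report.txt present: $found"

# Looping over a wildcard; quoting "$file" keeps names with spaces intact.
for file in *.txt; do
    mv "$file" "$file.old"
done

ls
```

Without the quotes, the name with a space would be split into two arguments and mv would fail on it.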
What is a process
Listing processes
Forking, Foreground, Background, Pausing
Signalling (killing)(!)
A running program is considered a process
'Lives' as it executes, 'dies' as it terminates
Each living process has a Process ID (pid)
Each process has a user id (uid) and group id (gid).
These dictate a user's permissions on that process and that process's permissions on the filesystem.
Processes have a parent process, and a parent process ID (ppid).
The kernel starts a process called init with a pid of 1. Every other process is a descendant of it, and the resulting branching structure is called the process tree.
Each process has its own working directory, initially inheriting that of its parent process.
Each process has its own environment, complete with environment variables, initially inheriting that of its parent process.
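A small sketch of these inheritance rules, using a child shell started with sh -c. DEMO_SETTING is a made-up variable name, and the working directory is a scratch directory from mktemp.

```shell
echo "This shell's pid: $$"
echo "Its parent's pid: $PPID"

workdir=$(mktemp -d)
cd "$workdir" || exit 1
export DEMO_SETTING=inherited              # hypothetical variable, for illustration

parent_dir=$(pwd -P)
child_dir=$(sh -c 'pwd -P')                # child starts in the parent's directory
child_env=$(sh -c 'echo "$DEMO_SETTING"')  # and with a copy of its environment

# A child changing directory does not affect its parent.
sh -c 'cd / && pwd' > /dev/null
after_child=$(pwd -P)

echo "child dir: $child_dir"
echo "child env: $child_env"
echo "parent still in: $after_child"
```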
(all processes currently running)
(top but way better)
(full screen, updates constantly)
(return the process tree)
(the one and only)
(gets pid(s) of a given executable)
(signalling)
(sends signal to specified process(es))
(sends signal to process(es) matching a pattern)
(sends signal to all processes matching the specified command)
This is obviously an important thing.
SIGINT: Interrupt - stop. Sent by the kernel when Ctrl-C is pressed in a terminal.
SIGTERM: Terminate - Please stop. Be graceful about it.
SIGKILL: DIE! DIE NOW! No, you can't clean up after yourself.
SIGTSTP: Temporary Stop - Please stop temporarily. Sent with Ctrl-Z.
runMe & forks runMe to the background and returns its job number
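A minimal sketch of backgrounding a process and then signalling it. sleep 30 stands in for a long-running program such as runMe; the exit status reported by wait for a signalled process is 128 plus the signal number.

```shell
sleep 30 &                              # & runs the command in the background
pid=$!                                  # $! holds the pid of the last background job
echo "started sleep with pid $pid"

if kill -0 "$pid" 2>/dev/null; then     # signal 0 only checks the process exists
    alive=yes
else
    alive=no
fi
echo "alive: $alive"

kill -TERM "$pid"                       # ask it politely to stop (signal 15)
wait "$pid"                             # collect its exit status
status=$?
echo "exit status: $status"             # 128 + 15 for a process ended by SIGTERM
```

Reach for kill -KILL only when a process ignores the polite request, since it gets no chance to clean up.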
at command
no access to cron probably
You'll probably use LSF.
If you get this far you've gone way too fast.
stdin, stdout, stderr
compressing files
filtering output
piping through ssh