by Nathan LeClaire
You touch UNIX every day.
What is 'UNIX'? In the narrowest sense, it is a time-sharing operating system kernel: a program that controls the resources of a computer and allocates them among its users.
Users can:
In a broader sense, UNIX is often taken to include not only the kernel, but also essential programs like compilers, editors, command languages, programs for copying and printing files, and so on . . . UNIX may even include programs developed by you or other users to be run on your system . . .
The kernel is the main component of most computer operating systems; it is a bridge between applications and the actual data processing done at the hardware level. The kernel's responsibilities include managing the system's resources (the communication between hardware and software components).
- Wikipedia
Slow down there, Sparky.
It's rare that you will want to talk to the kernel directly.
You probably want something else. A higher level of abstraction, perhaps.
Something a bit more interactive.
Something you can type commands into, and see the output from those commands.
A shell is the program that interprets your requests to run programs. It manifests in the form of what most of us know as the "command line" or "terminal prompt".
By typing commands, you can see the output of running those programs, and interact with your computer. Let's look at some examples.
The simplest shell command is simply a word:
$ who
you tty2 Sep 28 07:51
jpl tty4 Sep 28 08:32
$ date; who
Wed Sep 28 09:07:15 EDT 1983
you tty2 Sep 28 07:51
jpl tty4 Sep 28 08:32
$ whoami
nathanl
$ ls
autojump Desktop Documents Downloads go gofun hello.go Music
Let's learn by example. You can use the cat command to print the contents of a file to the screen. It's shorthand for "concatenate", because you can print out the contents of multiple files this way, and cat sticks them all together.
$ cat hello_world.c
#include <stdio.h>
int main() {
printf("hello, world!");
return 0;
}
$ cat hello_world.c goodbye_world.c
#include <stdio.h>
int main() {
printf("hello, world!");
return 0;
}
#include <stdio.h>
int main() {
printf("goodbye, world!");
return 0;
}
$
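The * character is a wildcard: the shell expands it to every filename that matches, so *.c means "every file ending in .c". Be careful about using this character with rm, the command to remove files!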
$ cat *.c
#include <stdio.h>
int main() {
printf("hello, world!");
return 0;
}
#include <stdio.h>
int main() {
printf("goodbye, world!");
return 0;
}
$ rm *
$ echo "Whoops, I just deleted my project"
Whoops, I just deleted my project
$
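One way to stay out of trouble is to let echo show you what a pattern expands to before you hand the same pattern to rm. The sketch below assumes a throwaway directory, and the file names in it are made up for illustration.

```shell
# Work in a throwaway directory so nothing real is at risk.
workdir=$(mktemp -d)
cd "$workdir"
touch hello_world.c goodbye_world.c notes.txt

# echo prints what the shell expands the pattern to, without deleting anything.
preview=$(echo *.c)
echo "$preview"

# Once the preview looks right, remove only the matching files.
# The -- tells rm that everything after it is a filename, not a flag.
rm -- *.c
remaining=$(ls)
echo "$remaining"
```

Only notes.txt should survive, because it is the one file the pattern does not match.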
Speaking of files, let's talk a bit about UNIX's filesystem model.
Files are arranged in a directory hierarchy that starts at /.
You can see which directory you are currently in with the pwd command.
$ pwd
/home/nathanl
The ls command will present you with a list of the files which reside in the directory you specify (by default, the current directory).
$ ls
autojump Desktop Documents Downloads go gofun hello.go Music
If you pass in arguments to UNIX commands that start with a dash ("-"), they represent optional "flags" that change the operation of the command slightly.
$ ls -a
. .. autojump .bashrc Desktop Documents Downloads go gofun hello.go Music .vimrc
The filenames . and .. have a special meaning in UNIX. They mean "the current directory" and "the directory above this one", respectively. Filenames that start with a dot are "hidden".
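To see the difference between plain ls and ls -a for yourself, here is a small sketch. It assumes a scratch directory, and both file names are hypothetical.

```shell
# Create one ordinary file and one dot file in a scratch directory.
cd "$(mktemp -d)"
touch visible.txt .hidden_config

# Plain ls skips names that begin with a dot.
plain=$(ls)
echo "$plain"

# ls -a lists everything, including . and .. themselves.
all=$(ls -a)
echo "$all"
```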
You can use the cd command to navigate to any directory in your filesystem.
POSSIBLE INTERVIEW QUESTION ALERT!
The tilde ~ character is an alias for your home directory (/home/yourusername), which is where you start out when you log in.
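A short sketch of moving around with cd, using . and .. along the way. The directory names are invented for the example, and everything happens under a temporary directory.

```shell
# Build a small directory tree to walk around in.
base=$(mktemp -d)
mkdir -p "$base/projects/demo"

cd "$base/projects/demo"
cd ..                        # .. is the directory above: we are now in projects
up_one=$(basename "$PWD")
echo "$up_one"

cd ./demo                    # . is the current directory, so this enters projects/demo
back_down=$(basename "$PWD")
echo "$back_down"

echo ~                       # the shell expands ~ to your home directory
```

Running cd with no arguments at all also takes you home, which is the same place cd ~ lands.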
To learn more about any UNIX command, just type
$ man commandname
at the command prompt. You will be greeted with a manual page with exhaustive information about the command. Handy if you are without an Internet connection, or just need to review a few of the command line options.
Long ago, as the design of the Unix file system was being worked out,
the entries . and .. appeared, to make navigation easier. I'm not sure
but I believe .. went in during the Version 2 rewrite, when the file
system became hierarchical (it had a very different structure early on).
When one typed ls, however, these files appeared, so either Ken or
Dennis added a simple test to the program. It was in assembler then, but
the code in question was equivalent to something like this:
if (name[0] == '.') continue;
This statement was a little shorter than what it should have been, which is
if (strcmp(name, ".") == 0 || strcmp(name, "..") == 0) continue;
but hey, it was easy.
First, a bad precedent was set. A lot of other lazy programmers
introduced bugs by making the same simplification. Actual files
beginning with periods are often skipped when they should be counted.
Second, and much worse, the idea of a "hidden" or "dot" file was
created. As a consequence, more lazy programmers started dropping files
into everyone's home directory . . .
. . .I'm pretty sure the concept of a hidden file was an unintended consequence. It was certainly a mistake.
How many bugs and wasted CPU cycles and instances of human frustration
(not to mention bad design) have resulted from that one small shortcut
about 40 years ago?
Keep that in mind next time you want to cut a corner in your code.
"The Unix philosophy emphasizes building short, simple, clear, modular, and extendable code that can be easily maintained and repurposed by developers other than its creators."
- http://en.wikipedia.org/wiki/Unix_philosophy
As an example, let's say you have a list of hundreds of names in random order in a text file. You need them sorted, a task which is maddening and prone to error when done by hand.
With UNIX, we can accomplish this easily.
Unix provides us with pipes which allow us to chain the output of a command into another command. To do so, you use the | operator.
You can direct the output of a command into a file instead of the terminal with the > operator, and read input from a file with the < operator (try visualizing the flow of data following the direction the arrow is pointing).
Also, the >> operator will append the output to the contents of a file instead of writing over them. Let's see this stuff in action.
$ cat names.txt
Bob
Suzy
Fred
Nathan
Anthony
Dignan
Sterling
$ cat names.txt | sort
Anthony
Bob
Dignan
Fred
Nathan
Sterling
Suzy
$ cat names.txt | sort >sorted_names.txt
$ ls
names.txt sorted_names.txt
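The < and >> operators work the same way. Here is a minimal sketch with a shortened list of names in a scratch directory; the name "Zed" is made up to show appending.

```shell
cd "$(mktemp -d)"
printf 'Bob\nSuzy\nFred\n' > names.txt

# < feeds the file to sort as input; > writes the sorted result to a new file.
sort < names.txt > sorted_names.txt

# >> appends to the end of the file instead of overwriting it.
echo "Zed" >> sorted_names.txt

result=$(cat sorted_names.txt)
echo "$result"
```

Had the last redirect used > instead of >>, sorted_names.txt would contain only "Zed".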
There's a very useful command built in to almost every UNIX-based operating system called grep.
grep takes its name from the ed editor command g/re/p: "globally search for a regular expression and print" the matching lines.
It will find the lines of a file which match a certain pattern. This is insanely useful if you have to search for all instances of, say, FooBarWidgetFactory in your codebase.
$ cat debts.txt | wc -l
20000
$ cat debts.txt | grep "Johnny"
8/16 Johnny owes Maple $40
9/24 Nils owes Johnny $45
$ cat debts.txt | grep -n "Johnny"
2456:8/16 Johnny owes Maple $40
15689:9/24 Nils owes Johnny $45
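For the codebase search mentioned above, grep can also walk a directory tree for you. This sketch assumes a grep that supports -r (GNU and BSD grep both do), and the src directory and its files are invented for the example.

```shell
# Set up a tiny fake codebase.
cd "$(mktemp -d)"
mkdir -p src
printf 'int x;\nFooBarWidgetFactory f;\n' > src/a.c
printf 'nothing here\n' > src/b.c

# -r searches every file under src; -n prefixes each match with its line number.
matches=$(grep -rn "FooBarWidgetFactory" src)
echo "$matches"
```

Each result has the form file:line:text, so you can jump straight to the right spot in your editor.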
bash stands for "Bourne Again Shell", and it's the shell that will most likely be running by default if you happen to find yourself at a terminal. There are other shells, such as zsh (which is pretty awesome), that offer slightly different features, but bash is extremely dominant.
bash uses emacs keyboard shortcuts out of the box. As all good Starcraft players know, hotkeys are absolutely essential to operational efficiency, so learning the shortcuts will serve you very well as time goes on!
<C-a> : Move to the front of the prompt
<C-e> : Move to the end of the prompt
<M-f> : Move forward by a word
<M-b> : Move backward by a word
<C-k> : Kill (delete) everything on the prompt after the cursor
You can run a program in the background by appending an ampersand (&) to the end of the invocation. This is very handy for, say, servers, which would otherwise block the prompt until you end their execution with <C-c>.
$ python -m SimpleHTTPServer
Serving HTTP on 0.0.0.0 port 8000 ...
^C
$ python -m SimpleHTTPServer &
$ echo "Yay, I can continue executing commands at this prompt"
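If you want to keep track of a background job from a script, the shell stores its process ID in $! and the wait builtin blocks until it finishes. A minimal sketch, using sleep as a stand-in for a real long-running program:

```shell
# Start a stand-in "server" in the background; the prompt returns immediately.
sleep 1 &
bg_pid=$!                    # $! holds the process ID of the last background job
echo "started background job $bg_pid"

# Do other work here, then block until the background job finishes.
wait "$bg_pid"
status=$?                    # wait reports the job's exit status
echo "job exited with status $status"
```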
Joining commands with a double ampersand (&&) will run any number of commands in sequence, with each command running only if the one before it succeeded.
$ cd ~/awesomescripts/dir/node/jsboilerstrap && ./boilerstrap.js
Running JS Boilerstrap...
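The "only if the previous command succeeded" part is what makes && safer than just listing commands one after another. A small sketch, using the standard true and false commands as stand-ins for a step that works and a step that fails:

```shell
log=""

# true always succeeds, so the command after && runs.
true && log="${log}ran-after-success "

# false always fails, so the command after && is skipped.
false && log="${log}ran-after-failure "

echo "$log"
```

In the cd example above, this means the script is never launched from the wrong directory if the cd fails.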
!! expands to the last command you ran.
$ echo "I'm awesome"
I'm awesome
$ !!
echo "I'm awesome"
I'm awesome
$ apt-get install quux
E: Could not open lock file /var/lib/dpkg/lock - open (13: Permission denied)
E: Unable to lock the administration directory (/var/lib/dpkg/), are you root?
$ sudo !!
Note: there is some controversy over the use of sudo !!. Make your own decisions, but remember that with great power comes great responsibility.
You can use the history command to see a numerical list of what you have typed into the prompt. Using grep on the output of this command is very useful, as you can invoke previously used commands by number (just put a ! in front of the number).
$ history
1802 apt-get install node
1803 sudo !!
1804 cd ~/some_long/directory/path/to/a/script && ./run_the_script.sh
1805 fortune
1806 grep -r FooBarWidgetFactory
$ !1804
cd ~/some_long/directory/path/to/a/script && ./run_the_script.sh
sed is a stream editor. You can pipe in input and it will replace certain patterns with others, printing the result to the standard output on the terminal. Many people now use Python or Perl for tasks of this nature, but sed is still an extremely useful tool, and it's available out of the box on every UNIX under the sun.
$ echo "all your base are belong to us" | sed 's/us/me/g'
all your base are belong to me
sed can also edit files in place without echoing to the standard output.
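In-place editing is done with the -i flag. The sketch below attaches a backup suffix directly to the flag (-i.bak), a form that both GNU sed and BSD/macOS sed accept as far as I know; the file name is made up.

```shell
cd "$(mktemp -d)"
echo "all your base are belong to us" > message.txt

# -i.bak rewrites message.txt in place and keeps the original as message.txt.bak.
sed -i.bak 's/us/me/g' message.txt

edited=$(cat message.txt)
backup=$(cat message.txt.bak)
echo "$edited"
echo "$backup"
```

Keeping the backup around is cheap insurance until you are sure the pattern did what you meant.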
Part of the true power of bash and one of the reasons why it's been so popular for so long is that it is highly scriptable. It is extremely useful to be able to have a list of shell commands that you can run over and over at "the push of a button" to automate otherwise repetitive developer or QA tasks such as data setup.
bash also provides a variety of programming language constructs to extend and enhance this functionality.
Let's walk through the construction of a simple script to demonstrate this utility.
On SoundCloud (a music sharing/listening website), each track can be identified by an eight digit number, which is utilized by their publicly accessible APIs.
I was curious to see if I could scrape the URL of a song posted to SoundCloud to quickly extract this ID for use with their API.
Scripts, be they bash or otherwise, usually start with a line that tells the system which interpreter should run them, and consequently how to understand their instructions.
This declaration starts with a "shebang" (pronounced shuh - bang), composed of the characters # (the she) and ! (the bang) and followed by a path which indicates which language interpreter to use.
In bash scripts, this line usually looks like:
#!/bin/bash
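#!/bin/bash
# Rewrite the URL passed as the first argument to use plain HTTP.
URL_USE_HTTP=`echo "$1" | sed 's/https/http/g'`
# Fetch the page, take the first data-sc-track attribute, keep only its digits.
SONGID=`curl -s "$URL_USE_HTTP" | egrep -o 'data-sc-track="[0-9]+"' \
    | head -n 1 \
    | egrep -o '[0-9]+'`
echo "$SONGID"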
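To see what the extraction pipeline does without touching the network, you can feed it a string by hand. The HTML snippet and both track numbers below are invented for illustration; only the data-sc-track attribute name comes from the script. The sketch spells egrep as grep -E, which is the same thing.

```shell
# A made-up page fragment with two track attributes.
sample_html='<div data-sc-track="12345678"></div><div data-sc-track="87654321"></div>'

# 1. Pull out every data-sc-track="..." attribute, one per line.
# 2. Keep only the first one.
# 3. Strip it down to the digits.
song_id=$(echo "$sample_html" \
    | grep -E -o 'data-sc-track="[0-9]+"' \
    | head -n 1 \
    | grep -E -o '[0-9]+')

echo "$song_id"
```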
You have access to most of the same styles of conditionals and loops in bash as you do in full-fledged programming languages.
#!/bin/bash
echo "This script checks the existence of the messages file."
echo "Checking..."
if [ -f /var/log/messages ]
then
    echo "/var/log/messages exists."
fi
echo
echo "...done."
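Loops follow the same pattern as conditionals. Here is a minimal sketch of a for loop with an if/else inside it; the names are borrowed from the earlier names.txt example, and the counter is only there to show arithmetic.

```shell
count=0
for name in Bob Suzy Fred; do
    # [ ... ] compares strings with =, just like the file test above uses -f.
    if [ "$name" = "Suzy" ]; then
        echo "Found Suzy"
    else
        echo "Skipping $name"
    fi
    count=$((count + 1))     # $(( )) does integer arithmetic
done
echo "Looked at $count names"
```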
To run your very own script, you need to set its permissions to executable. This can be accomplished with
$ chmod +x my_script.sh
There are other ways to represent permissions that chmod understands as well. For example, you can set them numerically.
Then to run your script:
$ ./my_script.sh
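Putting those steps together, with the numeric form of chmod this time: 755 is octal for "owner can read, write and execute; everyone else can read and execute". The script contents here are a made-up one-liner, and the sketch assumes bash lives at /bin/bash.

```shell
cd "$(mktemp -d)"

# Write a tiny script: a shebang line followed by one command.
printf '#!/bin/bash\necho "hello from my_script"\n' > my_script.sh

# 7 = rwx for the owner, 5 = r-x for the group, 5 = r-x for everyone else.
chmod 755 my_script.sh

# The ./ prefix tells the shell to run the file in the current directory.
output=$(./my_script.sh)
echo "$output"
```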
See here:
http://thejh.net/misc/website-terminal-copy-paste
The "bad guys" can and will try to trick you into running code / exploits which you will be helpless to defend against since you told the system it was okay!
ruby -e "$(curl -fsSL https://raw.github.com/mxcl/homebrew/go)"
You're at risk when using an installation procedure such as the above, and you should get into the habit of typing things from the Internet into the terminal by hand for security.
Questions?