digital humanities technology specialist @ nyu it & libraries
a shell command has the form: command flag(s) argument(s)
e.g., ls -F Desktop
this is like a sentence: verb adverb(s) object(s)
- the command has to come first
- for now, write one command per line
- a command can have zero or more flags
- flags are preceded by one or two hyphens (e.g., -flag or --flag)
- a command can have zero or more arguments
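the rules above can be sketched in a few lines of python. this is only an illustration of the command / flag / argument pattern: split_command is a hypothetical helper made up for this example, and a real shell parses lines in a more involved way.

```python
# hypothetical helper, for illustration only (not how a real shell parses input)
def split_command(line):
    words = line.split()
    command = words[0]  # the command has to come first
    # flags are the words that start with one or two hyphens
    flags = [word for word in words[1:] if word.startswith("-")]
    # everything else is an argument
    arguments = [word for word in words[1:] if not word.startswith("-")]
    return command, flags, arguments

command, flags, arguments = split_command("ls -F Desktop")
print(command)    # ls
print(flags)      # ['-F']
print(arguments)  # ['Desktop']
```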
- variables are names for values
- in python the "=" symbol assigns the value on the right to the name on the left, e.g., my_variable = "my variable's value"
- the variable is created when a value is assigned to it
- variable names
- can only contain letters, digits, and underscores
- cannot start with a digit
- should be descriptive
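a short sketch of assignment and naming. the variable names and values here are made up for illustration.

```python
# assigning a value creates the variable:
# the name on the left now refers to the value on the right
workshop_title = "computing fundamentals with python"
attendee_count = 12

# assigning again gives the same name a new value
attendee_count = attendee_count + 1

print(workshop_title)
print(attendee_count)  # 13

# names that break the rules are syntax errors, so they are commented out here:
# 2nd_session = "tuesday"   # cannot start with a digit
# my-variable = 5           # hyphens are not allowed
```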
- every value has a type, e.g.:
- int: an integer, e.g., 52 or -52
- string (str in python): a sequence of characters; specified within single or double quotes, e.g., 'i am a string' or "my_string"
- float: a floating-point number; specified with a decimal point, e.g., 52.001 or -52.001
... plus more we'll get to later
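the built-in type() function reports the type of any value, which is a quick way to check these. a small sketch, with variable names chosen for illustration:

```python
my_int = 52
my_string = "my_string"
my_float = 52.001

print(type(my_int))     # <class 'int'>
print(type(my_string))  # <class 'str'>
print(type(my_float))   # <class 'float'>

# the type decides what an operation does:
print(52 + 1)      # 53   (adding two integers)
print("52" + "1")  # 521  (joining two strings)
```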
- functions are predefined actions (a function that belongs to an object is called a method)
- like variables, each function has a name
- unlike variables, functions can take arguments (the values you pass in for the function's parameters)
- you can call a function by invoking it by name and passing it zero or more arguments in parentheses, e.g., len('my_string')
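a few calls to python's built-in functions, as a sketch of passing zero, one, or two arguments. the variable name is made up for illustration.

```python
# one argument
name_length = len('my_string')
print(name_length)  # 9

# two arguments: the number to round, and how many decimal places to keep
rounded = round(52.001, 1)
print(rounded)  # 52.0

# zero arguments: print() on its own prints an empty line
print()

# a method is called with a dot, on the value it belongs to
print('my_string'.upper())  # MY_STRING
```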
- python libraries are self-contained groups of python modules that provide additional functionality to the main python language.
- many libraries are open source, meaning you can freely use, look at, and repurpose them.
- our anaconda environment comes with many python libraries already installed; importing a library into a program lets us use its classes, functions, and variables in our notebooks
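a minimal sketch of importing. it uses math and statistics from python's standard library because they need no installation; third-party libraries are imported with the same syntax once they are installed.

```python
# import a whole library, then reach into it with a dot
import math

# or import just one name from a library
from statistics import mean

print(math.sqrt(16))       # a function from the library: 4.0
print(math.pi)             # a variable from the library
print(mean([1, 2, 3, 4]))  # 2.5
```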
- tabular data is information in table form
- proprietary programs like excel and numbers create table-like data, but they frequently include styling and extra metadata that can confuse and bog down programs
- it is best to use open, plaintext formats for your data, including tabular data.
- the best examples are CSV and TSV files, which stand for Comma-Separated Values and Tab-Separated Values, respectively.
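because CSV and TSV are plain text, a few lines of python can read them. a sketch using the standard library's csv module; the table contents are made up, and io.StringIO stands in for a file on disk.

```python
import csv
import io

# a tiny made-up table, written as CSV text
csv_text = "name,count\napples,3\npears,5\n"

# DictReader uses the first line as column names
rows = list(csv.DictReader(io.StringIO(csv_text)))
print(rows[0]["name"])  # apples
print(len(rows))        # 2

# the same table as TSV: only the separator changes
tsv_text = csv_text.replace(",", "\t")
tsv_rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
print(tsv_rows == rows)  # True
```

note that every value comes back as a string ("3", not 3), since plaintext formats do not record types.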
- use DataFrame.iloc[..., ...] to select data by integer position (row first, then column); note the square brackets, not parentheses
- use DataFrame.loc[..., ...] to select data by (row/column) label
- when in doubt, use help(data.loc) and help(data.iloc) to check
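a small sketch of both selectors, assuming pandas is installed (it ships with anaconda). the data frame is made up for illustration.

```python
import pandas as pd

# a made-up data frame with row labels a, b, c and column labels x, y
data = pd.DataFrame(
    {"x": [1, 2, 3], "y": [10, 20, 30]},
    index=["a", "b", "c"],
)

# by position: row 0, column 1
print(data.iloc[0, 1])     # 10

# by label: row "a", column "y" (the same cell)
print(data.loc["a", "y"])  # 10

# position slices leave out the end: rows 0 and 1
print(data.iloc[0:2, 0])

# label slices include the end: rows "a" and "b"
print(data.loc["a":"b", "x"])
```

the last two selections return the same two values, which shows the one difference that most often trips people up: iloc slices stop before the end position, while loc slices include the end label.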
selecting data in data frames
Columbia University: Computing Fundamentals with Python