columbia university

computing fundamentals w/ python

 

👑 💻 🐍

command flag(s) argument(s)

e.g., ls -F Desktop

is like verb adverb(s) object(s)

command syntax

  • the command has to come first
  • each line normally holds a single command
  • a command can have zero or more flags
  • flags are preceded by one or two hyphens (e.g., -flag or --flag)
  • a command can have zero or more arguments
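
the rules above can be sketched in python. this is only an illustration: split_command is a hypothetical helper, and it assumes that every token starting with a hyphen is a flag, which is simpler than what a real shell does.

```python
import shlex

def split_command(line):
    """Split a command line into its command, flags, and arguments."""
    tokens = shlex.split(line)  # 'ls -F Desktop' becomes ['ls', '-F', 'Desktop']
    command = tokens[0]         # the command has to come first
    # assumption: anything that starts with a hyphen is treated as a flag
    flags = [t for t in tokens[1:] if t.startswith("-")]
    arguments = [t for t in tokens[1:] if not t.startswith("-")]
    return command, flags, arguments

command, flags, arguments = split_command("ls -F Desktop")
print(command)    # ls
print(flags)      # ['-F']
print(arguments)  # ['Desktop']
```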

commands

  • variables are names for values
  • in python the "=" symbol assigns the value on the right to the name on the left, e.g., my_variable = "my variable's value"
  • the variable is created when a value is assigned to it
  • variable names
    • can only contain letters, digits, and underscores
    • cannot start with a digit
    • should be descriptive
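
a short sketch of assignment and naming. student_count is a made-up name for illustration, and str.isidentifier() is used here only as a quick check of whether a name is allowed.

```python
# assignment: the value on the right is given the name on the left
my_variable = "my variable's value"
student_count = 52

# a variable can be reassigned; the name now refers to the new value
student_count = student_count + 1
print(student_count)  # 53

# names may contain letters, digits, and underscores, but cannot start with a digit
print("student_count".isidentifier())  # True
print("2nd_place".isidentifier())      # False
```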

variables

  • int: an integer, e.g., 52 or -52
  • str (string): a sequence of characters; specified within single or double quotes, e.g., 'i am a string' or "my_string"
  • float: a floating-point number; specified with a decimal point, e.g., 52.001 or -52.001

 

... plus more we'll get to later
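
when unsure which type a value has, the built-in type() function reports it. the variable names below are made up for illustration.

```python
whole_number = 52
text = 'i am a string'
decimal_number = 52.001

print(type(whole_number))    # <class 'int'>
print(type(text))            # <class 'str'>
print(type(decimal_number))  # <class 'float'>

# quotes make a string, even when the characters look like a number
print(type("52"))            # <class 'str'>
```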

data types

  • functions are named actions; python comes with many predefined (built-in) functions, and functions that belong to an object are called methods
  • like variables, each function has a name
  • unlike variables, functions can take arguments (the values passed in to fill the function's parameters)
  • you call a function by writing its name followed by zero or more arguments in parentheses, e.g., len('my_string')
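
a few calls to built-in functions, as a minimal sketch. the variable names are made up for illustration.

```python
# one argument: len() returns the number of characters in a string
name_length = len('my_string')
print(name_length)  # 9

# two arguments: round() rounds a number to the given number of decimal places
rounded = round(52.001, 1)
print(rounded)      # 52.0

# zero arguments: print() with nothing in the parentheses prints an empty line
print()

# a method is a function that belongs to a value, called with a dot
shouted = 'my_string'.upper()
print(shouted)      # MY_STRING
```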

functions

  • python libraries are self-contained groups of python modules that provide additional functionality to the main python language.
  • many libraries are open source, meaning you can freely use, look at, and repurpose them. 
  • our anaconda environment comes with many python libraries already installed; the import statement brings a library into a program so that we can use its classes, functions, and variables in our notebooks
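
importing looks the same for any library. this sketch uses two modules from python's standard library (math and statistics) as stand-ins, on the assumption that third-party libraries are imported with the same statement once installed.

```python
import math                 # import a library under its own name
import statistics as stats  # import a library under a shorter alias

# use a function from the library: library_name.function_name(...)
print(math.sqrt(16))             # 4.0

# use a variable from the library
print(math.pi)

# the alias works the same way
print(stats.mean([1, 2, 3, 4]))  # 2.5
```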

libraries

  • tabular data is information in table form
  • proprietary programs like excel and numbers create table-like data, but they frequently include styling and extra metadata that can confuse and bog down programs
  • it is best to use open, plaintext formats for your data, including tabular data.
  • the best examples are CSV and TSV files, which stand for Comma-Separated Values and Tab-Separated Values, respectively.
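
a minimal sketch of what CSV and TSV data look like and how python's built-in csv module reads them. the names and ages are made-up sample data, and io.StringIO stands in for a real file on disk.

```python
import csv
import io

# made-up sample data: the first line holds the column names
csv_text = "name,age\nada,36\ngrace,45\n"
tsv_text = "name\tage\nada\t36\ngrace\t45\n"

# each row becomes a dictionary keyed by column name
csv_rows = list(csv.DictReader(io.StringIO(csv_text)))
print(csv_rows[0]["name"])  # ada
print(csv_rows[1]["age"])   # 45 (read as a string, not an int)

# a TSV file is read the same way, with a tab as the delimiter
tsv_rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))
print(tsv_rows == csv_rows)  # True
```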

tabular data

  • use DataFrame.iloc[...] to select data by integer position (note the square brackets)
  • use DataFrame.loc[...] to select data by (row/column) label
  • when in doubt, use help(data.loc) and help(data.iloc) to check
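
a small sketch of both selectors, assuming the pandas library is installed. the data frame holds made-up sample values. note the square brackets: loc and iloc are indexers, not functions called with parentheses.

```python
import pandas as pd

# made-up sample data: rows are labeled by name, columns by attribute
data = pd.DataFrame(
    {"age": [36, 45], "city": ["london", "new york"]},
    index=["ada", "grace"],
)

# iloc selects by position: row 0, column 0
print(data.iloc[0, 0])            # 36

# loc selects by label: row "grace", column "city"
print(data.loc["grace", "city"])  # new york

# a colon selects everything along that axis, here every row of one column
ages = data.loc[:, "age"]
print(list(ages))                 # [36, 45]
```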

selecting data in data frames
