Learning Data Science
Lecture 7
Scientific Python
pandas
- fast
- powerful
- flexible
- easy to use
- open source
- data analysis tool
- data manipulation tool
Definition from the website

Everything we learned using pandas
- Series
- dtypes
- indexing
- Series operations
- Boolean Indexing
- DataFrames
- Inspecting DataFrames
- Reading/writing tabular data
- Indexing rows
- DataFrame boolean indexing
- Sorting DataFrames
- Stats of columns
- DataFrame Operations
- The magic apply() method
- Axes in DataFrames
- Aggregation
- Plotting from pandas
- Pandas and Seaborn
Anatomy of a Pandas DataFrame
CSV file
Open with pandas
pd.DataFrame
✨ Fancy Google Sheets or Excel but in Python
Take one column or take one row
pd.Series
A ✨ fancy numpy array, but with an index column
Example column (Trig): Lara 1.3, Jay 4.0, Susie 1.3
Example row (Jay): Trig 4.0, Alg 4.3, Geom 2.0, Calc 2.3
Index: Name
| Name | Trig | Alg | Geom | Calc |
|---|---|---|---|---|
| Lara | 1.3 | 1.3 | 3.7 | 2.3 |
| Jay | 4.0 | 4.3 | 2.0 | 2.3 |
| Susie | 1.3 | 1.0 | 2.0 | 3.0 |
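As a small sketch of the anatomy above, the snippet below loads the same grades table with pandas, then takes one column and one row. The CSV text is inlined through `io.StringIO` so the example runs without a file on disk; that choice and the variable names (`grades`, `trig`, `jay`) are assumptions made for illustration.

```python
import io

import pandas as pd

# The grades table from the slide, written out as CSV text.
csv_text = """Name,Trig,Alg,Geom,Calc
Lara,1.3,1.3,3.7,2.3
Jay,4.0,4.3,2.0,2.3
Susie,1.3,1.0,2.0,3.0
"""

# "Open with pandas": the Name column becomes the index of the DataFrame.
grades = pd.read_csv(io.StringIO(csv_text), index_col="Name")

# "Take one column": a pd.Series indexed by student name.
trig = grades["Trig"]

# "Take one row": a pd.Series indexed by course name.
jay = grades.loc["Jay"]

print(type(grades).__name__, type(trig).__name__, type(jay).__name__)
```

Both the column and the row come back as a `pd.Series`; only the index differs (student names for a column, course names for a row).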
Lecture 7
- Recap
- SQL Primer
- Using APIs
- Monte Carlo simulations
- Scientific Python
SQL and SQL Databases
SQL = Structured Query Language
What is it?
A programming language for managing data in a relational database.

Relational Databases
Tabular data stored in rows and columns, with multiple interconnected tables
grades
| Name | Trig | Alg | Geom | Calc |
|---|---|---|---|---|
| Lara | 1.3 | 1.3 | 3.7 | 2.3 |
| Jay | 4.0 | 4.3 | 2.0 | 2.3 |
| Susie | 1.3 | 1.0 | 2.0 | 3.0 |
students (indexed by the same student names as grades)
| Last | Age | Uni | Cats |
|---|---|---|---|
| Jones | 22 | TUM | 0 |
| Sun | 23 | LMU | 6 |
| Blue | 25 | LMU | 1 |
Here they have the same index, so we can take data from multiple tables
Relational Databases
grades
| Name | Trig | Alg | Geom | Calc |
|---|---|---|---|---|
| Lara | 1.3 | 1.3 | 3.7 | 2.3 |
| Jay | 4.0 | 4.3 | 2.0 | 2.3 |
| Susie | 1.3 | 1.0 | 2.0 | 3.0 |
students (indexed by the same student names as grades)
| Last | ID | Uni | Cats |
|---|---|---|---|
| Jones | 45 | TUM | 0 |
| Sun | 48 | LMU | 6 |
| Blue | 66 | LMU | 1 |
unis
| Uni | City | Students | Courses |
|---|---|---|---|
| LMU | Munich | [48, 66] | [Trig, Alg] |
| TUM | Munich | [45] | [Geom, Calc] |
| Ulm | Ulm | [] | [Trig, Alg, Calc] |
courses (indexed by course name: Geo, Alg, Trig)
| ID | Prof ID | Students |
|---|---|---|
| 1 | 44 | [45, 48, 66] |
| 2 | 154 | [45, 66] |
| 3 | 22 | [45, 48] |
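To make the idea of interconnected tables concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory database. It recreates only the students and unis tables from the slide and counts students per university with a join. The column names (`uni`, `city`, `last`, `cats`) and the choice to link the tables through the `uni` column are assumptions made for this example, not the schema used in the lecture notebook.

```python
import sqlite3

# An in-memory database: nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two small tables with the values from the slide.
conn.executescript(
    """
    CREATE TABLE unis (
        uni  TEXT PRIMARY KEY,
        city TEXT
    );
    CREATE TABLE students (
        id   INTEGER PRIMARY KEY,
        last TEXT,
        uni  TEXT REFERENCES unis(uni),
        cats INTEGER
    );
    INSERT INTO unis VALUES ('LMU', 'Munich'), ('TUM', 'Munich'), ('Ulm', 'Ulm');
    INSERT INTO students VALUES
        (45, 'Jones', 'TUM', 0),
        (48, 'Sun',   'LMU', 6),
        (66, 'Blue',  'LMU', 1);
    """
)

# LEFT JOIN keeps universities that have no students (Ulm) in the result.
query = """
    SELECT u.uni, u.city, COUNT(s.id) AS n_students
    FROM unis AS u
    LEFT JOIN students AS s ON s.uni = u.uni
    GROUP BY u.uni, u.city
    ORDER BY u.uni;
"""
rows = conn.execute(query).fetchall()
conn.close()

for uni, city, n_students in rows:
    print(uni, city, n_students)
```

The counts agree with the Students lists in the unis table: two students at LMU, one at TUM, none at Ulm.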
Relational Databases
SQL: For when you have lots of data which interconnects in complex ways!
SQL Databases run the world!
Many database systems use SQL
The main trio:
MySQL
- Web-dominant
- Huge ecosystem
SQLite
- Great for local use
- Built into Python
- No server
PostgreSQL
- Most large-scale applications
- Academics' fave 💕
Let's look at a real-life example

Movie rental store 🍿
Let's move into the notebook now:
[URL SOON]
Download the dataset:
https://www.kaggle.com/api/v1/datasets/download/atanaskanev/sqlite-sakila-sample-database
APIs
Application Programming Interface
Think of APIs like a waiter at a restaurant

“I can’t honestly recommend anything – I’ve watched them make the stuff.”
Roy Fox — May 1, 1954
Think of APIs like a waiter at a restaurant:
- You get a list of things you can do
- You request them from an API
- Then the information you want gets delivered back

Most APIs are accessed with URLs:
http://reddit.com/api/r/python/new
Let's go over to the notebook and try using some APIs ourselves!
Challenge #2
locations = {
    "Berlin": {
        "latitude": 52.52,
        "longitude": 13.41
    },
    "Paris": {
        "latitude": 48.85,
        "longitude": 2.35
    },
    "Rome": {
        "latitude": 41.90,
        "longitude": 12.48
    }
}
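Since most APIs are accessed with URLs, one way to start on a dictionary like this is to turn each location into a request URL. The sketch below only builds the URLs and makes no network call. `BASE_URL` is a hypothetical placeholder and `build_url` is an illustrative helper; the real endpoint and parameter names have to come from the documentation of whichever API the notebook uses.

```python
from urllib.parse import urlencode

# Hypothetical placeholder endpoint, NOT a real API.
BASE_URL = "https://api.example.com/v1/forecast"

locations = {
    "Berlin": {"latitude": 52.52, "longitude": 13.41},
    "Paris": {"latitude": 48.85, "longitude": 2.35},
    "Rome": {"latitude": 41.90, "longitude": 12.48},
}


def build_url(base_url, params):
    """Append the parameters to the base URL as a query string."""
    return f"{base_url}?{urlencode(params)}"


# One URL per city, using that city's coordinates as query parameters.
urls = {city: build_url(BASE_URL, coords) for city, coords in locations.items()}

for city, url in urls.items():
    print(city, url)
```

Each URL could then be passed to an HTTP client to fetch the response, typically JSON, for that city.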
Monte Carlo (MC) Methods
Started at Los Alamos as a way to model neutron diffusion
Needed a code name for this method 🕵️

Monte Carlo (MC) Methods
Use random sampling to estimate a very complicated probability distribution
Simple example:
- We are playing a game with two dice
- To win, you need to roll two 6's on the first try
- What are the chances of winning?

Let's code it up!
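A minimal sketch of the two-dice game, assuming two fair six-sided dice: play the game many times with random rolls and count the fraction of wins. The function name `estimate_double_six`, the number of games, and the fixed seed are choices made for this example. The exact answer is 1/6 × 1/6 = 1/36, so the estimate can be checked against it.

```python
import random


def estimate_double_six(n_games, seed=0):
    """Estimate the chance of rolling two 6's by simulating n_games games."""
    rng = random.Random(seed)  # seeded so the run is reproducible
    wins = 0
    for _ in range(n_games):
        first = rng.randint(1, 6)   # randint includes both end points
        second = rng.randint(1, 6)
        if first == 6 and second == 6:
            wins += 1
    return wins / n_games


estimate = estimate_double_six(200_000)
exact = 1 / 36

print(f"Monte Carlo estimate: {estimate:.4f}")
print(f"Exact probability:    {exact:.4f}")
```

The estimate fluctuates around the exact value, and the fluctuation shrinks as the number of simulated games grows.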
Monte Carlo (MC) Methods
That was easy to calculate by hand... but what about this?
- You pick three random dice, each with a number of sides between 2 and 20
- To win, you need to roll at least 30 in total
- What are your chances of winning?

Let's code it up!
Challenge #2
That was easy to calculate by hand... but what about this?
- You pick three random dice, each with a number of sides between 2 and 20
- To win, you need to roll at least 30 in total
- What are your chances of winning?

You can use a for loop if it helps conceptually.
For an extra challenge, try to do it only with vectors 💪
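One possible vectorized approach to the three-dice game, offered as a sketch rather than the reference solution. It assumes that each die's number of sides is drawn uniformly from 2 to 20 inclusive, that the three dice are drawn independently, and that "at least 30" refers to the sum of the three rolls. The variable names and the number of simulated games are illustrative.

```python
import numpy as np

rng = np.random.default_rng(seed=0)  # seeded so the run is reproducible
n_games = 200_000

# Number of sides for each of the 3 dice in every game (upper bound is exclusive).
sides = rng.integers(2, 21, size=(n_games, 3))

# Roll every die once: an integer from 1 up to that die's own number of sides.
# Passing an array as the upper bound rolls all dice in a single call.
rolls = rng.integers(1, sides + 1)

# A game is won when the three rolls sum to at least 30.
wins = rolls.sum(axis=1) >= 30
win_probability = float(wins.mean())

print(f"Estimated chance of winning: {win_probability:.4f}")
```

The same logic can be written with a for loop over games; the array version simply plays all games at once.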
MC in the real world
Imagine you want to determine the habitability or other properties of star systems. This depends on many things:
- the type of star
- the age of the star
- the number of planets in the system
- the mass of the planets
- the orbits of each planet
- etc
Given the observations we have of stars and exoplanets, we could reasonably estimate the distributions of each of these parameters
MC in the real world
Imagine you want to determine the habitability or other properties of star systems. This depends on many things:
- Many different attributes
Distributions of attributes from observations
Draw initial parameters and simulate many examples with MC

scipy
Algorithms for:
- Optimization
- Integration
- Interpolation
- Linear Algebra
- Differential Equations
- and more
scipy
Most algorithms are written in:
- Fortran
- C
- C++
and then "wrapped" in Python.
install scipy
uv add scipy
Things we will try today:
- scipy.constants
- scipy.stats
- scipy.integrate
- scipy.interpolate
- scipy.optimize
- scipy.fft
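As a quick taste of three of these submodules, the sketch below reads a physical constant, integrates sin(x) from 0 to π (exact value 2), and minimizes a simple parabola. The chosen functions and variable names are illustrative and are not taken from the lecture notebook.

```python
import numpy as np
from scipy import constants, integrate, optimize

# scipy.constants: physical constants in SI units.
speed_of_light = constants.c  # metres per second

# scipy.integrate: numerical integration of sin(x) from 0 to pi.
# quad returns the integral and an estimate of the absolute error.
area, abs_error = integrate.quad(np.sin, 0.0, np.pi)

# scipy.optimize: find the minimum of (x - 2)^2, which lies at x = 2.
result = optimize.minimize_scalar(lambda x: (x - 2.0) ** 2)

print(f"c = {speed_of_light} m/s")
print(f"integral of sin from 0 to pi = {area:.6f}")
print(f"minimum found at x = {result.x:.6f}")
```

Each call hands the numerical work to compiled routines and returns ordinary Python and NumPy values.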
Back to the notebook we go!
The End
Learning Data Science Lecture 7
By astrojarred
