Statistics studies "real" differences and relationships
📈 Sample Data & Inference
- Use of a random sample (e.g., heights in Ireland) to infer population trends
- Core Question: Are observed differences real or due to chance?
- Key Quote: “We can’t measure the height and weight of the entire population...”
🔢 Types of Variables
- Categorical: Divides data into groups (gender, age group)
- Numeric: Represents measurable quantities (height, weight)
- Guides choice of appropriate statistical methods
📊 Summarizing & Visualizing Data
- Transforms raw data into meaningful insights
- Categorical Data: Counts, bar charts
- Numeric Data: Range, median, mean, box plots, histograms
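The summaries listed above can be sketched with Python's standard library. The sample values below are invented purely for illustration and do not come from any real survey; the variable names are likewise assumptions.

```python
from collections import Counter
import statistics

# Hypothetical sample, invented for illustration only
genders = ["F", "M", "F", "F", "M", "M", "F", "M"]
heights_cm = [162, 178, 165, 170, 181, 175, 158, 183]

# Categorical data: count the observations in each group
gender_counts = Counter(genders)

# Numeric data: range, median, and mean
height_range = max(heights_cm) - min(heights_cm)
height_median = statistics.median(heights_cm)
height_mean = statistics.mean(heights_cm)

print("Counts:", dict(gender_counts))
print("Range:", height_range)
print("Median:", height_median)
print("Mean:", height_mean)
```

The counts are what a bar chart would display, while the range, median, and mean are the numbers a box plot or histogram helps you see at a glance.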
🔗 Analyzing Combinations of Variables
- Categorical + Numeric: Compare group means (e.g., men vs. women)
- Two Numeric: Look for correlation (e.g., height vs. weight)
- Two Categorical: Examine proportions across groups (e.g., gender vs. age group)
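As a rough sketch of two of these combinations, the snippet below compares mean height by gender (categorical + numeric) and the share of each age group within each gender (two categorical). The records and the age-group labels are made up for illustration.

```python
from collections import defaultdict
import statistics

# Hypothetical records (gender, age group, height in cm), invented for illustration
records = [
    ("F", "18-39", 162), ("M", "18-39", 178), ("F", "40+", 165),
    ("F", "18-39", 170), ("M", "40+", 181), ("M", "18-39", 175),
    ("F", "40+", 158), ("M", "40+", 183),
]

# Categorical + numeric: mean height within each gender
heights_by_gender = defaultdict(list)
for gender, _, height in records:
    heights_by_gender[gender].append(height)
mean_height = {g: statistics.mean(h) for g, h in heights_by_gender.items()}

# Two categorical: proportion of each age group within each gender
counts = defaultdict(lambda: defaultdict(int))
for gender, age_group, _ in records:
    counts[gender][age_group] += 1
proportions = {
    g: {a: n / sum(ages.values()) for a, n in ages.items()}
    for g, ages in counts.items()
}

print(mean_height)
print(proportions)
```

These are descriptive comparisons only. Whether a difference in means or proportions reflects the population is a question for a statistical test.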
🔬 Statistical Tests & P-values
- Use statistical tests to determine if sample observations likely represent true population trends
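- P-value: Probability of observing results at least as extreme as those in the sample if no effect exists (i.e., if the null hypothesis is true)
- Alpha Value (commonly 0.05): Decision threshold chosen in advance; reject the null hypothesis if the p-value is below alpha, otherwise fail to reject it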
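One way to make the p-value concrete is an exact permutation test, a technique not named in these notes and used here only as an illustration. If group labels do not matter (the null hypothesis), every way of splitting the pooled data into two groups is equally likely, so the p-value is the share of splits whose difference in means is at least as large as the observed one. The two groups below are invented data.

```python
from itertools import combinations
import statistics

# Hypothetical heights in cm for two groups, invented for illustration
group_a = [158, 162, 165, 170]
group_b = [175, 178, 181, 183]
alpha = 0.05  # decision threshold chosen before looking at the data

# Observed absolute difference in group means
observed = abs(statistics.mean(group_a) - statistics.mean(group_b))

pooled = group_a + group_b
n_a = len(group_a)
extreme = 0  # splits at least as extreme as the observed one
total = 0    # all possible splits

# Enumerate every way to assign n_a of the pooled values to the first group
for idx in combinations(range(len(pooled)), n_a):
    chosen = set(idx)
    a = [pooled[i] for i in idx]
    b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
    diff = abs(statistics.mean(a) - statistics.mean(b))
    if diff >= observed:
        extreme += 1
    total += 1

p_value = extreme / total
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Full enumeration is only practical for small samples; larger ones would typically rely on random shuffles or on a standard test such as the t-test.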
🚀 Common Statistical Tests
- One Sample Proportion Test: Checks if a single categorical proportion differs from a known or hypothesized value
- Chi-square Test: Examines relationships between two categorical variables (e.g., gender vs. age group)
- T-test: Compares means of numeric data, either one sample mean vs. a known or hypothesized value, or the means of two groups (e.g., men vs. women)
- ANOVA (Analysis of Variance): Compares means across more than two groups/categories
- Correlation Test: Assesses the strength and direction of a relationship between two numeric variables
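To show what one of these tests computes, here is a minimal hand-rolled sketch of Pearson's chi-square test for a 2×2 table, without a continuity correction. The counts are hypothetical, and the p-value shortcut via `math.erfc` is valid only for one degree of freedom (a 2×2 table); a statistics library would normally handle the general case.

```python
import math

# Hypothetical 2x2 table of counts, invented for illustration:
# rows = gender, columns = age group
observed = [
    [30, 20],
    [20, 30],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

# Chi-square statistic: sum of (observed - expected)^2 / expected,
# where expected counts assume the two variables are independent
chi2 = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        expected = row_totals[i] * col_totals[j] / n
        chi2 += (obs - expected) ** 2 / expected

# With 1 degree of freedom, P(chi-square > x) = erfc(sqrt(x / 2))
p_value = math.erfc(math.sqrt(chi2 / 2))

print(f"chi2 = {chi2:.3f}, p-value = {p_value:.4f}")
```

The other tests in the list follow the same pattern: compute a test statistic from the sample, then convert it to a p-value using the distribution expected under the null hypothesis.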
💡 Formulating Hypotheses
- Null Hypothesis (H₀): No effect or relationship
- Alternative Hypothesis (H₁): Effect or relationship present
- Avoid “data mining” by defining research questions in advance
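As an illustration of stating hypotheses before analyzing data, the sketch below writes down H₀ and H₁ for a one-sample proportion test and then evaluates them with a normal approximation. The hypothesized proportion, the counts, and the choice of a two-sided z-test are all assumptions made for the example.

```python
import math
from statistics import NormalDist

# Hypotheses, fixed before collecting data (illustrative values):
#   H0: the population proportion equals p0
#   H1: the population proportion differs from p0
p0 = 0.5
alpha = 0.05

# Hypothetical sample, invented for illustration
n = 100
successes = 62
p_hat = successes / n

# z statistic under H0, using the normal approximation to the binomial
standard_error = math.sqrt(p0 * (1 - p0) / n)
z = (p_hat - p0) / standard_error

# Two-sided p-value: probability of a |z| at least this large under H0
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject_null = p_value < alpha

print(f"z = {z:.2f}, p-value = {p_value:.4f}, reject H0: {reject_null}")
```

Because `p0` and `alpha` are set before the data are examined, the test answers one pre-specified question rather than searching for whatever happens to look significant.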
🌀 Correlation Coefficient
- Ranges from -1 to +1
- 0 indicates no linear relationship; ±1 indicates perfect linearity
- Quantifies how two numeric variables move together
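The coefficient described above is usually Pearson's r. A minimal sketch of the calculation follows; the function name `pearson_r` and the tiny data sets are chosen here for illustration.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (covariance numerator)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (variance numerators)
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)

# Perfect positive and negative linear relationships
r_up = pearson_r([1, 2, 3], [2, 4, 6])
r_down = pearson_r([1, 2, 3], [6, 4, 2])

# A clear but non-linear (U-shaped) relationship: r is 0
r_curve = pearson_r([-1, 0, 1], [1, 0, 1])

print(r_up, r_down, r_curve)
```

The U-shaped case shows why 0 means "no linear relationship" rather than "no relationship": the variables are clearly related, but not along a straight line.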
✅ Conclusion
- Statistics as a practical tool for exploring data and making inferences
- Key Takeaway: Understand variables, pose clear hypotheses, use proper tests
- “It’s not good science to...randomly stab around...hoping to find something statistically significant.”
Statistics is about "real" differences and relationships
By Carlos Mendez