Statistics studies "real" differences and relationships

📈 Sample Data & Inference

  • Use of a random sample (e.g., heights in Ireland) to infer population trends
  • Core Question: Are observed differences real or due to chance?
  • Key Quote: “We can’t measure the height and weight of the entire population...”
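
The idea of inferring a population trend from a random sample can be sketched in a few lines. This is only an illustration: the "population" below is simulated, and the mean of 170 cm, the spread of 10 cm, and the sample size of 400 are assumptions made up for the example, not figures from the slides.

```python
import random
import statistics

# Assumption: a simulated "population" of heights in cm, invented for illustration.
rng = random.Random(0)  # fixed seed so the run is repeatable
population = [rng.gauss(170.0, 10.0) for _ in range(10_000)]

# Draw a simple random sample without replacement.
sample = rng.sample(population, 400)

population_mean = statistics.mean(population)
sample_mean = statistics.mean(sample)

print(f"population mean: {population_mean:.1f} cm")
print(f"sample mean:     {sample_mean:.1f} cm")
```

The sample mean will usually land close to the population mean without matching it exactly, which is why the core question about chance arises.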

🔢 Types of Variables

  • Categorical: Divides data into groups (gender, age group)
  • Numeric: Represents measurable quantities (height, weight)
  • Guides choice of appropriate statistical methods

📊 Summarizing & Visualizing Data

  • Transforms raw data into meaningful insights
  • Categorical Data: Counts, bar charts
  • Numeric Data: Range, median, mean, box plots, histograms
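
As a minimal sketch of these summaries, using only Python's standard library: the `genders` and `heights_cm` lists are hypothetical values invented for the example.

```python
from collections import Counter
import statistics

# Hypothetical sample, invented for illustration only.
genders = ["F", "M", "F", "F", "M", "M", "F", "M"]
heights_cm = [162.0, 178.0, 165.0, 170.0, 181.0, 175.0, 158.0, 183.0]

# Categorical data: count how often each category occurs.
gender_counts = Counter(genders)

# Numeric data: range, median, and mean.
height_range = max(heights_cm) - min(heights_cm)
height_median = statistics.median(heights_cm)
height_mean = statistics.mean(heights_cm)

print(dict(gender_counts))
print(height_range, height_median, height_mean)
```

The counts are what a bar chart would display, and the median and range are among the values a box plot draws.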

🔗 Analyzing Combinations of Variables

  • Categorical + Numeric: Compare group means (e.g., men vs. women)
  • Two Numeric: Look for correlation (e.g., height vs. weight)
  • Two Categorical: Examine proportions across groups (e.g., gender vs. age group)
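
Two of these combinations can be sketched as follows: group means for a categorical plus a numeric variable, and within-group proportions for two categorical variables. The `records` list and its age-group labels are hypothetical, chosen only to make the example run.

```python
from collections import Counter, defaultdict
import statistics

# Hypothetical records: (gender, age_group, height_cm), invented for illustration.
records = [
    ("F", "18-39", 162.0), ("M", "18-39", 178.0), ("F", "40+", 165.0),
    ("F", "18-39", 170.0), ("M", "40+", 181.0), ("M", "18-39", 175.0),
    ("F", "40+", 158.0), ("M", "40+", 183.0),
]

# Categorical + numeric: mean height within each gender.
heights_by_gender = defaultdict(list)
for gender, _, height in records:
    heights_by_gender[gender].append(height)
group_means = {g: statistics.mean(h) for g, h in heights_by_gender.items()}

# Two categorical: share of each age group within each gender.
crosstab = Counter((gender, age) for gender, age, _ in records)
gender_totals = Counter(gender for gender, _, _ in records)
proportions = {(g, a): n / gender_totals[g] for (g, a), n in crosstab.items()}

print(group_means)
print(proportions)
```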

🔬 Statistical Tests & P-values

  • Use statistical tests to determine if sample observations likely represent true population trends
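  • P-value: Probability of observing results at least as extreme as those in the sample if no effect exists (i.e., if the null hypothesis is true)
  • Alpha Value (commonly 0.05): Decision threshold; reject the null hypothesis if the p-value is below alpha, otherwise fail to reject it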
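
To make the p-value and alpha decision concrete, here is a sketch that uses an exact binomial test on a single proportion. The scenario (16 of 20 sampled people falling in one category, against a hypothesized proportion of 0.5) and the helper name `binomial_two_sided_p` are assumptions for illustration; the slides do not prescribe this particular calculation.

```python
from math import comb

def binomial_two_sided_p(successes, trials, null_p=0.5):
    """Two-sided exact binomial p-value (hypothetical helper).

    Sums the probability of every outcome that is no more likely
    under the null hypothesis than the observed one.
    """
    def pmf(k):
        return comb(trials, k) * null_p**k * (1 - null_p) ** (trials - k)

    observed = pmf(successes)
    tolerance = 1e-12  # guards against floating-point ties
    total = sum(
        pmf(k) for k in range(trials + 1)
        if pmf(k) <= observed * (1 + tolerance)
    )
    return min(1.0, total)

alpha = 0.05                            # decision threshold
p_value = binomial_two_sided_p(16, 20)  # 16 of 20 vs. a hypothesized 0.5
reject_null = p_value < alpha

print(f"p-value = {p_value:.4f}, reject null: {reject_null}")
```

Here the p-value is roughly 0.012, which is below 0.05, so the null hypothesis of a 0.5 proportion would be rejected at that alpha.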

🚀 Common Statistical Tests

  • One Sample Proportion Test: Checks if a single categorical proportion differs from a known or hypothesized value
  • Chi-square Test: Examines relationships between two categorical variables (e.g., gender vs. age group)
  • T-test: Compares mean(s) of numeric data; can be for one sample mean vs. a known value or two groups (e.g., men vs. women)
  • ANOVA (Analysis of Variance): Compares means across more than two groups/categories
  • Correlation Test: Assesses the strength and direction of a relationship between two numeric variables
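
As one worked instance from this list, a chi-square test of independence on a 2x2 table can be sketched with the standard library alone. The counts are hypothetical, `chi_square_2x2` is an illustrative helper rather than a library function, and no continuity correction is applied. The p-value relies on the fact that a chi-square variable with one degree of freedom is the square of a standard normal, so its upper tail can be written with `math.erfc`.

```python
import math

def chi_square_2x2(table):
    """Pearson chi-square test for a 2x2 table (hypothetical helper, df = 1)."""
    (a, b), (c, d) = table
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    total = a + b + c + d

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected

    # Upper-tail probability of a chi-square with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(statistic / 2.0))
    return statistic, p_value

# Hypothetical counts: rows = gender, columns = age group.
observed_counts = [[30, 20], [20, 30]]
chi2, p_value = chi_square_2x2(observed_counts)
print(f"chi-square = {chi2:.2f}, p-value = {p_value:.4f}")
```

For larger tables or for t-tests and ANOVA, the reference distributions are not available in the standard library, so a dedicated statistics package would normally be used instead.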

💡 Formulating Hypotheses

  • Null Hypothesis (H₀): No effect or relationship
  • Alternative Hypothesis (H₁): Effect or relationship present
  • Avoid “data mining” by defining research questions in advance

🌀 Correlation Coefficient

  • Ranges from -1 to +1
  • 0 indicates no linear relationship; ±1 indicates perfect linearity
  • Quantifies how two numeric variables move together
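
The coefficient described here is commonly computed as Pearson's r. A minimal sketch follows, assuming that is the coefficient meant; `pearson_r` is an illustrative helper and the height and weight values are invented.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariation / math.sqrt(spread_x * spread_y)

# Hypothetical heights (cm) and weights (kg), invented for illustration.
heights_cm = [158.0, 162.0, 165.0, 170.0, 175.0, 178.0, 181.0, 183.0]
weights_kg = [55.0, 60.0, 58.0, 68.0, 72.0, 80.0, 79.0, 85.0]

r = pearson_r(heights_cm, weights_kg)
print(f"r = {r:.3f}")
```

A perfectly linear increasing pair such as `[1, 2, 3]` and `[2, 4, 6]` gives r = 1, and reversing one of the sequences gives r = -1.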

✅ Conclusion

  • Statistics as a practical tool for exploring data and making inferences
  • Key Takeaway: Understand variables, pose clear hypotheses, use proper tests
  • “It’s not good science to...randomly stab around...hoping to find something statistically significant.”