Karl Ho
Data Generation datageneration.io
School of Economic, Political and Policy Sciences
University of Texas at Dallas
Implications for visualization: Every graph carries assumptions from sampling, measurement, and coding decisions.
Example: census data vs. Twitter API data. Each is shaped by institutional and technical design.
“Data do not exist independently of the ideas, instruments, practices, contexts, and knowledge used to generate them” (Kitchin 2014, The Data Revolution).
Visualization = Mapping data → aesthetic attributes → perceptual objects.
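The mapping chain above can be sketched in a few lines of plain Python. This is a minimal illustration, not a plotting library: the records, the variable names (`income`, `life_exp`, `region`), the pixel ranges, and the color palette are all hypothetical and chosen only to make the three steps visible.

```python
def linear_scale(domain, out_range):
    """Return a function that maps a data value onto a visual range."""
    (d0, d1), (r0, r1) = domain, out_range
    return lambda v: r0 + (v - d0) / (d1 - d0) * (r1 - r0)

# Hypothetical records, invented for illustration only.
records = [
    {"income": 1000, "life_exp": 55, "region": "north"},
    {"income": 5000, "life_exp": 70, "region": "south"},
    {"income": 9000, "life_exp": 80, "region": "north"},
]

# Step 1: data -> aesthetic attributes (which variable drives which channel).
aesthetics = {"x": "income", "y": "life_exp", "fill": "region"}

# Step 2: scales translate data values into visual values (assumed ranges).
x_scale = linear_scale((0, 10000), (0, 400))  # income -> horizontal pixels
y_scale = linear_scale((50, 90), (300, 0))    # life_exp -> vertical pixels (screen y grows downward)
palette = {"north": "steelblue", "south": "darkorange"}

# Step 3: aesthetic values -> perceptual objects (here, point marks).
marks = [
    {
        "shape": "point",
        "x": x_scale(r[aesthetics["x"]]),
        "y": y_scale(r[aesthetics["y"]]),
        "fill": palette[r[aesthetics["fill"]]],
    }
    for r in records
]

for m in marks:
    print(m)
```

Changing only the `aesthetics` dictionary (for example, swapping `x` and `y`) produces a different graphic from the same data, which is the point of treating the mapping as an explicit design decision.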
“A statistical graphic is a visual display that shows quantitative and categorical information” (Wilkinson 2005, p. 22).
Gary King et al. (2000): “Statistical analyses do not interpret themselves; interpretation is the substantive act of turning data into knowledge.”
Visualization can mislead if statistical uncertainty, distributions, or context are ignored.
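A small sketch of why ignoring uncertainty can mislead: two groups with identical means look the same in a bar chart of means, yet their intervals differ widely. The data are made up, and the interval uses a normal approximation (mean ± 1.96 standard errors), which is an assumption and is rough for samples this small.

```python
import math
import statistics

def mean_ci(values, z=1.96):
    """Return (mean, lower, upper) using a normal-approximation interval."""
    m = statistics.mean(values)
    se = statistics.stdev(values) / math.sqrt(len(values))  # standard error
    return m, m - z * se, m + z * se

# Hypothetical measurements, invented for illustration only.
group_a = [48, 52, 50, 49, 51]   # tightly clustered
group_b = [20, 80, 50, 35, 65]   # widely spread

for name, values in (("A", group_a), ("B", group_b)):
    m, lo, hi = mean_ci(values)
    print(f"group {name}: mean={m:.1f}, interval=({lo:.1f}, {hi:.1f})")
```

A chart that plots only the two means would suggest the groups are interchangeable; adding the intervals (or the raw points) shows how much less is known about group B.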
Cleveland (1985) emphasizes exploratory data analysis (EDA) as a form of statistical thinking with graphics.
Colin Ware (2012): “Perceptual tasks are the foundation for graphical design: what people can and cannot perceive determines what graphics can and cannot do.”
Big data: Sensor logs, social media, administrative records.
Advantage = volume, real-time analysis. Limitation = bias, lack of representativeness.
Visualization challenge: scalability of graphics (e.g., from scatterplots to heatmaps, from static charts to interactive dashboards).
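One common way to move from a scatterplot to a heatmap is to bin points into a grid and display counts per cell instead of individual marks. The sketch below shows only the binning step in plain Python; the function name `bin_2d`, the grid size, and the sample points are hypothetical.

```python
def bin_2d(points, x_range, y_range, nx, ny):
    """Count points per cell of an nx-by-ny grid; returns grid[row][col]."""
    (x_min, x_max), (y_min, y_max) = x_range, y_range
    grid = [[0] * nx for _ in range(ny)]
    for x, y in points:
        if not (x_min <= x <= x_max and y_min <= y <= y_max):
            continue  # skip points outside the plotting window
        # Clamp so that values exactly on the upper edge fall in the last cell.
        col = min(int((x - x_min) / (x_max - x_min) * nx), nx - 1)
        row = min(int((y - y_min) / (y_max - y_min) * ny), ny - 1)
        grid[row][col] += 1
    return grid

# Hypothetical points, invented for illustration only.
points = [(0.1, 0.1), (0.15, 0.12), (0.9, 0.9), (1.0, 0.0)]
grid = bin_2d(points, (0.0, 1.0), (0.0, 1.0), nx=2, ny=2)
print(grid)
```

The cost of drawing the result depends on the number of cells rather than the number of points, which is why binned displays remain readable where a scatterplot of millions of points would overplot.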
Alberto Cairo (2016): “Big data is not necessarily better data. More is not always more.”
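The trade-off between volume and representativeness can be illustrated with a small simulation. This is a sketch under stated assumptions: the population is synthetic (normal, mean 50, standard deviation 10), and the "big" sample is biased by construction because only units with values above 45 are ever observed.

```python
import random
import statistics

rng = random.Random(42)  # fixed seed so the sketch is reproducible

# Hypothetical population of 100,000 units (arbitrary measurement scale).
population = [rng.gauss(50, 10) for _ in range(100_000)]
true_mean = statistics.fmean(population)

# Large but biased sample: units at or below 45 never enter the data,
# a stand-in for coverage bias in, e.g., platform-generated data.
big_biased = [v for v in population if v > 45]

# Small simple random sample: every unit has the same chance of selection.
small_random = rng.sample(population, 500)

biased_error = abs(statistics.fmean(big_biased) - true_mean)
random_error = abs(statistics.fmean(small_random) - true_mean)

print(f"biased sample: n={len(big_biased)}, error={biased_error:.2f}")
print(f"random sample: n={len(small_random)}, error={random_error:.2f}")
```

Under these assumptions the biased sample is more than a hundred times larger, yet its estimate of the mean is off by roughly 5 units, while the small random sample typically lands within about 1 unit. Collecting more biased data does not shrink that error.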
“Above all else, show the data.”
- Edward Tufte (2001)
Hans Rosling
Swedish physician and statistician
Founded the Gapminder Foundation
Visualized historical data on public health and poverty
Hal Varian
Chief Economist, Google
Professor of Economics, University of California, Berkeley
Big Data: New Tricks for Econometrics
Machine Learning and Econometrics
The first principle is "do no harm": a visualization must not obscure the findings or confuse readers.