Biases in Data Visualizations

 

Nikhil Gopal

Definition of Bias

 

1. A systematic distortion of a statistical result due to a factor not allowed for in its derivation.

 

2. Prejudice in favor of or against one thing, person, or group compared with another, usually in a way considered to be unfair.

 

3. In some sports, such as lawn bowling, the irregular shape given to a ball.

Recognized Biases In Clinical Research

Interpretation Bias

Selection Bias

Publication Bias

Analysis Bias

Detection Bias

Exposure Bias

 

 

 

"I SPADE" (mnemonic device formed from the initials of the six biases above)

Why recognize biases?

We cannot eliminate biases completely, but we can attempt to control for them.

How would biases manifest in data visualizations?

 

 

 

"—the data, visual representation, textual annotations, and interactivity—and how visualizations denote and connote phenomena with reference to unstated viewing conventions and codes."

Visualization Rhetoric: Framing Effects in Narrative Visualization

Jessica Hullman & Nicholas Diakopoulos

IEEE Transactions on Visualization and Computer Graphics, Vol. 17, No. 12, December 2011

Data

 

Selecting which data to include and omit.

 

Examples: Excluding outliers.

Aggregating or summarizing data.
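To make the outlier example concrete, here is a minimal sketch of how dropping a single point changes a summary statistic. The measurements are hypothetical, and the 1.5 × IQR fence (Tukey's rule) is only one of several possible exclusion criteria, chosen here for illustration.

```python
from statistics import mean, quantiles

# Hypothetical measurements; the last value is an extreme outlier.
values = [10, 12, 11, 13, 12, 95]

def drop_outliers(data, k=1.5):
    """Keep only points within k * IQR of the first and third quartiles."""
    q1, _, q3 = quantiles(data, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [x for x in data if low <= x <= high]

kept = drop_outliers(values)

print("mean with all points:     ", mean(values))
print("mean after excluding 95:  ", mean(kept))
```

Either mean can be defended; the bias question is whether the viewer is told which points were left out and why.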

 

Reminiscent of Analysis Bias

http://vudlab.com/simpsons/
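The link above points to a demonstration of Simpson's paradox, where aggregating data reverses the trend seen in every subgroup. The sketch below shows the effect with illustrative counts patterned on the widely cited kidney-stone example; the treatment and subgroup names are assumptions made for this example only.

```python
# (successes, trials) per treatment and subgroup; illustrative counts.
results = {
    "A": {"small": (81, 87), "large": (192, 263)},
    "B": {"small": (234, 270), "large": (55, 80)},
}

def subgroup_rates(treatment):
    """Success rate within each subgroup for one treatment."""
    return {group: s / n for group, (s, n) in results[treatment].items()}

def overall_rate(treatment):
    """Success rate after pooling all subgroups together."""
    successes = sum(s for s, _ in results[treatment].values())
    trials = sum(n for _, n in results[treatment].values())
    return successes / trials

for name in results:
    print(name, subgroup_rates(name), "overall:", round(overall_rate(name), 3))
```

In this data, treatment A has the higher success rate in both subgroups, yet treatment B looks better once the subgroups are pooled. A chart of only the pooled rates would therefore tell the opposite story from a chart split by subgroup.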


Visual Representation

 

How data dimensions are mapped to visual attributes.

 

Example: Continuous data loses resolution when mapped to grayscale.

 

Reminiscent of Interpretation Bias (e.g. skewing retinal variables to support one perspective)

https://mycarta.files.wordpress.com/2011/11/render_compare_jet_detail.png
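One way to read the grayscale example is as a quantization effect: an 8-bit gray channel has only 256 levels, so nearby values collapse into the same shade. The sketch below assumes that reading; the function name `to_gray8` and the value range are hypothetical, and the slide may also have perceptual limits in mind, which this code does not model.

```python
def to_gray8(value, lo, hi):
    """Map a value in [lo, hi] linearly onto an 8-bit gray level (0-255)."""
    scaled = (value - lo) / (hi - lo)
    return round(scaled * 255)

# 10,000 distinct continuous values between 0 and 1 (hypothetical data).
values = [i / 9999 for i in range(10000)]
levels = {to_gray8(v, 0.0, 1.0) for v in values}

print("distinct input values:", len(set(values)))
print("distinct gray levels: ", len(levels))

# Two different inputs that end up as the same shade.
print(to_gray8(0.5001, 0.0, 1.0), to_gray8(0.5009, 0.0, 1.0))
```

Differences smaller than one gray step are invisible in the rendered image, no matter how precise the underlying data is.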

Annotation

 

Providing context via textual, graphical, or social means. Serves to focus reader attention.

 

Examples: Annotating a map with pertinent information. User comments under a blog post. Infographics!

 

 

Interactivity

 

Providing choices to a reader that constrain their exploration capabilities.

 

Example: An app whose navigation views allow drilling down into only a certain class of data.

But, wait a minute...some of these aren't exactly biases. They're design decisions!

 

But how do we differentiate?

Ask yourself:

 

Is this accurate?

Is this unfair?

Was this choice made for a good reason?

 

Also note: some choices are forced by external constraints and limitations.

The point is to provide a framework through which we can think about these choices!
