Interactive and Persuasive: Collections 

  • How do we visualize our results?
  • What are meaningful visualizations and how can you integrate them into your narrative?
  • Do you publish digitally or print?

Today's Flow

Kirilloff & Varela

~Theories on data~

 

-Case Studies:

--Perspective

--Literary addresses

 

-Methodologies: data as context & a data-assisted approach

1.

2.

Rettberg

Failed Predictions in machine learning as method

 

-Case Study: Predicting Character Interactions Through Verbs

3.

Schmidt

An example of visualization and digital publishing

 

-How do we approach it informed by the other readings?


Kirilloff - Google's Perspective

-An API intended to flag "toxic" content.

-Toxic defined by what "will likely make others to leave the conversation"

 

-Corpus it's trained on: 160,000 Wikipedia comments, labeled by human annotators

-Machine learning learns what is perceived as toxic

-Typically flags any statements about disenfranchised groups as toxic

 

-Importance for Kirilloff: "An uncomfortable reflection of the humans that created and labeled its training data"

& "need to render [systemic bias and cultural norms] visible"
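As a rough illustration of what "an API that flags toxic content" looks like in practice, here is a minimal sketch of the request and response shapes, plus a helper for the kind of paired probing a reverse-engineering approach relies on. The endpoint and field names reflect my understanding of the public Perspective documentation and should be checked before use. The template, the terms, and the mock response are hypothetical and do not come from the reading. No network call is made.

```python
import json

# Endpoint as I understand the public Perspective documentation; verify before use.
ANALYZE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text):
    """Return the JSON body asking for a TOXICITY score for one comment."""
    return {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }


def toxicity_score(response):
    """Pull the summary TOXICITY score (0.0 to 1.0) out of a parsed response."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def paired_probes(template, terms):
    """Fill one sentence template with each term, so only the term varies."""
    return {term: template.format(term=term) for term in terms}


# Hypothetical probe: the template and terms are illustrative, not from the reading.
probes = paired_probes("I am a {term} person.", ["tall", "deaf"])
bodies = {term: json.dumps(build_request(text)) for term, text in probes.items()}

# A hand-written stand-in for a real response, used only to show the shape.
mock_response = {"attributeScores": {"TOXICITY": {"summaryScore": {"value": 0.12}}}}
print(toxicity_score(mock_response))
```

Holding the sentence fixed and changing one term at a time means any difference in score can be attributed to that term, which is what makes the model's learned biases visible.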

Kirilloff - study on literary addresses

-Corpus: Project Gutenberg, Chicago, and Chadwyck corpora (footnote 15). All anglophone texts, 1782-1923.

 

-Addresses are outside of dialogue, found with key words/phrases (12)

-Iterated on the search process until it returned only true positives

-Locate unusual address use cases: James Weldon Johnson & William Wells Brown vs. Frank J. Webb (14-15).

 

-Importance for Kirilloff: Demonstrates a method for creating information/context through computation, to sit alongside biographical and archival context
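The keyword search for addresses outside of dialogue can be sketched roughly as follows. This is a minimal illustration, not Kirilloff's code: the cue phrases are hypothetical (the notes do not list the actual key words), and treating anything inside double quotation marks as dialogue is a simplifying assumption.

```python
import re

# Hypothetical cue phrases; the reading's actual key words are not listed in these notes.
ADDRESS_CUES = ["dear reader", "gentle reader"]

# Text inside straight or curly double quotes is treated here as dialogue (an assumption).
DIALOGUE = re.compile(r'"[^"]*"|“[^”]*”')


def strip_dialogue(text):
    """Blank out quoted speech so that only narration is searched."""
    return DIALOGUE.sub(" ", text)


def find_addresses(text, cues=ADDRESS_CUES, window=40):
    """Return (cue, snippet) pairs for each cue found outside dialogue."""
    narration = strip_dialogue(text)
    hits = []
    for cue in cues:
        for match in re.finditer(re.escape(cue), narration, flags=re.IGNORECASE):
            start = max(0, match.start() - window)
            end = min(len(narration), match.end() + window)
            hits.append((cue, narration[start:end].strip()))
    return hits


sample = 'And now, dear reader, I must pause. "Tell me, gentle reader," she joked.'
hits = find_addresses(sample)
for cue, snippet in hits:
    print(cue, "->", snippet)
```

The iterative step is reading the snippets, spotting false positives, and revising the cue list or the dialogue rule before running the search again.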

1

Close Reading the Data:

-data as text (14)

-results should be regarded as a hypothesis (19)

2

Close Reading With Data:

-data as context (14)

-opens computational work up to more conventional forms of literary analysis; may be useful to a wider scholarly audience (14)

3

Iterative process of Close Reading with Data:

-Contrasted with iterative process of reading for better algorithmic success (13)

5

Separating the labor of creating data-context from that of interpretation:

-Archives as an example of this division of labor

4

"Broken" tech as refraction to further investigate culture:

-Google's Perspective

-Reverse Engineer approach

6

"Affect and Data... data can manipulate our emotions: 'graphical tools are a kind of intellectual Trojan horse, a vehicle through which assumptions about what constitutes information swarm with potent force' (Drucker)" (9)

Kirilloff Reading:

1

Data Driven Methodologies: use computers to reason under formal constraints. Knowledge is conceived as a rational project, where logic and formal representations are used in the pursuit of replicable conclusions that disprove previous ideas.

2

Data Assisted Approach: use computers to “imagine in different ways” (Harrell). Data opens speculative and subjective avenues for interpretation

3

Thick Data & Data Biographies: how and why the data was collected, and how it was processed. I use the term thick data following Tricia Wang (2016), who extends the anthropological concept of thick description to data (13)

5

Critical Realism from Philosophy of Science:

-Can be a realist about some things and a constructivist about others (10)

4

Epistemologies:

-Realist (data driven) vs. constructivist (data assisted)

6

Method vs. Methodology:

-methodologies rather than methods. A methodology is a framework for estimating the pertinence of given methods and for articulating criteria for evaluating results as useful, appropriate, or correct. A method is a protocol, a series of steps. (9)

Varela Reading:

Kirilloff x Varela:

Term Definition
Close reading data: data as the object of reading, data as text to make better sense of their results
Close reading the text with the data: data as context, highlights ways shared features do not indicate shared meaning
Data driven: brackets off assumptions to find patterns
Data assisted: uses data to problematize assumptions

Rettberg - Predicting Character Interactions Through Verbs (2)

-Dataset from Database of Machine Vision in Art, Games, and Narratives

-747 verbs describing an interaction between fictional characters

-Verbs associated with info about the traits (gender, species, race/ethnicity, age, and sexuality) of fictional characters

 

-Machine learning to predict if a verb is active or passive based on the traits of the character who performed the action

-Accuracy 56%

-False Passive example: attacking shadows as futile (3)
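To make the idea of "failed predictions as method" concrete, here is a minimal sketch. It is not Rettberg's model (which, per these notes, was built in R): it swaps in a deliberately simple majority-vote baseline in plain Python, and the rows are invented for illustration, not taken from the dataset. The point is the last step, where the mispredicted cases are pulled out as candidates for qualitative reading.

```python
from collections import Counter, defaultdict

# Hypothetical rows: (verb, trait of the acting character, label). Not from the dataset.
ROWS = [
    ("attacking", "human", "active"),
    ("chasing", "human", "active"),
    ("watched", "human", "passive"),
    ("scanning", "robot", "active"),
    ("scanned", "robot", "passive"),
    ("repaired", "robot", "passive"),
]


def fit_majority(rows):
    """Learn the most common label for each trait value."""
    counts = defaultdict(Counter)
    for _verb, trait, label in rows:
        counts[trait][label] += 1
    return {trait: c.most_common(1)[0][0] for trait, c in counts.items()}


def failed_predictions(model, rows):
    """Return (verb, trait, true label, predicted label) for each miss."""
    return [
        (verb, trait, label, model[trait])
        for verb, trait, label in rows
        if model[trait] != label
    ]


model = fit_majority(ROWS)
# Evaluated on the training rows for brevity; a real study would hold out data.
misses = failed_predictions(model, ROWS)
accuracy = 1 - len(misses) / len(ROWS)

for verb, trait, truth, guess in misses:
    print(f"{verb} ({trait}): predicted {guess}, labeled {truth}")
```

A low accuracy is not the end of the analysis here. The list of misses is the output of interest, since each one marks a case where the pattern in the data and the annotated reading disagree.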

1

Drawing from Munk et al.'s methodology of failed predictions:

-Lineage of Cultural anthropology (Clifford Geertz)

2

Building on this methodology with simpler tech:

-R

3

Mispredictions (failure) of algorithms suggest rich areas to investigate qualitatively.

5

"The failed predictions of machines let us use machine learning as a collaborator" vs. "human-in-the-loop approach" (4)

4

"Not to replace human interpretation, but to sort through vast amounts of data to find the most worthwhile cases for interpretation" (2)

6

"need to develop methodologies for using machine learning that build upon the epistemologies that are specific to the humanities and social sciences, and that support interpretation, uncertainty and detail." (5)

Rettberg Reading:

How might data as context and data-assisted methodology relate to Rettberg's "machine learning as a collaborator"?

1

Exploration of what's in HathiTrust:

-The visualization here provides a new way of exploring this vast digital library using a new method that makes a visual arrangement of books possible based on the vocabulary they use

2

Are these visualizations helpful?

3

Is a website effective for this write-up?

5

"The point is, rather, that the ways computer classifies can sometimes reflect reality more sensitively than a rigid set of rules. Computers can be more flexible than bureaucracies."

4

Return to Kirilloff's Affect and Trojan Horse Comments?

6

How does this project fit in with Kirilloff & Varela?

Schmidt Example:

  • How do we visualize our results?
  • What are meaningful visualizations and how can you integrate them into your narrative?
  • Do you publish digitally or print?

Return to Core Questions:

Code

By Dav S-L
