Measurement, Translation, and the Values of Computation 

A Perspective from the Digital Humanities

Link to these slides: http://bit.ly/dh-measure

Presented by Elisa Beshero-Bondar

Assoc. Professor of English and Director, Center for the Digital Text @ Pitt-Greensburg

Twitter: @epyllia | GitHub: @ebeshero

Before we begin...

  • This talk is motivated by recent personal experiences with touring some early computing machinery, and with being asked recently about my personal definition of the “Digital Humanities.”
     
  • This talk will introduce the Digital Humanities by first visiting some powerful computing mechanisms and seeing how they work.
     
  • I hope to highlight that computation isn't just about accuracy.
     
  • It's also about human factors: how computers expand our capacities to perceive and comprehend. Computers are extensions of our thinking, and can change how we learn and know things.
     
  • The haptic, tactile, physical dimensions of early computing might surprise us with their clarity and precision in a “virtual” age of powerful micro-computing.

A visit to the USS Wisconsin

docked at the Hampton Roads Naval Museum, Norfolk, VA

P2C2E (Process Too Complicated To Explain)? (*)

(*) Credit: Salman Rushdie, Haroun and the Sea of Stories (1990)

Maybe not!

Analog computing: uses physical mechanical properties to model and solve problems

  • A video helps us visualize how to do arithmetic with an interactive round pinion and two racks.
  • Notice the input and output options: move the pinion or move a rack.
  • Teaching us requires an algorithm (a step-by-step process) and solving a visualization challenge: take the pieces apart and simplify them in two dimensions.
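
The arithmetic this mechanism performs can be sketched in a few lines. This assumes an idealized pinion meshed between two parallel racks with no slip, in which case the pinion's axle moves by the average of the two rack movements. The function names are illustrative and are not taken from the video.

```python
def pinion_center_shift(rack_a: float, rack_b: float) -> float:
    """Movement of the pinion's axle when the two racks move by rack_a and rack_b."""
    return (rack_a + rack_b) / 2


def rack_shift(pinion: float, other_rack: float) -> float:
    """Run the mechanism the other way: drive the axle and one rack, read the other rack."""
    return 2 * pinion - other_rack


# Addition: inputs on the two racks, output on the pinion (scaled by one half).
print(2 * pinion_center_shift(3.0, 5.0))  # 8.0
# Subtraction: inputs on the pinion and one rack, output on the other rack.
print(rack_shift(4.0, 3.0))  # 5.0
```

The same three parts add or subtract depending only on which one is treated as the output, which is the "input and output options" point made above.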

 

  • 20th-c. battleship = gigantic analog computer system with interfaces everywhere to operate the ship.
  • Training videos? Visualizations? They program humans to operate the ship computers!
  • Pressing necessity in wartime for survival

 

“Military industrial complex”

Rapid development of analog and digital computing during WW II

The Colossus machine = first programmable electronic digital computer: developed to break the encryption of German radio messages

 

Encryption via translation of inputs

Lorenz cipher machine, used by the Germans to encrypt messages sent over radio

Every letter of the alphabet is a combination of dots and blanks: 

 

  • Add a key letter's pattern to each letter in a series of letters to encrypt a message.
  • Dots and blanks = 1's and 0's of binary code
  • Translation from one medium to another  
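
The "adding" of patterns described above can be sketched as a bitwise exclusive-or (XOR) on five-bit values, which is how Lorenz-style teleprinter ciphers are generally described. The five-bit codes below are a made-up mapping (A=1, B=2, and so on) used only for illustration, not the real teleprinter alphabet, and the key letters are arbitrary; the real machine generated its key stream with rotating wheels.

```python
def to_bits(letter: str) -> int:
    """Hypothetical 5-bit code: A=1, B=2, ..., Z=26 (not the real teleprinter code)."""
    return ord(letter) - ord("A") + 1


def from_bits(value: int) -> str:
    """Inverse of to_bits for values 1..26."""
    return chr(value - 1 + ord("A"))


def add_patterns(message: list[int], key: list[int]) -> list[int]:
    """'Add' a key pattern to each message pattern with XOR (addition without carry)."""
    return [m ^ k for m, k in zip(message, key)]


plain = [to_bits(c) for c in "HELLO"]
key = [to_bits(c) for c in "XMCKL"]  # arbitrary illustrative key stream

cipher = add_patterns(plain, key)
print([format(c, "05b") for c in cipher])  # five dots-or-blanks per letter, as 1s and 0s

# Adding the same key a second time cancels it out and releases the message.
recovered = add_patterns(cipher, key)
print("".join(from_bits(v) for v in recovered))  # HELLO
```

Because XOR undoes itself, encryption and decryption are the same operation, so anyone who recovers the key stream can read the message.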

Cracking the code with punched tape

Papers with holes punched = input

Decryption discovers the encryption key and releases the original message

Data input format with 19th-century origins in

  • Jacquard weaving loom patterns
  • Telegraph signals
  • Computer operators might just "know" this code by sight 
  • Without display screens or disk storage, punched paper tape was data transfer!

People get really good at calculations when working with a simple computational interface

Q: How do you "do math" with an abacus?

A: Translation of symbolic objects into physical forms (analog computing here), manipulated by digits, comprehended by brains.

Ancient and ongoing computing technology 

Braille monitor: a touch interface for digital computing

Digital Humanities (“DH”) is interested in. . .

  • Learning about humans in the ways they interact with computing devices
  • Learning about how cultures intersect and collide in computing contexts 
  • Archiving old computing systems (for example, by curating paper tape)
  • Finding ways to translate from old media formats
  • Investigating how data about people and cultures is collected, categorized, applied
  • Collecting, categorizing, and applying data about humans 
  • Finding other ways of knowing besides what we take for granted to be “true” and “universal”.

Behrend's DIGIT major: Home of DH at Penn State! 

Some moving experiences of “DH”

all about translating from one way of knowing to another

 

Example one: a blind student who needs to analyze a poem for class. Can she study the poem in the same way as the other students?

What is the best option?

  • Screen-reader software (NVDA, or JAWS)
  • Human readers (Learning Ally, or National Library Service for the Blind and Print Disabled) 
  • A plain text file with line breaks clearly signaled

Example two: An issue with international standards for encoding measurements

This involved the TEI community...

What is the TEI?

Not any of these TEIs. . .

  • TEI Total Economic Impact
  • TEI Teacher Education Institution
  • TEI Tax Executives Institute, Inc.
  • TEI Terminal Equipment Identifier
  • TEI Terminal Endpoint Identifier
  • TEI Thailand Environment Institute
  • TEI Technological Educational Institute (Greece)
  • TEI Tertiary Education Institute (various locations)
  • TEI Thermal Engineering International (US and UK)
  • TEI Total Estimated Investment (various locations)
  • TEI Trans-Earth Injection
  • TEI Thermo Environmental Instruments

An international community

  • A set of shared Guidelines for encoding machine-readable texts
  • Fields: humanities and social sciences
  • Originates in 1987, formalized by 1994
  • Founding priority: Guidelines for Text Encoding and Interchange (another possible meaning for "i" in the TEI)

We use TEI

Interoperation: Can texts prepared for machine processing in one computer system be understood by others?

Interchange: Can the machine-readable parts of the texts be understood by humans, who can work with them as needed, without additional information?

  • Big community around the world; features an annual conference and an academic journal, as well as the TEI Guidelines for text encoding. 
  • Tags are written by English speakers: Is this a problem for coders who don't know English?
    • Acceptance around the world of English as the lingua franca for digital communities
    • Non-English speaking communities request and contribute explanations of the tags in their own languages...
    • ...but nearly always want to just use the same tag names as everyone else. English by default.
  • Lately: new emphasis on internationalizing the Guidelines
  • Lately: new interest in encoding strategies for vertical and right-to-left languages

On the topic of international standards...

There's an organization for making international standards, connected to machinery we rely on to be interoperable

How do we define a measurement of time? 

ISO attempts to set precise standards based on measurable physical properties of our universe. According to ISO, a "second" in time is defined thus:

The second is the duration of 9,192,631,770 periods of the radiation corresponding to the transition between the two hyperfine levels of the ground state of the caesium-133 atom.

Contemporary atomic physics prevails in this measurement. Is it applicable to explanations of time duration from past centuries?

A physics lab with an atomic clock can consistently measure the passage of one second by observing caesium atoms. This is more precise than watching a spring-and-weight driven watch or pendulum clock.

A problem with coding measurements in the TEI

  • Before 2017, the TEI Guidelines' examples for encoding units of measure pointed toward translating everything into a defined, universal ISO standard.
     
  • Naoki Kokaze, a graduate student from the University of Tokyo, addressed the TEI Conference in 2017 with a poster presentation.
  • His project required documenting now-obsolete local Japanese units of measure, from particular regions and villages, and decoding them in relation to other local systems.
  • Naoki asked the TEI for a new outlook: not to privilege current systems of measure, because the TEI should permit encoding of past knowledge systems associated with historic documents.
     
  • Could the TEI create new data structures to allow for defining nonstandard units of measure?
     
  • That would allow processing, and calculating equivalences between different nonstandard measuring systems.

 

  • The TEI Technical Council worked together with Naoki on a new data structure.
     
  • As of July 2019, the TEI has a new capacity to express nonstandard historical and local measuring systems.
     
  • No longer do we assume that units of measurement rely only on ISO definitions.
     
  • TEI encoders now have examples of how to define and work with nonstandard measurement systems, as well as standard ones.

Standard weights and measures from ancient Egypt

4 digits = a palm

Digit-al Humanities 

Visualizing New Kingdom units of measure (1500 - 1000 BCE) as they relate to one another 
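
Relating units to one another comes down to storing each unit as a multiple of a shared base unit. Here is a minimal sketch, assuming the digit as the base: the slide gives 4 digits to a palm, and the 7-palm royal cubit is added here from the commonly cited Egyptian system rather than from the slide. The dictionary and function names are illustrative and are not TEI structures.

```python
from fractions import Fraction

# Each unit expressed in digits, the smallest unit in this sketch.
DIGITS_PER_UNIT = {
    "digit": Fraction(1),
    "palm": Fraction(4),          # 4 digits = a palm (from the slide)
    "royal cubit": Fraction(28),  # 7 palms (assumption added for illustration)
}


def convert(amount, from_unit: str, to_unit: str) -> Fraction:
    """Convert between any two units by passing through the base unit."""
    in_digits = Fraction(amount) * DIGITS_PER_UNIT[from_unit]
    return in_digits / DIGITS_PER_UNIT[to_unit]


print(convert(2, "royal cubit", "palm"))  # 14
print(convert(6, "digit", "palm"))        # 3/2
```

Exact fractions are used instead of decimals so that a historical unit never has to be rounded into a modern one just to be compared with its neighbors.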

Student DH projects

Example 1
Exploring Textual Variation

 

  • Round nodes represent each version
  • The size of the node indicates how much it shares in common with other versions. The least common versions look the smallest (and these turn out not to contain all of the poems) 
  • Thickness of the connecting lines represents how much one edition shares with others
  • See http://dickinson.newtfire.org/16/networkAnalysis.html 
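
A minimal sketch of how such a network could be computed, assuming each version is reduced to the set of poems it contains. The version names and poem numbers are invented placeholders, not data from the student project, and counting shared poems is only one plausible measure of what versions have in common.

```python
from itertools import combinations

# Hypothetical data: which poems appear in which version.
versions = {
    "Version A": {1, 2, 3, 4},
    "Version B": {1, 2, 3},
    "Version C": {3, 4},
}

# Line thickness: number of poems two versions share.
edges = {
    (a, b): len(versions[a] & versions[b])
    for a, b in combinations(versions, 2)
}

# Node size: total overlap a version has with all the others.
node_size = {name: 0 for name in versions}
for (a, b), weight in edges.items():
    node_size[a] += weight
    node_size[b] += weight

print(edges)
print(node_size)
```

In this toy data the version holding the fewest poems ends up with the smallest node, which mirrors the pattern described in the bullets above.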

Student DH projects

Example 2:
Celebrating counter-culture with data

Digital Humanities “commons” values:

open source, open access, open sharing

Banksy, the stunt artist defying property owners: Here, the partial shredding of “Girl with Balloon.”

Did Banksy free this artwork from its “owners” at Sotheby's auction house? Or create something new to shock the auctioneers?

Banksy as a student DH project

applies computation to measure how widely distributed the phenomenon of Banksy artwork is in the world.

Which years produced more graffiti?

  <bibl>
      <title>Choose Your Weapon</title>
      <alternate>Haring Dog</alternate>
      <date when="2002"/>
      <medium type="spray_paint"/>
      <location lat="51.4986" long="-0.0757">London, UK</location>
      <size orientation="landscape">large</size>
      <ref target="http://www.banksy.co.uk/">Banksy's Personal Site</ref>
  </bibl>

Marked up metadata about the artwork
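
Once the metadata is marked up this way, a question like "which years produced more works?" becomes a counting exercise. Here is a minimal sketch using Python's standard library; the first entry is abridged from the slide, while the other two entries and the wrapping <listBibl> element are invented for illustration and are not the students' data or code.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# First <bibl> is abridged from the slide; the other two are invented.
SAMPLE = """
<listBibl>
  <bibl>
    <title>Choose Your Weapon</title>
    <date when="2002"/>
    <medium type="spray_paint"/>
  </bibl>
  <bibl>
    <title>Hypothetical Work One</title>
    <date when="2002"/>
    <medium type="spray_paint"/>
  </bibl>
  <bibl>
    <title>Hypothetical Work Two</title>
    <date when="2005"/>
    <medium type="spray_paint"/>
  </bibl>
</listBibl>
"""

root = ET.fromstring(SAMPLE)

# Tally the @when attribute of every <date> inside a <bibl>.
works_per_year = Counter(
    bibl.find("date").get("when") for bibl in root.iter("bibl")
)
print(works_per_year.most_common())  # [('2002', 2), ('2005', 1)]
```

The same loop could tally the medium type or the location instead of the year, since each is stored as its own element or attribute.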

Network graphic plotting which countries contain Banksy art installations, with graffiti works in red 

Shown: UK, USA, France, Iceland, Mexico, Canada, Australia, Palestine, Mali

Both student projects:

  • rely on the same system of markup (text encoding)
  • count and measure a phenomenon of interest 
  • visualize it in a way to analyze and comprehend something difficult 

 

What I’ve been learning about computation and DH:

  • How we measure depends on what cultural systems we need to work with.
  • Data curation and "visualization" are acts of translation from one format to another (whether visual or tactile or auditory...)
  • A “visualization” is something we need to communicate to people who need a way to see patterns in a data stream.
  • Thinking about how to share and visualize data is also thinking about how we read, recognize, and understand patterns.

Thank you for listening! For more on my students' projects and my courses, please visit https://newtfire.org

“Minute Exercise”: Jot down three things worth remembering from this talk.

Are there any questions?