A Complete Frankenstein Variorum:

Bridging digital resources and sharing the theory of edition

DH 2024 Reinvention & Responsibility, Washington, DC
Panel: Unpacking the Past, Building the Future: Navigating the Complexities of Textual Analysis and Editions
8 August 2024, 8:30 - 10am, George Mason U.: Van Metre Hall 121

Link to these slides: https://bit.ly/fv-dh24

https://frankensteinvariorum.org

Elisa Beshero-Bondar	Raffaele Viglianti	Yuying Jin
@ebeshero	@raffazizzi	@yuying-jin

Objectives of the Frankenstein Variorum

to “upcycle” and connect previous digital editions of Frankenstein
to share a nonlinear, divergent edition history
to encourage exploration from one edition to the others

Variorum - for tracking and comparing versions

Most immediate context: Darwin Online (ed. Barbara Bordalejo), except...

Frankenstein Variorum compares five versions (to Darwin Online's six)
Frankenstein Variorum incorporates MS witnesses
Frankenstein Variorum integrates earlier digital editions made by others

FV as a Variorum

Visualizes a collation, or comparison of versions, working with digital editions that were encoded very differently
Designed as a static website for serendipitous browsing and intensive research
Applies the TEI in a JavaScript context to store comparison data and pointers to variant passages

Critical and Diplomatic Editions Leading to the Frankenstein Variorum Project

Legend: print edition / digital edition

1974 (print): James Rieger, ed., first new edition of 1818 in 141 years: inline collation of "Thomas" with 1818, 1831 variants in endnotes
~mid-1990s (digital): Stuart Curran and Jack Lynch: PA Electronic Edition (PAEE), collation of 1818 and 1831: HTML
1996 (print): Nora Crook crit. ed. of 1818, variants of "Thomas", 1823, and 1831 in endnotes (P&C MWS collected works)
1996 (print): Charles Robinson, The Frankenstein Notebooks (Garland): print facsimile of 1816 ms drafts
2007 (digital): Romantic Circles TEI conversion of PAEE; separates the texts of 1818 and 1831; collation via Juxta
2013 (digital): Shelley-Godwin Archive publishes diplomatic edition of 1816 ms drafts
2017: Frankenstein Variorum Project begins

assembly/proof-correcting of PAEE files; OCR/proof-correcting 1823; "bridge" TEI edition of S-GA notebook files; automated collation; incorporating "Thomas" copy text. Collation project completed in 2023, Variorum viewer officially launches in 2024.

New digital editions in the FV

New encoding of the 1823 edition, based on OCR from Google Books
New preparation of the "Thomas copy"

New digital edition of “1823”

1823: prepared by William Godwin, the first published edition bearing the name “Mary Wollstonecraft Shelley” on the title page
Carnegie Mellon University librarians prepared OCR of the 1823 edition for our project
Our XML encoding for 1823 matches that of the 1818 and 1831 editions (structure of letters, chapters, paragraphs, poems, annotations).

New digital edition of “Thomas copy”

Our edition responds to James Rieger's and Nora Crook's print editions interpreting Mary Shelley's marginalia.
- prepared after EBB's personal consultation of the Morgan Library MS.
Added insertions, deletions, + margin-notes to the 1818 edition
Prepared new XML from 1818 edition, with <add>, <del>, <note> elements showing Thomas marginalia

Preparing for collation

Collating when the editions are so different (1)

Align and “chunk”

Best not to collate entire novel files at once: working in smaller units prevents severe alignment errors!
We prepared 33 collation units (or "chunk files") sharing common starting and ending points.
Edition files of the same chunk are collated together
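
A minimal sketch of how chunk files might be grouped into collation units before collation. The file-name pattern (`<edition>_C<number>.xml`) and the function name `group_by_chunk` are assumptions made for illustration, not the project's actual naming scheme.

```python
import re
from collections import defaultdict

# Hypothetical naming convention: <edition>_C<chunk number>.xml
CHUNK_NAME = re.compile(r"^(?P<edition>[A-Za-z0-9]+)_C(?P<chunk>\d+)\.xml$")

def group_by_chunk(filenames):
    """Map each chunk number to the edition files that cover that chunk."""
    units = defaultdict(dict)
    for name in filenames:
        match = CHUNK_NAME.match(name)
        if match is None:
            continue  # not a chunk file, so leave it out of every unit
        units[int(match["chunk"])][match["edition"]] = name
    return dict(units)

files = ["1818_C01.xml", "1823_C01.xml", "1831_C01.xml", "1818_C02.xml", "README.md"]
units = group_by_chunk(files)
# Each value in `units` is one collation unit: the files to collate together.
```

Each unit can then be handed to the collation step on its own, so an alignment problem in one chunk cannot spread to the rest of the novel.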

Collating when the editions are so different (2)

Prescribe rules to direct the machine-assisted collation

Extensive Python collation script
- to work around differences
  - (identify and unite words split around line-endings in S-GA)
- to identify what features can be ignored/skipped over for collation purposes
  - (e.g. markup of pagination, line-by-line encoding in S-GA)
- to normalize: identify what apparently different features are the same:
  - <milestone type='paragraph'> is same as <p>
  - "&" is not different from "and"
- Prescribe output in the form of a TEI critical apparatus:
  - coordinate information on which editions align and what normalized tokens/strings they share at this point.
  - (See Parallel Segmentation encoding in TEI Guidelines)
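
As an illustration of the output step, here is a small sketch that groups witnesses by their normalized reading and writes them as a TEI-style `<app>` element. The witness ids (`f1818`, etc.), the use of `n` on `<rdgGrp>` to hold the normalized string, and the toy `simple_normalize` function are assumptions for this example, not the project's actual script.

```python
import xml.etree.ElementTree as ET

def simple_normalize(text):
    """Toy normalizer (assumption): ignore case, treat '&' as 'and'."""
    return text.lower().replace("&", "and")

def build_app(readings, normalize):
    """Build an <app> element from {witness id: original reading}.

    Witnesses whose readings normalize to the same string share one <rdgGrp>.
    """
    groups = {}
    for wit, text in readings.items():
        groups.setdefault(normalize(text), []).append((wit, text))
    app = ET.Element("app")
    for norm, members in groups.items():
        grp = ET.SubElement(app, "rdgGrp", {"n": norm})
        for wit, text in members:
            rdg = ET.SubElement(grp, "rdg", {"wit": f"#{wit}"})
            rdg.text = text  # the original reading is kept, not the normalized one
    return app

app = build_app(
    {"f1818": "Clerval & I", "f1823": "Clerval and I", "f1831": "Henry and I"},
    simple_normalize,
)
xml_string = ET.tostring(app, encoding="unicode")
```

Here the 1818 and 1823 readings land in the same group because they differ only in "&" versus "and", while 1831 forms a group of its own.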

Markup of text structure compared across Variorum:
- Volume (print editions only), letter, chapter
- Paragraph, poetry line-groups and lines
- Notes

Markup of manuscript events included in Variorum comparison: deletion, insertion, gap

Normalizing algorithm:
- Decide what marks are equivalent
- Ignore but preserve other markup in collation process, also abbreviations, capitalization.
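
The normalizing rules above can be sketched roughly as follows. This is a simplified illustration, not the project's collation script: the element names treated as ignorable (`pb`, `lb`) and the function names are assumptions.

```python
import re

# <milestone type='paragraph'> counts as the same thing as <p>.
PARAGRAPH_MILESTONE = re.compile(r"<milestone\s+type=['\"]paragraph['\"]\s*/?>")
# Markup skipped during comparison (assumed names for page and line breaks).
IGNORED_MARKUP = re.compile(r"<(?:pb|lb)\b[^>]*/?>")
# A bare '&' or '&amp;', but not other entities such as '&lt;'.
AMPERSAND = re.compile(r"&amp;|&(?![A-Za-z0-9#]+;)")

def normalize(token):
    """Return the form of a token that is used for comparison only."""
    norm = PARAGRAPH_MILESTONE.sub("<p>", token)
    norm = IGNORED_MARKUP.sub("", norm)
    norm = AMPERSAND.sub("and", norm)
    return norm.lower().strip()  # capitalization is ignored

def with_normalized(tokens):
    """Pair every original token with its normalized form, preserving the original."""
    return [{"original": tok, "normalized": normalize(tok)} for tok in tokens]

pairs = with_normalized(["<milestone type='paragraph'/>", "Clerval", "&", "I<pb n='3'/>"])
```

Keeping the original token alongside the normalized one is what lets the process "ignore but preserve": the comparison sees only the normalized form, while the output can still carry the original markup.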

Background image created by the author from a loom on Reddit and the frontispiece illustration of Frankenstein (1831)

CAUTION: Collation of heavily altered documents leads to many tangles and snags.

Completing this project was not possible without students!

Students helped with...

Exploring the development of contextual annotations (Stephen, Jack and Avery at CMU)
Tracking the kinds of errors we would find in the collation in our collationWorkspace (Nate and Rachel)
Finding algorithmic ways to debug collation tangles (Mia, Jackie, Nate, and Yuying)
Defining “long-tokens” to pull heavily revised passages and long deletions away from the collation machinery! (ask us about this) (Yuying for the win!, with Nate and Rachel)
Developing and testing our shell-script to run our postCollation pipeline (Yuying)
Finalizing the Interface in React + Astro (Yuying's senior design project)
Roll credits: People page on the Variorum website
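
One way to picture the “long-token” idea is a tokenizer that keeps a long deletion together as a single token, so the aligner never tries to match its words one by one. The sketch below is only a guess at the general approach: the `<del>` handling, the word-count threshold, and the function name are all assumptions, and the project's own definition may differ.

```python
import re

DEL_SPAN = re.compile(r"<del>.*?</del>", re.DOTALL)

def tokenize(text, long_threshold=5):
    """Split on whitespace, but keep long <del> spans as one 'long token'."""
    tokens = []
    pos = 0
    for match in DEL_SPAN.finditer(text):
        tokens.extend(text[pos:match.start()].split())
        span = match.group(0)
        inner_words = span[len("<del>"):-len("</del>")].split()
        if len(inner_words) >= long_threshold:
            tokens.append(span)          # one token for the whole deletion
        else:
            tokens.extend(span.split())  # short deletions are tokenized normally
        pos = match.end()
    tokens.extend(text[pos:].split())
    return tokens

long_case = tokenize("It was <del>on a dreary night of November that I</del> a bright morning")
short_case = tokenize("the <del>old</del> man")
```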

Collation projects take much longer to debug than you ever expected
Correct the input machinery, not the output.
- Minimize brittle hand-correction!
- Work on the pre-processing.
- Refine post-processing to correct output errors!
Machine-assisted processes need a lot of documentation
- for project sustainability
- for reproducibility of data

Look for ways to involve students!
- especially undergrads unfamiliar with the tech
- forces clear communication from everyone!
- best way to simplify overly complicated processes
- major skill building for all!

Spine and data coordination

From collation data to spine

“Spine” = data model (dynamic nerve plexus?) holding the variorum together
- standoff use of TEI critical apparatus
  - coordinates data on variance, including normalized tokens and maximum edit-distance values
- points to specific locations in the variorum edition files

How do the five editions “stack up” by collation chunk?

Legend: 1818, Thm, 1823, 1831

gaps, alignments, relative string-length for each “chunk”

Heatmap navigator for the Frankenstein Variorum

How did we make this?

Out of the "Spine" data!
XSLT => SVG
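
The project generates its heatmap with XSLT; the sketch below uses Python instead, only to show the general idea of turning per-passage variance values into colored SVG rectangles. The color scale, cell sizes, and function names are assumptions.

```python
def heat_color(value, max_value):
    """Map a value to a white-to-red fill: white = no variance, red = most."""
    ratio = 0 if max_value == 0 else min(value / max_value, 1.0)
    fade = round(255 * (1 - ratio))
    return f"rgb(255,{fade},{fade})"

def heatmap_svg(distances, cell=10):
    """Draw one horizontal bar per passage, stacked top to bottom."""
    top = max(distances, default=0)
    rects = [
        f'<rect x="0" y="{i * cell}" width="{cell * 4}" height="{cell}" '
        f'fill="{heat_color(d, top)}"/>'
        for i, d in enumerate(distances)
    ]
    height = cell * len(distances)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" width="{cell * 4}" height="{height}">'
        + "".join(rects)
        + "</svg>"
    )

svg = heatmap_svg([0, 12, 48])
```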

See our Method page for details.

And if you want to learn more about collation and text processing, check out this nifty "Flattening and Raising" slideshow.

Interface and theory of edition

TEI and JS in the static site design

Interface features shown: Selectors, Hot Spots, Variations (edition labels in the screenshot: 1823, Thomas)

Publishing a typical TEI digital scholarly edition, today

Browser, UI & UX

Webapp logic

Database

Server

“not all projects should be maintained in perpetuity. Some are ... not worth the intellectual, technical, and financial overhead of ongoing maintenance.”
(Smithies et al. 2019)

Smithies, James, Carina Westling, Anna-Maria Sichani, Pam Mellen, and Arianna Ciula. 2019. “Managing 100 Digital Humanities Projects: Digital Scholarship & Archiving in King’s Digital Lab.” Digital Humanities Quarterly 13 (1).

Those in charge of infrastructure are also determining, particularly in the long term, the scholarly worth of a project, whether it should remain online, and in what form.

Less infrastructure: a static site approach from the start

Browser, UI & UX

Static site generator

Server

Low infrastructure approach: inspiration

Endings Project

Minimal Computing

The Variorum's Front End Stack

A static website generator
Page routing across 146 chapters among five different editions
Astro-tei package to render TEI as HTML Custom Elements (CETEIcean)

Building interactive user interfaces
- Selectors
- Hotspots
- Variations
Astro-tei to directly connect to TEI elements

Summary of technologies at each stage of the FV project

Details on our Method page: https://frankensteinvariorum.org/method/

Link to these slides: https://bit.ly/fv-dh24

Dive in and explore (mobile friendly): https://frankensteinvariorum.org/

Completing the Frankenstein Variorum

By Elisa Beshero-Bondar

Presentation for the DH2024 conference on the now complete Frankenstein Variorum project, with emphasis on theory of edition as expressed in the structure and interface.

Elisa Beshero-Bondar

Professor of Digital Humanities and Chair of the Digital Media, Arts, and Technology Program at Penn State Erie, The Behrend College.