Reading and Exploring Five Versions of Frankenstein Digitally

NWPA Humanities & Social Sciences Conference 2025
28 March 2025 @ Gannon University, Erie PA

Link to these slides: https://bit.ly/fv-explore

Elisa Beshero-Bondar, Raffaele Viglianti, Yuying Jin
@ebeshero @raffazizzi @yuying-jin
  • Frankenstein Variorum (say it 3 times fast!)
  • Lets you explore 5 distinct versions of the novel Frankenstein

1. Try clicking anywhere in the heatmap to jump to a revised passage

 

2. Now check how that passage looks in each of the other versions

 

3. Runs on your local computer / device with JavaScript in your web browser

Objectives of the Frankenstein Variorum

 

  • to “upcycle” and connect previous digital editions of Frankenstein 
     
  • to share a nonlinear, divergent edition history
     
  • to encourage exploration from one edition to the others

 

 

FV as a Variorum

  • Visualizes a collation, or comparison of versions, working with digital editions that were encoded very differently
     
  • Designed as a static website for serendipitous browsing and intensive research
     
  • digital scholarly edition resource: stores comparison data and pointers to variant passages

Critical and Diplomatic Editions Leading to the Frankenstein Variorum Project

  • 1974: James Rieger, ed., first new edition of 1818 in 141 years: inline collation of "Thomas" with 1818; 1831 variants in endnotes [print edition]
  • ~mid-1990s: Stuart Curran and Jack Lynch, PA Electronic Edition (PAEE): collation of 1818 and 1831, in HTML [digital edition]
  • 1996: Nora Crook, critical edition of 1818; variants of "Thomas", 1823, and 1831 in endnotes (P&C MWS collected works) [print edition]
  • 1996: Charles Robinson, The Frankenstein Notebooks (Garland): print facsimile of 1816 ms drafts [print edition]
  • 2007: Romantic Circles TEI conversion of PAEE; separates the texts of 1818 and 1831; collation via Juxta [digital edition]
  • 2013: Shelley-Godwin Archive publishes diplomatic edition of 1816 ms drafts [digital edition]
  • 2017: Frankenstein Variorum Project begins: assembly and proof-correcting of PAEE files; OCR and proof-correcting of 1823; "bridge" TEI edition of S-GA notebook files; automated collation; incorporating "Thomas" copy text. Collation project completed in 2023; Variorum viewer officially launched in 2024.

FV includes PA Electronic Edition (mid 1990s): 1818 vs 1831

  • started from base HTML 1.0 files
  • up-converted to clean, simple XML
    • “on its way” to TEI (structural elements in text)
    • prepared for machine-assisted collation (via CollateX), including element tags
    • deep hierarchy of novel “flattened” to milestones: <div type="volume"/>, <p/>, etc.
  • corrected against photofacsimiles of 1818 and 1831 print publications

FV includes Shelley-Godwin Archive encoding

  • S-GA diplomatic edition of the 1816 Notebooks
    • encoded surface-by-surface, line-by-line
    • required resequencing to collate
       

Shelley-Godwin Archive: sample page surface:

New digital editions in the Frankenstein Variorum

FV: new edition of “Thomas copy”

 

  • Represents EBB's in-person consultation of the annotated copy held at the Morgan Library.
     
  • Consulted James Rieger's and Nora Crook's print editions interpreting Mary Shelley's marginalia.
     
  • Added insertions, deletions, and margin notes to the 1818 edition
     
  • Prepared new markup from 1818 edition, with <add>, <del>, <note> elements showing Thomas marginalia

FV: New 1823 edition

 

  • 1823: prepared by William Godwin, the first published edition bearing the name “Mary Wollstonecraft Shelley” on the title page
     
  • Carnegie Mellon University librarians prepared OCR of the 1823 edition for our project
     
  • Our encoding for 1823 matches that of the 1818 and 1831 editions (structure of letters, chapters, paragraphs, poems, annotations).

Preparing for collation

Collating when the editions are so different (1)

Align and “chunk”

  • Best not to collate entire novel files in one pass: chunking prevents severe alignment errors!
  • We prepared 33 collation units (or "chunk files") sharing common starting and ending points.  
  • Edition files of the same chunk are collated together
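
A rough sketch of the chunking idea, assuming a hypothetical file-naming scheme of the form `<witness>_<chunk>.xml` (the project's real file names may differ). It groups the edition files by collation unit so that each group can be collated on its own:

```python
import re
from collections import defaultdict

# Hypothetical naming scheme (an assumption): "<witness>_<chunk>.xml",
# e.g. "f1818_C07.xml" for the 1818 edition's file of collation unit C07.
FILENAME = re.compile(r"^(?P<witness>f\w+?)_(?P<chunk>C\d{2})\.xml$")

def group_by_chunk(filenames):
    """Map each chunk id to the witness files that belong to it."""
    groups = defaultdict(dict)
    for name in filenames:
        match = FILENAME.match(name)
        if match:  # skip anything that is not a chunk file
            groups[match["chunk"]][match["witness"]] = name
    return dict(groups)

files = ["f1818_C07.xml", "f1831_C07.xml", "fMS_C07.xml",
         "f1818_C08.xml", "notes.txt"]
chunks = group_by_chunk(files)
```

Each value in `chunks` is then one small collation job rather than a slice of a whole-novel comparison.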

Collating when the editions are so different (2)

Prescribe rules to direct the machine-assisted collation
 

  • Our Python collation script
    • works with the CollateX library, extensively customized
    • prepares CollateX to work around markup differences
      • (e.g. identifies and unites words split around line-endings in S-GA)
    • identifies what features can be ignored/skipped over for collation purposes
      • (e.g. markup of pagination, line-by-line encoding in S-GA)
    • normalizes: identifies which apparently different features are the same:
      • <milestone type='paragraph'> is the same as <p>
      • "&" is not different from "and"
    • outputs an XML data structure (TEI critical apparatus)
  • Markup of text structures compared across Variorum:
    • Volume (print editions only), letter, chapter
    • Paragraph, poetry line-groups and lines
    • Notes
  • Markup of manuscript events included in Variorum: Deletions, Insertions, Gaps
  • Normalizing algorithm:
    • Decide what marks are equivalent
    • Ignore, but preserve, other markup in the collation process; likewise abbreviations and capitalization
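
The normalization rules above can be sketched in a few lines of Python. This is a minimal illustration, not the project's script: the function names are made up here, and only the rules named on the slide are covered. It relies on the fact that CollateX's JSON input lets each token carry its original text (`t`) next to a normalized form (`n`) that is used for comparison.

```python
import re

# Matches a paragraph milestone with either quote style, self-closing or not.
PARA_MILESTONE = re.compile(r"<milestone\s+type=[\"']paragraph[\"']\s*/?>")

def normalize(token: str) -> str:
    """Return the comparison form of a token (hypothetical helper)."""
    t = token.strip()
    t = PARA_MILESTONE.sub("<p>", t)  # <milestone type='paragraph'> is the same as <p>
    if t in ("&", "&amp;"):           # "&" is not different from "and"
        t = "and"
    return t.lower()                  # capitalization is ignored for comparison

def to_json_tokens(tokens):
    """Pair each original token ("t") with its normalized form ("n")."""
    return [{"t": tok, "n": normalize(tok)} for tok in tokens]

witness_1818 = to_json_tokens(["<p>", "IT", "was", "on"])
witness_ms = to_json_tokens(["<milestone type='paragraph'/>", "It", "was", "on"])
```

The two witnesses now agree on every `n` value while their `t` values keep the original spelling and markup, which is what "ignore but preserve" amounts to.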

Normalized strings to compare

MS (from Shelley-Godwin Archive):

It was on a dreary night of November that I beheld 
&lt;del&gt;the frame on whic&lt;/del&gt; my man 
comple&lt;del&gt;at&lt;/del&gt;teed

1818 (from PA Electronic edition)

&lt;p&gt;IT was on a dreary 
night of November, that I beheld
the accomplishment of my toils.&lt;/p&gt;

Including markup in the comparison

Manuscript (from Shelley-Godwin Archive):

<lb n="c56-0045__main__2"/>It was on a dreary night of November 
<lb n="c56-0045__main__3"/>that I beheld <del rend="strikethrough" 
xml:id="c56-0045__main__d5e9572">
       <add hand="#pbs" place="superlinear" xml:id="c56-0045__main__d5e9574">the frame on
         whic</add></del> my man comple<del>at</del>
<add place="intralinear" xml:id="c56-0045__main__d5e9582">te</add>
<add xml:id="c56-0045__main__d5e9585">ed</add>

1818 (from PA Electronic edition)

<p xml:id="novel1_letter4_chapter4_div4_div4_p1">I<hi>T</hi> was on a dreary 
night of November, that I beheld the accomplishment of my toils.</p>
  • What matters for meaningful comparison?
    • Text nodes
    • <del> and <p> markup
  • What doesn't matter?
    • <lb/> elements, attribute nodes
    • <hi>? *In real life we include the <hi> elements as meaningful markup because sometimes they are meaningful for emphasis.
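
One way to picture the "what matters / what doesn't" decision is a filter that drops ignorable elements and all attributes while keeping the meaningful tags. The sketch below assumes `lb` and `pb` are the ignorable elements (the slide names line-by-line and pagination markup; the exact element list is an assumption) and uses a regular expression rather than a real XML parser, so it is only suitable for simple snippets like these:

```python
import re

# Assumption: line-break and page-break markup is what gets skipped.
IGNORED_ELEMENTS = {"lb", "pb"}

# A tag: optional "/", the element name, any attributes, optional "/" before ">".
TAG = re.compile(r"<(/?)([\w:-]+)[^<>]*?(/?)>")

def comparable(xml_text: str) -> str:
    """Drop ignored elements and all attributes; keep other tags and the text."""
    def rewrite(match):
        closing, name, self_closing = match.groups()
        if name in IGNORED_ELEMENTS:
            return ""  # removing <lb/> also rejoins words split at a line end
        return f"<{closing}{name}{self_closing}>"
    stripped = TAG.sub(rewrite, xml_text)
    return " ".join(stripped.split())  # collapse line breaks and extra spaces

ms = ('<lb n="c56-0045__main__3"/>that I beheld '
      '<del rend="strikethrough">the frame on whic</del> my man')
ed1818 = '<p xml:id="p1">I<hi>T</hi> was on a dreary\nnight of November,</p>'

ms_clean = comparable(ms)
ed1818_clean = comparable(ed1818)
```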

Tokenize them!

MS (from Shelley-Godwin Archive):

["It", "was", "on", "a", "dreary", 
"night", "of", "November", "that", 
"I", "beheld", 
"&lt;del&gt;the frame on whic&lt;/del&gt;",
"my", "man", 
"comple", "&lt;del&gt;at&lt;/del&gt;", "teed"]

1818 (from PA Electronic edition)

["&lt;p&gt;", "IT", "was", "on", "a", "dreary", 
"night", "of", "November,", "that", "I", "beheld",
"the", "accomplishment", "of", "my", "toils.", "&lt;/p&gt;"]

Project decision: Treat a deletion as a complete and indivisible event: a “long token”. This helps to align other witnesses around it.
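
A long-token tokenizer can be sketched with a single regular expression. This is an illustration under simplifying assumptions, not the project's tokenizer: it works on unescaped tags, and it does not handle a `<del>` nested inside another `<del>`.

```python
import re

# Alternatives are tried left to right at each position:
#   1. a whole <del>...</del> span, kept together as one "long token"
#   2. any other single tag
#   3. a run of characters containing no whitespace and no "<"
TOKEN = re.compile(r"<del\b[^>]*>.*?</del>|<[^>]+>|[^\s<]+", re.DOTALL)

def tokenize(text: str) -> list[str]:
    """Split text into word, tag, and long-deletion tokens."""
    return TOKEN.findall(text)

ms_tokens = tokenize(
    "that I beheld <del>the frame on whic</del> my man comple<del>at</del>teed"
)
print_tokens = tokenize("<p>IT was on a dreary night</p>")
```

Because the deletion is matched before the generic tag and word patterns, the words inside it never become separate tokens that the aligner could scatter across other witnesses.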

Nodes on the other side of collation

Real output from the project

(Embedded markup is a little more complicated than our previous example)

<app>
	<rdgGrp n="['that', 'i', 'beheld']">
		<rdg wit="f1818">that I beheld</rdg>
		<rdg wit="f1823">that I beheld</rdg>
		<rdg wit="fThomas">that I beheld</rdg>
		<rdg wit="f1831">that I beheld</rdg>
		<rdg wit="fMS">&lt;lb n="c56-0045__main__3"/&gt;that I beheld</rdg>
	</rdgGrp>
</app>
<app>
	<rdgGrp n="['&lt;del&gt; the frame on whic&lt;/del&gt;',
               'my', 'man', 'comple', 
               '', '&lt;mdel&gt;at&lt;/mdel&gt;', 'te', 'ed', 
               ',', '.', '&lt;del&gt;and&lt;/del&gt;']">
		<rdg wit="fMS">&lt;del rend="strikethrough" 
          xml:id="c56-0045__main__d5e9572"&gt;
			&lt;sga-add hand="#pbs" place="superlinear" 
          sID="c56-0045__main__d5e9574"/&gt;the
	      frame on whic &lt;sga-add eID="c56-0045__main__d5e9574"/&gt; &lt;/del&gt; my man
		  comple &lt;mod sID="c56-0045__main__d5e9578"/&gt; 
          &lt;mdel&gt;at&lt;/mdel&gt;
		  &lt;sga-add place="intralinear" sID="c56-0045__main__d5e9582"/&gt;te
          &lt;sga-add eID="c56-0045__main__d5e9582"/&gt;
          &lt;sga-add sID="c56-0045__main__d5e9585"/&gt;ed
		  &lt;sga-add eID="c56-0045__main__d5e9585"/&gt;
          &lt;mod eID="c56-0045__main__d5e9578"/&gt;
          &lt;sga-add hand="#pbs" place="intralinear" sID="c56-0045__main__d5e9588"/&gt;, 
          &lt;sga-add eID="c56-0045__main__d5e9588"/&gt;.
		  &lt;del rend="strikethrough"
		  xml:id="c56-0045__main__d5e9591"&gt;And&lt;/del&gt;</rdg>
	</rdgGrp>
	<rdgGrp n="['the', 'accomplishment', 'of', 'my', 'toils.']">
		<rdg wit="f1818">the accomplishment of my toils.</rdg>
		<rdg wit="f1823">the accomplishment of my toils.</rdg>
		<rdg wit="fThomas">the accomplishment of my toils.</rdg>
		<rdg wit="f1831">the accomplishment of my toils.</rdg>
	</rdgGrp>
</app>
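
To show how such apparatus entries can be read back by a program, here is a small sketch using Python's standard `xml.etree.ElementTree`. The sample is abbreviated from the output above, and it assumes no XML namespace, as on the slide; project files that declare the TEI namespace would need namespace-aware lookups.

```python
import xml.etree.ElementTree as ET

# Abbreviated sample modelled on the output above (not a full project file).
SAMPLE = """<app>
  <rdgGrp n="['my', 'man']">
    <rdg wit="fMS">my man</rdg>
  </rdgGrp>
  <rdgGrp n="['the', 'accomplishment', 'of', 'my', 'toils.']">
    <rdg wit="f1818">the accomplishment of my toils.</rdg>
    <rdg wit="f1823">the accomplishment of my toils.</rdg>
    <rdg wit="fThomas">the accomplishment of my toils.</rdg>
    <rdg wit="f1831">the accomplishment of my toils.</rdg>
  </rdgGrp>
</app>"""

def witness_groups(app_xml: str) -> list[list[str]]:
    """List the witnesses in each reading group of one <app>."""
    app = ET.fromstring(app_xml)
    return [[rdg.get("wit") for rdg in grp.findall("rdg")]
            for grp in app.findall("rdgGrp")]

def is_variant(app_xml: str) -> bool:
    """An <app> with more than one reading group records a variation."""
    return len(witness_groups(app_xml)) > 1

groups = witness_groups(SAMPLE)
```

An `<app>` with a single `<rdgGrp>`, like the first one in the output above, is a point of agreement; more than one group marks a variant passage.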

Background image created by the author from a loom on Reddit and the frontispiece illustration of Frankenstein (1831)

CAUTION: Collation of heavily altered documents leads to many tangles and snags.

Completing this project would not have been possible without students!

Students led the way!

  • Exploring the development of contextual annotations (Stephen, Jack and Avery at CMU)
     
  • Helping track the kinds of errors we would find in the collation in our collationWorkspace (Nate and Rachel)
     
  • Finding algorithmic ways to debug collation tangles (Mia, Jackie, Yuying, Nate, and Rachel)
     
  • Creating “long tokens” to pull heavily revised passages and long deletions away from the collation machinery! (Ask us about this.) (Yuying for the win!)
     
  • Developing and testing our shell script to run our postCollation pipeline (Yuying!)
     
  • Finalizing the interface in React + Astro (Yuying's senior design project)
     
  • Roll credits: People page on the Variorum website

 

Spine and data coordination

From collation data to spine

 

  • “Spine” = data model (dynamic nerve plexus?) holding the variorum together
    • standoff use of TEI critical apparatus
      • coordinates data on variance, including normalized tokens and maximum edit-distance values 
    • points to specific locations in the variorum edition files
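
As an illustration of what a "maximum edit-distance value" could look like for one variant location, the sketch below computes Levenshtein distance between every pair of normalized readings and keeps the largest. The slides do not say how the project computes its values, so the pairing strategy and function names here are assumptions.

```python
from itertools import combinations

def levenshtein(a: str, b: str) -> int:
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            curr.append(min(prev[j] + 1,                # delete ca
                            curr[j - 1] + 1,            # insert cb
                            prev[j - 1] + (ca != cb)))  # substitute or match
        prev = curr
    return prev[-1]

def max_edit_distance(readings) -> int:
    """Largest pairwise distance among the readings (0 if fewer than two)."""
    return max((levenshtein(a, b) for a, b in combinations(readings, 2)),
               default=0)

hotspot = max_edit_distance(["my man completed", "the accomplishment of my toils."])
```

A larger value signals a more heavily revised location, which is the kind of number a heatmap can turn into color intensity.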

The FV Spine: visualized in the interactive heatmap

Summary of the version history

  • 1816 notebooks to 1818: uneven (gaps in notebooks)
     
  • 1823 edition: Besides adding his daughter's name to the title page, William Godwin makes small edits, usually not substantive.
     
  • Thomas "fork" divergence:
    • copy with margin notes was left in Italy in 1823 when MWS moved back to England
    • edits mark desired passages to delete and significant additions MWS imagined making "if there were to be a new edition"
    • several interesting forks . . .
       
  • 1831 revisions:
    • nothing directly retained from Thomas marginalia
    • alters character relationships in the Frankenstein family; adds a chapter and lengthens several passages

Variant Passages of Interest

A Thomas copy edit of Letter IV at an early moment of intense revision

Variant Passages of Interest

Where the Creature comes to life in MS and Thomas

Variant Passages of Interest

A Thomas copy edit not taken up later

Variant Passages of Interest

Where the MS notebooks begin, just after "Everyone adored Elizabeth. . ."

Variant Passages of Interest

An (in)famous enormous overhaul for 1831

Variant Passages of Interest

"I feared the vengeance of the disappointed fiend..." https://frankensteinvariorum.org/viewer/1831/chapter_xviii#C24_app15

A passage marking a journey from the MS to 1831

Publishing a typical TEI digital scholarly edition, today

Browser, UI & UX

Webapp logic

Database

Server

“not all projects should be maintained in perpetuity. Some are ... not worth the intellectual, technical, and financial overhead of ongoing maintenance.”
(Smithies et al. 2019)

 

Smithies, James, Carina Westling, Anna-Maria Sichani, Pam Mellen, and Arianna Ciula. 2019. “Managing 100 Digital Humanities Projects: Digital Scholarship & Archiving in King’s Digital Lab.” Digital Humanities Quarterly 13 (1).

Those in charge of infrastructure are also determining, particularly in the long term, the scholarly worth of a project, whether it should remain online, and in what form.

Less infrastructure = less expense to maintain

Browser, UI & UX: React

Static site generator: Node and Astro (JavaScript)

Server: GitHub

With free, open-source resources available, we scholars and students can help old projects find new life.

The Variorum in your local web browser

  • Built from free, open-source code
  • Integrates XML and JavaScript in a static, minimal website. (No database!)
     
  • Page routing across 146 chapters among five different editions
     
    • Interactive user interface for
      • selectors
      • hotspots
      • variations

See details on our About and Methods pages



Please dive in and explore (mobile friendly)!
https://frankensteinvariorum.org/

Thanks for listening! :-)


Presentation for the DH2024 conference on the now complete Frankenstein Variorum project, with emphasis on theory of edition as expressed in the structure and interface.
