Visualizing the Frankenstein Variorum

Workshop GREN-CRIHN « New Perspectives on Critical Editions »
3 April 2025 @ Université de Montréal

Link to these slides: https://bit.ly/fv-vis

Elisa Beshero-Bondar
         @ebeshero |  Mastodon: epyllia@indieweb.social | Bluesky: @epyllia 
  • Frankenstein Variorum (say it 3 times fast!)
  • Lets you explore 5 distinct versions of the novel Frankenstein
  • Runs on your local computer / device with JavaScript in your web browser.

1. Try clicking anywhere in the heatmap to take you to a passage of significant revision.

 

2. Now, click on the editions in the critical apparatus side panel to see how that passage looks in each of the other versions.

 

Objectives of the Frankenstein Variorum

 

  • to “upcycle” and connect previous digital editions of Frankenstein 
     
  • to share a nonlinear, divergent edition history
     
  • to encourage exploration from one edition to the others
     
  • to harmonize differently encoded source editions through the TEI

 

 

FV as a Variorum

  • Visualizes a collation of versions, working with digital editions that were encoded very differently
     
  • Our document data model, a “cosmopolitan encoding” of the TEI stand-off critical apparatus, structures the backend and the frontend.
     
  • Digital scholarly edition resource
    • designed as a static website for serendipitous browsing and intensive research
    • stores comparison data and pointers to variant passages.

FV has something to prove about TEI

  • That we can collate digital editions that were encoded very differently;
  • That the TEI standoff critical apparatus can express this comparison;
  • That multiple different approaches to encoding an edition can be processed computationally, through the TEI;
  • That any digital edition of Frankenstein can be compared to track significant changes over time:
    • TEI XML editions marked with <surface>, <zone>, and <milestone/> in a separate page-by-page encoding can be compared with editions encoded structurally with <div> and <p>...
    • ...as long as we can identify where they mark comparable structural shifts.

FV's five editions

  • 1816 notebooks to 1818 first edition: uneven (gaps in notebooks)
     
  • 1823 edition: Mary Shelley's father makes small edits, usually not substantive
     
  • Thomas "fork" divergence:
    • copy of the 1818 edition with MWS's margin notes
    • edits mark desired passages to delete and significant additions MWS imagined making "if there were to be a new edition"
    • left behind in Italy in 1823 when MWS moved back to England.
       
  • 1831 revisions:
    • nothing directly retained from Thomas marginalia
    • alters character relationships in the Frankenstein family, added chapter and several lengthened passages

Critical and Diplomatic Editions Leading to the Frankenstein Variorum Project

[Timeline figure. Legend: print edition / digital edition. Dates marked on the timeline: 1974, ~mid-1990s, 1996, 2007, 2013, 2017.]

  • James Rieger, ed.: first new edition of 1818 in 141 years; inline collation of "Thomas" with 1818, 1831 variants in endnotes
  • Stuart Curran and Jack Lynch: PA Electronic Edition (PAEE), collation of 1818 and 1831 in HTML
  • Nora Crook: critical edition of 1818, with variants of "Thomas", 1823, and 1831 in endnotes (P&C MWS collected works)
  • Charles Robinson, The Frankenstein Notebooks (Garland): print facsimile of 1816 ms drafts
  • Romantic Circles: TEI conversion of PAEE; separates the texts of 1818 and 1831; collation via Juxta
  • Shelley-Godwin Archive: publishes diplomatic edition of 1816 ms drafts
  • Frankenstein Variorum Project begins: assembly/proof-correcting of PAEE files; OCR/proof-correcting 1823; "bridge" TEI edition of S-GA notebook files; automated collation; incorporating "Thomas" copy text. Collation project completed in 2023; Variorum viewer officially launches in 2024.

FV includes PA Electronic Edition (mid 1990s): 1818 vs 1831

  • started from base HTML 1.0 files
  • up-converted to clean, simple XML
    • “on its way” to TEI (structural elements in text)
    • prepared for machine-assisted collation (via CollateX): including element tags
    • deep hierarchy of novel “flattened” to milestones: <div type="volume"/>, <p/>, etc.
  • corrected against photofacsimiles of 1818 and 1831 print publications

FV includes Shelley-Godwin Archive encoding

  • S-GA diplomatic edition of the 1816 Notebooks
    • encoded surface-by-surface, line-by-line
    • required resequencing to collate
       

Shelley-Godwin Archive: sample page surface (image)

Sample surface encoding from S-GA:

<surface xmlns:mith="http://mith.umd.edu/sc/ns1#" lrx="3847" lry="5342" 
partOf="#ox-frankenstein_volume_i" ulx="0" uly="0" 
mith:folio="21r" mith:shelfmark="MS. Abinger c. 56" 
xml:base="https://raw.githubusercontent.com/umd-mith/sga/master/data/tei/ox/ox-ms_abinger_c56/ox-ms_abinger_c56-0045.xml" 
xml:id="ox-ms_abinger_c56-0045">
  <graphic url="http://shelleygodwinarchive.org/images/ox/ms_abinger_c56/ms_abinger_c56-0045.jp2"/>
  <zone rend="bordered" type="pagination"><line>75</line></zone>
  <zone type="library"><line>21</line></zone>
<!-- lines of text elided here -->
<line>to form. His limbs were in proportion</line>
<line>and I had selected his features <del rend="strikethrough">h</del> as</line>
<line><mod>
        <del rend="strikethrough">handsome</del>
        <del rend="unmarked">.</del>
        <anchor xml:id="c56-0045.01"/>
      </mod>
      <mod>
        <del rend="strikethrough">Handsome</del>
        <add hand="#pbs" place="superlinear">Beautiful</add>
      </mod>; Great God! His</line>

<!-- at the end of the surface encoding, encoding material in a left-margin zone: -->

<zone corresp="#c56-0045.01" type="left_margin">
    <line><add><mod>
          <del rend="strikethrough">handsome</del>
          <add hand="#pbs" place="superlinear">beautiful.</add>
        </mod></add></line>
  </zone>
<!-- other marginal insertions encoded -->
</surface>

New digital editions

in the Frankenstein Variorum

FV: new edition of “Thomas copy”

 

  • Represents EBB's personal consultation with the Huntington Library MS.
     
  • Consulted James Rieger's and Nora Crook's print editions interpreting Mary Shelley's marginalia.
     
  • Added insertions, deletions, + margin-notes to the 1818 edition
     
  • Prepared new markup from 1818 edition, with <add>, <del>, <note> elements showing Thomas marginalia

FV: New 1823 edition

 

  • 1823: prepared by William Godwin, the first published edition bearing the name “Mary Wollstonecraft Shelley” on the title page
     
  • Carnegie Mellon University librarians prepared OCR of the 1823 edition for our project
     
  • Our encoding for 1823 matches that of the 1818 and 1831 editions (structure of letters, chapters, paragraphs, poems, annotations).

Preparing for collation

Collating when the editions are so different (1)

Align and “chunk”

  • Best not to collate entire novel files at once: this avoids severe alignment errors!
  • We prepared 33 collation units (or "chunk files") sharing common starting and ending points.
  • Edition files of the same chunk are collated together.
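
As a minimal sketch of the grouping step, assuming hypothetical file names of the form `<edition>_<unit>.xml` (the project's real naming may differ), chunk files can be gathered by collation unit so that only matching chunks are passed to the collator together:

```python
from collections import defaultdict

def group_chunks(filenames):
    """Group chunk files by collation unit so matching chunks are collated together."""
    units = defaultdict(dict)
    for name in filenames:
        stem = name.rsplit(".", 1)[0]        # drop the ".xml" extension
        edition, unit = stem.split("_", 1)   # hypothetical "<edition>_<unit>" pattern
        units[unit][edition] = name
    return dict(units)

# Illustrative file names only
files = ["f1818_C07.xml", "f1831_C07.xml", "fMS_C07.xml", "f1818_C08.xml"]
groups = group_chunks(files)
```

Each value in `groups` maps witness names to the files for one unit, which is the set a single collation run would receive.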

S-GA: resequenced / compressed for collation

<surface lrx="3847" lry="5342" 
partOf="#ox-frankenstein_volume_i" 
ulx="0" uly="0" folio="21r" shelfmark="MS. Abinger c. 56" base="ox-ms_abinger_c56/ox-ms_abinger_c56-0045.xml" 
id="ox-ms_abinger_c56-0045" sID="ox-ms_abinger_c56-0045"/>
      <graphic url="http://shelleygodwinarchive.org/images/ox/ms_abinger_c56/ms_abinger_c56-0045.jp2"/>
      <zone type="main" sID="c56-0045__main"/> 

<lb n="c56-0045__main__17"/> 
         <del rend="strikethrough" sID="c56-0045__main__d2e9811"/>But how<del eID="c56-0045__main__d2e9811"/> How can I describe
      my <lb n="c56-0045__main__18"/> emotion at this catastrophe; or how 

<w ana="start"/>deli<lb n="c56-0045__main__19"/>neate<w ana="end"/> 

the wretch whom with such <lb n="c56-0045__main__20"/> infinite pains and care I had endeavoured <lb n="c56-0045__main__21"/> to form. His limbs were in proportion <lb n="c56-0045__main__22"/> and I had selected his features <del rend="strikethrough" sID="c56-0045__main__d2e9830"/>h<del eID="c56-0045__main__d2e9830"/> as <lb n="c56-0045__main__23"/> 
         <mod sID="c56-0045__main__d2e9835"/>
            <del rend="strikethrough" sID="c56-0045__main__d2e9837"/>handsome<del eID="c56-0045__main__d2e9837"/>
            <mdel>.</mdel>
            <anchor xml:id="c56-0045.01"/>
            <zone corresp="#c56-0045.01" type="left_margin" sID="c56-0045__left_margin"/> 
               <lb n="c56-0045__left_margin__1"/> 
               <add sID="c56-0045__left_margin__d2e9849"/>
                  <mod sID="c56-0045__left_margin__d2e9851"/>
                     <del rend="strikethrough" sID="c56-0045__left_margin__d2e9853"/>handsome<del eID="c56-0045__left_margin__d2e9853"/>
                     <add hand="#pbs" place="superlinear" sID="c56-0045__left_margin__d2e9856"/>beautiful.<add eID="c56-0045__left_margin__d2e9856"/>
                  <mod eID="c56-0045__left_margin__d2e9851"/>
               <add eID="c56-0045__left_margin__d2e9849"/>
            <zone eID="c56-0045__left_margin"/>
         <mod eID="c56-0045__main__d2e9835"/>
         <mod sID="c56-0045__main__d2e9863"/>
            <del rend="strikethrough" sID="c56-0045__main__d2e9865"/>Handsome<del eID="c56-0045__main__d2e9865"/>
            <add hand="#pbs" place="superlinear" sID="c56-0045__main__d2e9868"/>Beautiful<add eID="c56-0045__main__d2e9868"/>
         <mod eID="c56-0045__main__d2e9863"/>; Great God! His <lb n="c56-0045__main__24"/> 
  • added word boundary markup to indicate whole words spanning lines
  • resequenced margin zone content: (followed S-GA's pointers to represent semantic reading order for collation)
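
The compressed S-GA excerpt replaces container elements with paired empty markers carrying `sID` and `eID` attributes. A minimal Python sketch of that flattening idea follows; the `flatten` function and its ID scheme are assumptions for illustration, not the project's actual pipeline, and this version also turns already-empty elements into marker pairs:

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr

def flatten(elem, prefix, counter=None):
    """Serialize elem with each element rewritten as an sID/eID marker pair."""
    if counter is None:
        counter = [0]
    counter[0] += 1
    marker_id = f"{prefix}__{counter[0]}"    # hypothetical ID scheme
    attrs = "".join(f" {k}={quoteattr(v)}" for k, v in elem.attrib.items())
    out = [f'<{elem.tag}{attrs} sID="{marker_id}"/>', escape(elem.text or "")]
    for child in elem:
        out.append(flatten(child, prefix, counter))
        out.append(escape(child.tail or ""))
    out.append(f'<{elem.tag} eID="{marker_id}"/>')
    return "".join(out)

sample = ET.fromstring(
    '<mod><del rend="strikethrough">Handsome</del><add>Beautiful</add></mod>'
)
flat = flatten(sample, "demo")
```

Because every marker is an empty element, the text nodes all sit at the same level, so a collator can read them in sequence without descending into a hierarchy.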

Collating when the editions are so different (2)

Prescribe rules to direct the machine-assisted collation
 

  • Our Python collation script
    • works with the CollateX library, extensively customized
    • prepares CollateX to work around markup differences
      • (identifies and unites words split around line-endings in S-GA)
    • identifies what features can be ignored/skipped over for collation purposes
      • (e.g. markup of pagination, line-by-line encoding in S-GA)
    • normalizes: identifies what apparently different features are the same:
      • <milestone type='paragraph'> is the same as <p>
      • "&" is not different from "and"
    • outputs an XML data structure (TEI critical apparatus)
  • Markup of text structures compared across Variorum:  
    • Volume (print editions only), letter, chapter
    • Paragraph, poetry line-groups and lines
    • Notes
  • Markup of manuscript events included in Variorum: Deletions, Insertions, Gaps
  • Normalizing algorithm:
    • Decide what marks are equivalent
    • Ignore, but preserve, other markup in the collation process, as well as abbreviations and capitalization
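
The normalizing rules listed here can be sketched as a small Python function. This is an illustration under stated assumptions, not the project's script: the `normalize` name and the regular expressions are hypothetical, and the real rule set is more extensive.

```python
import re

def normalize(token):
    """Return the comparison form of a token; the original token is preserved elsewhere."""
    t = re.sub(r"<lb\b[^>]*/>", "", token)                                  # ignore line-beginning markers
    t = re.sub(r"<milestone\s+type=['\"]paragraph['\"]\s*/?>", "<p>", t)    # paragraph milestone is the same as <p>
    t = re.sub(r"<(\w+)\s[^>]*?(/?)>", r"<\1\2>", t)                        # drop attributes, keep the tag name
    t = t.strip().lower()                                                   # capitalization is not a variant
    if t in ("&", "&amp;"):                                                 # "&" is not different from "and"
        t = "and"
    return t
```

The collator compares the normalized forms while the output keeps each witness's original string, so ignored features are still available in the apparatus.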

Normalized strings to compare

MS (from Shelley-Godwin Archive):

It was on a dreary night of November that I beheld 
<del>the frame on whic</del> my man 
comple<del>at</del>teed

1818 (from PA Electronic edition)

<p>IT was on a dreary 
night of November, that I beheld
the accomplishment of my toils.</p>

Including markup in the comparison

Manuscript (from Shelley-Godwin Archive):

<lb n="c56-0045__main__2"/>It was on a dreary night of November 
<lb n="c56-0045__main__3"/>that I beheld <del rend="strikethrough" 
xml:id="c56-0045__main__d5e9572">
       <add hand="#pbs" place="superlinear" xml:id="c56-0045__main__d5e9574">the frame on
         whic</add></del> my man comple<del>at</del>
<add place="intralinear" xml:id="c56-0045__main__d5e9582">te</add>
<add xml:id="c56-0045__main__d5e9585">ed</add>

1818 (from PA Electronic edition)

<p xml:id="novel1_letter4_chapter4_div4_div4_p1">I<hi>T</hi> was on a dreary 
night of November, that I beheld the accomplishment of my toils.</p>
  • What matters for meaningful comparison?
    • Text nodes
    • <del> and <p> markup
  • What doesn't matter?
    • <lb/> elements, attribute nodes
    • <hi>? *In real life we include the <hi> elements as meaningful markup because sometimes they are meaningful for emphasis.

Tokenize them!

MS (from Shelley-Godwin Archive):

["It", "was", "on", "a", "dreary", 
"night", "of", "November", "that", 
"I", "beheld", 
"<del>the frame on whic</del>",
"my", "man", 
"comple", "<del>at</del>", "teed"]

1818 (from PA Electronic edition)

["&lt;p&gt;", "IT", "was", "on", "a", "dreary", 
"night", "of", "November,", "that", "I", "beheld",
"the", "accomplishment", "of", "my", "toils.", "&lt;/p&gt;"]

Project decision: Treat a deletion as a complete and indivisible event: a “long token”. This helps to align other witnesses around it.

Nodes on the other side of collation

Real output from the project

(Embedded markup is a little more complicated than our previous example)

<app>
	<rdgGrp n="['that', 'i', 'beheld']">
		<rdg wit="f1818">that I beheld</rdg>
		<rdg wit="f1823">that I beheld</rdg>
		<rdg wit="fThomas">that I beheld</rdg>
		<rdg wit="f1831">that I beheld</rdg>
		<rdg wit="fMS">&lt;lb n="c56-0045__main__3"/&gt;that I beheld</rdg>
	</rdgGrp>
</app>
<app>
	<rdgGrp n="['&lt;del&gt; the frame on whic&lt;/del&gt;',
               'my', 'man', 'comple', 
               '', '&lt;mdel&gt;at&lt;/mdel&gt;', 'te', 'ed', 
               ',', '.', '&lt;del&gt;and&lt;/del&gt;']">
		<rdg wit="fMS">&lt;del rend="strikethrough" 
          xml:id="c56-0045__main__d5e9572"&gt;
			&lt;sga-add hand="#pbs" place="superlinear" 
          sID="c56-0045__main__d5e9574"/&gt;the
	      frame on whic &lt;sga-add eID="c56-0045__main__d5e9574"/&gt; &lt;/del&gt; my man
		  comple &lt;mod sID="c56-0045__main__d5e9578"/&gt; 
          &lt;mdel&gt;at&lt;/mdel&gt;
		  &lt;sga-add place="intralinear" sID="c56-0045__main__d5e9582"/&gt;te
          &lt;sga-add eID="c56-0045__main__d5e9582"/&gt;
          &lt;sga-add sID="c56-0045__main__d5e9585"/&gt;ed
		  &lt;sga-add eID="c56-0045__main__d5e9585"/&gt;
          &lt;mod eID="c56-0045__main__d5e9578"/&gt;
          &lt;sga-add hand="#pbs" place="intralinear"sID="c56-0045__main__d5e9588"/&gt;, 
          &lt;sga-add eID="c56-0045__main__d5e9588"/&gt;.
		  &lt;del rend="strikethrough"
		  xml:id="c56-0045__main__d5e9591"&gt;And&lt;/del&gt;</rdg>
	</rdgGrp>
	<rdgGrp n="['the', 'accomplishment', 'of', 'my', 'toils.']">
		<rdg wit="f1818">the accomplishment of my toils.</rdg>
		<rdg wit="f1823">the accomplishment of my toils.</rdg>
		<rdg wit="fThomas">the accomplishment of my toils.</rdg>
		<rdg wit="f1831">the accomplishment of my toils.</rdg>
	</rdgGrp>
</app>
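
Output of this shape can be read back with standard XML tooling. The sketch below assumes un-namespaced `<app>`, `<rdgGrp>`, and `<rdg>` elements as shown on the slide; the `witnesses_by_group` helper and the shortened sample are hypothetical.

```python
import xml.etree.ElementTree as ET

def witnesses_by_group(app_xml):
    """List the witness sigla in each reading group of one <app> element."""
    app = ET.fromstring(app_xml)
    return [[rdg.get("wit") for rdg in grp.findall("rdg")]
            for grp in app.findall("rdgGrp")]

# Shortened, illustrative sample in the shape of the collation output
sample_app = """<app>
  <rdgGrp n="['my', 'man']"><rdg wit="fMS">my man</rdg></rdgGrp>
  <rdgGrp n="['the', 'accomplishment']">
    <rdg wit="f1818">the accomplishment</rdg>
    <rdg wit="f1831">the accomplishment</rdg>
  </rdgGrp>
</app>"""

groups = witnesses_by_group(sample_app)
is_variant = len(groups) > 1   # more than one reading group means the witnesses diverge here
```

An `<app>` with a single `<rdgGrp>` records agreement after normalization; two or more groups mark a point of variance.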

Background image created by the author from a loom on Reddit and the frontispiece illustration of Frankenstein (1831)

CAUTION: Collation of heavily altered documents leads to many tangles and snags.

Completing this project was not possible without students!

Students led the way!

  • Exploring the development of contextual annotations (Stephen, Jack and Avery at CMU)

  • Helping track the kinds of errors we would find in the collation in our collationWorkspace (Nate and Rachel)

  • Finding algorithmic ways to debug collation tangles (Mia, Jackie, Yuying, Nate, and Rachel)

  • Creating “long tokens” to pull heavily revised passages and long deletions away from the collation machinery! (ask us about this) (Yuying for the win!)

  • Developing and testing our shell script to run our postCollation pipeline (Yuying!)

  • Finalizing the interface in React + Astro (Yuying's senior design project)
     
  • Roll credits: People page on the Variorum website

 

Spine and data coordination

From collation data to spine

 

  • “Spine” = data model (dynamic nerve plexus?) holding the variorum together
    • standoff use of TEI critical apparatus
      • coordinates data on variance, including normalized tokens and maximum edit-distance values 
    • points to specific locations in the variorum edition files
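
The slides do not say how the maximum edit-distance values are computed. As one plausible sketch, assuming Levenshtein distance over the normalized reading strings at a locus (both function names are hypothetical):

```python
from itertools import combinations

def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]

def max_edit_distance(readings):
    """Largest pairwise distance among the distinct normalized readings at one locus."""
    distinct = set(readings)
    if len(distinct) < 2:
        return 0
    return max(levenshtein(a, b) for a, b in combinations(distinct, 2))
```

Storing one such value per locus gives a single number that downstream views can use to rank or filter variant passages.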

The FV Spine: visualized in the interactive heatmap

  • Not every variance is represented here...
  • ...only those that met a simple, limited threshold of variance
  • Intended to be a simple, serendipitous discovery aid...
  • ...and a simple visualization of the Variorum's structure
  • Constructed by XSLT transforming the TEI critical apparatus into SVG (XML).
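
The project builds the heatmap with XSLT. Purely to illustrate the idea in Python, the sketch below filters loci by a threshold and draws one SVG rectangle per remaining locus, shaded by its distance value. The threshold, colour, dimensions, and the `heatmap_svg` name are all assumptions.

```python
from xml.sax.saxutils import quoteattr

def heatmap_svg(loci, threshold=10, cell_height=4, width=40):
    """loci: list of (locus_id, max_edit_distance) pairs in reading order."""
    shown = [(lid, d) for lid, d in loci if d >= threshold]   # skip minor variance
    top = max((d for _, d in shown), default=1)
    rects = []
    for row, (lid, d) in enumerate(shown):
        opacity = round(d / top, 2)                           # heavier revision, stronger colour
        rects.append(
            f'<rect id={quoteattr(lid)} x="0" y="{row * cell_height}" '
            f'width="{width}" height="{cell_height}" fill="red" fill-opacity="{opacity}"/>'
        )
    height = len(shown) * cell_height
    return (f'<svg xmlns="http://www.w3.org/2000/svg" width="{width}" height="{height}">'
            + "".join(rects) + "</svg>")

# Illustrative locus IDs and values only
svg = heatmap_svg([("C24_app15", 120), ("C24_app16", 3), ("C24_app17", 60)])
```

Giving each rectangle the locus ID makes it straightforward to link a click on the heatmap to the matching passage in the viewer.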

Summary of the version history

  • 1816 notebooks to 1818: uneven (gaps in notebooks)
     
  • 1823 edition: Besides adding his daughter's name to the title page, William Godwin makes small edits, usually not substantive.
     
  • Thomas "fork" divergence:
    • copy with margin notes was left in Italy in 1823 when MWS moved back to England
    • edits mark desired passages to delete and significant additions MWS imagined making "if there were to be a new edition"
    • several interesting forks . . .
       
  • 1831 revisions:
    • nothing directly retained from Thomas marginalia
    • alters character relationships in the Frankenstein family, added chapter and several lengthened passages

Variant Passages of Interest

  • A Thomas copy edit of Letter IV at an early moment of intense revision
  • Where the Creature comes to life in MS and Thomas
  • A Thomas copy edit not taken up later
  • Where the MS notebooks begin, just after "Everyone adored Elizabeth. . ."
  • An (in)famous enormous overhaul for 1831
  • A passage marking a journey from the MS to 1831: "I feared the vengeance of the disappointed fiend..." https://frankensteinvariorum.org/viewer/1831/chapter_xviii#C24_app15

Publishing a typical TEI digital scholarly edition, today

Browser, UI & UX

Webapp logic

Database

Server

“not all projects should be maintained in perpetuity. Some are ... not worth the intellectual, technical, and financial overhead of ongoing maintenance.”
(Smithies et al. 2019)

 

Smithies, James, Carina Westling, Anna-Maria Sichani, Pam Mellen, and Arianna Ciula. 2019. “Managing 100 Digital Humanities Projects: Digital Scholarship & Archiving in King’s Digital Lab.” Digital Humanities Quarterly 013 (1).

Those in charge of infrastructure are also determining, particularly in the long term, the scholarly worth of a project, whether it should remain online, and in what form.

Less infrastructure = less expense to maintain

Browser, UI & UX: React

Static site generator: Node and Astro JavaScript

Server: GitHub

With free, open source resources available, we scholars / students may help old projects find new life.

The Variorum in your local web browser

  • Built from free, open-source code
  • Integrates XML and JavaScript in a static, minimal website. (No database!)
     
  • Page routing across 146 chapters among five different editions
     
  • Interactive user interface for
    • selectors
    • hotspots
    • variations

The Frankenstein Variorum Edition Built in TEI

  • Standoff TEI Spine with pointers to <seg> elements in the TEI files
    https://github.com/FrankensteinVariorum/fv-data/tree/master/2023-standoff_Spine
     
  • Variorum edition "chapter-chunk" files (built from an XSLT pipeline from the spine, "stamped" with <seg> elements marking variant locations)
    https://github.com/FrankensteinVariorum/fv-data/tree/master/2023-variorum-chapters
     
  • Publishing a "build" currently triggers a rebuild of the JavaScript process, which outputs the standoff spine as JSON and publishes the TEI "chapter-chunk" files (astro-cetacean)

  • “Endings” compliance for a pointer-based critical edition?
    • Edition framework + data pointers between TEI files in shared space.
    • Publishing is separate from the TEI build process. We can pass the data to another publishing mechanism that can read it.


Link to these slides: https://bit.ly/fv-explore

Please dive in and explore (mobile friendly)!
https://frankensteinvariorum.org/

Thanks for listening! :-)

Presentation on the now complete Frankenstein Variorum project, with emphasis on theory of edition as expressed in the structure and interface.
