Visualizing the Frankenstein Variorum:
Navigating Five Versions of Mary Shelley’s Frankenstein (in 15 minutes!)
For the Penn State Behrend Faculty Showcase
30 October 2024 @ 5:15pm, Reed 114
Link to these slides: https://bit.ly/fv-vis-15



Presentation by Elisa Beshero-Bondar
@epyllia on bsky.social and indieweb.social |
linkedin.com/in/elisa-beshero-bondar/ | @ebeshero on GitHub
Objectives of the Frankenstein Variorum
- to “upcycle” and connect previous digital editions of Frankenstein
- to share a nonlinear, divergent edition history
- to encourage exploration from one edition to the others
Variorum - for tracking and comparing versions
Most immediate context: Darwin Online (ed. Barbara Bordalejo), except...
- Frankenstein Variorum compares five versions (to Darwin Online's six)
- Frankenstein Variorum incorporates MS witnesses
- Frankenstein Variorum integrates earlier digital editions made by others

FV as a Variorum
- Visualizes a collation, or comparison of versions, working with digital editions that were encoded very differently
- Designed as a static website for serendipitous browsing and intensive research
- Applies the TEI in a JavaScript context to store comparison data and pointers to variant passages
Critical and Diplomatic Editions Leading to the Frankenstein Variorum Project
[Timeline figure. Legend: print edition, digital edition. Dates marked on the timeline: 1974, ~mid-1990s, 1996, 2007, 2013, 2017.]
- James Rieger, ed., first new edition of 1818 in 141 years: inline collation of "Thomas" w/ 1818, 1831 variants in endnotes
- Stuart Curran and Jack Lynch: PA Electronic Edition (PAEE), collation of 1818 and 1831: HTML
- Nora Crook crit. ed. of 1818, variants of "Thomas", 1823, and 1831 in endnotes (P&C MWS collected works)
- Romantic Circles TEI conversion of PAEE; separates the texts of 1818 and 1831; collation via Juxta
- Charles Robinson, The Frankenstein Notebooks (Garland): print facsimile of 1816 ms drafts
- Shelley-Godwin Archive publishes diplomatic edition of 1816 ms drafts
- Frankenstein Variorum Project begins: assembly/proof-correcting of PAEE files; OCR/proof-correcting 1823; "bridge" TEI edition of S-GA notebook files; automated collation; incorporating "Thomas" copy text. Collation project completed in 2023; Variorum viewer officially launches in 2024.
FV includes Shelley-Godwin Archive encoding
- S-GA diplomatic edition of the 1816 Notebooks
- encoded surface-by-surface, line-by-line
- required resequencing to collate
Shelley-Godwin Archive: sample page surface:


Shelley-Godwin Archive
sample surface encoding from S-GA
<surface xmlns:mith="http://mith.umd.edu/sc/ns1#" lrx="3847" lry="5342"
partOf="#ox-frankenstein_volume_i" ulx="0" uly="0"
mith:folio="21r" mith:shelfmark="MS. Abinger c. 56"
xml:base="https://raw.githubusercontent.com/
umd-mith/sga/master/data/tei/ox/ox-ms_abinger_c56/ox-ms_abinger_c56-0045.xml"
xml:id="ox-ms_abinger_c56-0045">
<graphic url="http://shelleygodwinarchive.org/images/ox/ms_abinger_c56/ms_abinger_c56-0045.jp2"/>
<zone rend="bordered" type="pagination"><line>75</line></zone>
<zone type="library"><line>21</line></zone>
<!-- lines of text elided here -->
<line>to form. His limbs were in proportion</line>
<line>and I had selected his features <del rend="strikethrough">h</del> as</line>
<line><mod>
<del rend="strikethrough">handsome</del>
<del rend="unmarked">.</del>
<anchor xml:id="c56-0045.01"/>
</mod>
<mod>
<del rend="strikethrough">Handsome</del>
<add hand="#pbs" place="superlinear">Beautiful</add>
</mod>; Great God! His</line>
<!-- at the end of the surface encoding, encoding material in a left-margin zone: -->
<zone corresp="#c56-0045.01" type="left_margin">
<line><add><mod>
<del rend="strikethrough">handsome</del>
<add hand="#pbs" place="superlinear">beautiful.</add>
</mod></add></line>
</zone>
<!-- other marginal insertions encoded -->
</surface>
S-GA: resequenced / compressed for collation
<surface lrx="3847" lry="5342"
partOf="#ox-frankenstein_volume_i"
ulx="0" uly="0" folio="21r" shelfmark="MS. Abinger c. 56" base="ox-ms_abinger_c56/ox-ms_abinger_c56-0045.xml"
id="ox-ms_abinger_c56-0045" sID="ox-ms_abinger_c56-0045"/>
<graphic url="http://shelleygodwinarchive.org/images/ox/ms_abinger_c56/ms_abinger_c56-0045.jp2"/>
<zone type="main" sID="c56-0045__main"/>
<lb n="c56-0045__main__17"/>
<del rend="strikethrough" sID="c56-0045__main__d2e9811"/>But how<del eID="c56-0045__main__d2e9811"/> How can I describe
my <lb n="c56-0045__main__18"/> emotion at this catastrophe; or how
<w ana="start"/>deli<lb n="c56-0045__main__19"/>neate<w ana="end"/>
the wretch whom with such <lb n="c56-0045__main__20"/> infinite pains and care I had endeavoured <lb n="c56-0045__main__21"/> to form. His limbs were in proportion <lb n="c56-0045__main__22"/> and I had selected his features <del rend="strikethrough" sID="c56-0045__main__d2e9830"/>h<del eID="c56-0045__main__d2e9830"/> as <lb n="c56-0045__main__23"/>
<mod sID="c56-0045__main__d2e9835"/>
<del rend="strikethrough" sID="c56-0045__main__d2e9837"/>handsome<del eID="c56-0045__main__d2e9837"/>
<mdel>.</mdel>
<anchor xml:id="c56-0045.01"/>
<zone corresp="#c56-0045.01" type="left_margin" sID="c56-0045__left_margin"/>
<lb n="c56-0045__left_margin__1"/>
<add sID="c56-0045__left_margin__d2e9849"/>
<mod sID="c56-0045__left_margin__d2e9851"/>
<del rend="strikethrough" sID="c56-0045__left_margin__d2e9853"/>handsome<del eID="c56-0045__left_margin__d2e9853"/>
<add hand="#pbs" place="superlinear" sID="c56-0045__left_margin__d2e9856"/>beautiful.<add eID="c56-0045__left_margin__d2e9856"/>
<mod eID="c56-0045__left_margin__d2e9851"/>
<add eID="c56-0045__left_margin__d2e9849"/>
<zone eID="c56-0045__left_margin"/>
<mod eID="c56-0045__main__d2e9835"/>
<mod sID="c56-0045__main__d2e9863"/>
<del rend="strikethrough" sID="c56-0045__main__d2e9865"/>Handsome<del eID="c56-0045__main__d2e9865"/>
<add hand="#pbs" place="superlinear" sID="c56-0045__main__d2e9868"/>Beautiful<add eID="c56-0045__main__d2e9868"/>
<mod eID="c56-0045__main__d2e9863"/>; Great God! His <lb n="c56-0045__main__24"/>
- added word boundary markup to indicate whole words spanning lines
- resequenced margin zone content: (followed S-GA's pointers to represent semantic reading order for collation)
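The "compressed" encoding above replaces container elements with paired empty markers carrying sID and eID attributes. A minimal Python sketch of that kind of flattening follows. It is an illustration only, not the project's actual transformation: the function name `flatten`, the counter-based marker ids, and the un-namespaced sample input are all assumptions made for this example.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape, quoteattr


def flatten(elem, prefix, counter):
    """Return the content of `elem` as a string in which every descendant
    element that has content is replaced by a pair of empty markers:
    <tag ... sID="..."/> at its start and <tag eID="..."/> at its end.
    Marker ids are built from `prefix` and a running counter (hypothetical
    scheme, chosen only for this sketch)."""
    parts = [escape(elem.text or "")]
    for child in elem:
        attrs = "".join(f" {k}={quoteattr(v)}" for k, v in child.attrib.items())
        if len(child) == 0 and not child.text:
            # Already an empty element (for example an anchor): keep as is.
            parts.append(f"<{child.tag}{attrs}/>")
        else:
            counter[0] += 1
            marker = f"{prefix}__{counter[0]}"
            parts.append(f"<{child.tag}{attrs} sID={quoteattr(marker)}/>")
            parts.append(flatten(child, prefix, counter))
            parts.append(f"<{child.tag} eID={quoteattr(marker)}/>")
        parts.append(escape(child.tail or ""))
    return "".join(parts)


sample = '<line>and I had selected his features <del rend="strikethrough">h</del> as</line>'
flat = flatten(ET.fromstring(sample), "c56-0045__main", [0])
print(flat)
```

Because the markers are empty elements, the flattened text can be cut at any point (for example at a collation boundary) without breaking well-formedness.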

Preparing for collation



Collating when the editions are so different (1)
Align and “chunk”
- Best not to collate entire novel files at once: long inputs invite severe alignment errors!
- We prepared 33 collation units (or "chunk files") sharing common starting and ending points.
- Edition files of the same chunk are collated together.
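As a rough sketch of the chunking idea, the Python below splits each edition at shared boundary markers and then groups the pieces by chunk id. The marker syntax (`<anchor type="collate" xml:id="C01"/>`), the function names, and the sample strings are assumptions made for illustration, not a description of the project's files.

```python
import re

# Hypothetical boundary marker placed at the same points in every edition.
MARKER = re.compile(r'<anchor type="collate" xml:id="(C\d+)"/>')


def chunk(text):
    """Split one edition into {chunk id: chunk text}."""
    parts = MARKER.split(text)
    # parts looks like [before, id1, body1, id2, body2, ...]
    return {parts[i]: parts[i + 1].strip() for i in range(1, len(parts), 2)}


def shared_chunks(editions):
    """Group chunk texts by chunk id across editions.
    An edition lacking a chunk (a gap) is simply absent from that group."""
    chunked = {name: chunk(text) for name, text in editions.items()}
    ids = sorted(set().union(*[c.keys() for c in chunked.values()]))
    return {cid: {name: c[cid] for name, c in chunked.items() if cid in c}
            for cid in ids}


editions = {
    "f1818": '<anchor type="collate" xml:id="C01"/>Letter one '
             '<anchor type="collate" xml:id="C02"/>Letter two',
    "fMS": '<anchor type="collate" xml:id="C02"/>Letter 2',
}
units = shared_chunks(editions)
print(units)
```

Each value in `units` is one collation unit: the texts that would be compared together, with missing witnesses left out.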


How do the five editions “stack up” by collation chunk?
[Chart: gaps, alignments, and relative string length for each “chunk”. Legend: MS, 1818, Thm, 1823, 1831]
Collating when the editions are so different (2)
Prescribe rules to direct the machine-assisted collation
- Our Python collation script
- works with the collateX library, extensively customized
- Prepares collateX to work around markup differences
- (identify and unite words split around line endings in S-GA)
- to identify what features can be ignored/skipped over for collation purposes
- (e.g. markup of pagination, line-by-line encoding in S-GA)
- to normalize: identify what apparently different features are the same:
- <milestone type='paragraph'> is the same as <p>
- "&" is not different from "and"
- Prescribes output in the form of a TEI critical apparatus:
- coordinates information on which editions align and what normalized tokens/strings they share at each instance of variation
- (See Parallel Segmentation encoding in TEI Guidelines)
- Markup of text structure compared across Variorum:
- Volume (print editions only), letter, chapter
- Paragraph, poetry line-groups and lines
- Notes
- Markup of manuscript events included in Variorum comparison: deletion, insertion, gap
- Normalizing algorithm:
- Decide what marks are equivalent
- Ignore, but preserve, other markup in the collation process; likewise abbreviations and capitalization.
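The normalizing rules listed above can be sketched as a small Python function. This is a simplified illustration under stated assumptions: the function name `normalize` and the regular expressions are invented for this example, and the project's customized script handles many more cases.

```python
import re


def normalize(token: str) -> str:
    """Reduce one token to the form used for comparison (illustrative rules only)."""
    # Treat a paragraph milestone as equivalent to a <p> start tag.
    token = re.sub(r"<milestone\b[^>]*type=['\"]paragraph['\"][^>]*>", "<p>", token)
    # Skip line-break markers entirely.
    token = re.sub(r"<lb\b[^>]*>", "", token)
    # Keep element names but drop their attributes.
    token = re.sub(r"<(/?[\w:.-]+)[^>]*?(/?)>", r"<\1\2>", token)
    # A free-standing ampersand counts as "and".
    token = re.sub(r"(?<!\S)(?:&amp;|&)(?!\S)", "and", token)
    # Ignore capitalization and surrounding whitespace.
    return token.lower().strip()


print(normalize("<milestone type='paragraph'/>"))
print(normalize('<lb n="c56-0045__main__3"/>that'))
print(normalize('<del rend="strikethrough">Handsome</del>'))
```

The original token is kept alongside its normalized form, so the markup that was ignored for comparison can still be written into the output.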
Normalized strings to compare
MS (from Shelley-Godwin Archive):
It was on a dreary night of November that I beheld
<del>the frame on whic</del> my man
comple<del>at</del>teed
1818 (from PA Electronic edition)
<p>IT was on a dreary
night of November, that I beheld
the accomplishment of my toils.</p>
Including markup in the comparison
Manuscript (from Shelley-Godwin Archive):
<lb n="c56-0045__main__2"/>It was on a dreary night of November
<lb n="c56-0045__main__3"/>that I beheld <del rend="strikethrough"
xml:id="c56-0045__main__d5e9572">
<add hand="#pbs" place="superlinear" xml:id="c56-0045__main__d5e9574">the frame on
whic</add></del> my man comple<del>at</del>
<add place="intralinear" xml:id="c56-0045__main__d5e9582">te</add>
<add xml:id="c56-0045__main__d5e9585">ed</add>
1818 (from PA Electronic edition)
<p xml:id="novel1_letter4_chapter4_div4_div4_p1">I<hi>T</hi> was on a dreary
night of November, that I beheld the accomplishment of my toils.</p>
- What matters for meaningful comparison?
- Text nodes
- <del> and <p> markup
- What doesn't matter?
- <lb/> elements, attribute nodes
- <hi>? *In real life we include the <hi> elements as meaningful markup because sometimes they are meaningful for emphasis.
Tokenize them!
MS (from Shelley-Godwin Archive):
["It", "was", "on", "a", "dreary",
"night", "of", "November", "that",
"I", "beheld",
"<del>the frame on whic</del>",
"my", "man",
"comple", "<del>at</del>", "teed"]
1818 (from PA Electronic edition)
["<p>", "IT", "was", "on", "a", "dreary",
"night", "of", "November,", "that", "I", "beheld",
"the", "accomplishment", "of", "my", "toils.", "</p>"]
Project decision: Treat a deletion as a complete and indivisible event: a “long token”. This helps to align other witnesses around it.
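A minimal Python sketch of this tokenizing rule follows: a whole deletion becomes one token, any other tag is its own token, and the remaining text is split on whitespace. The regular expression and the name `tokenize` are assumptions for illustration; the sketch does not handle nested deletions or the flattened sID/eID markers.

```python
import re

# Order matters: try a complete <del>...</del> first, then any single tag,
# then a run of non-space, non-markup characters.
TOKEN = re.compile(r"<del\b[^>]*>.*?</del>|<[^>]+>|[^\s<]+", re.DOTALL)


def tokenize(text):
    """Split a simplified edition string into comparison tokens."""
    return TOKEN.findall(text)


ms = ("It was on a dreary night of November that I beheld "
      "<del>the frame on whic</del> my man comple<del>at</del>teed")
f1818 = ("<p>IT was on a dreary night of November, that I beheld "
         "the accomplishment of my toils.</p>")

ms_tokens = tokenize(ms)
f1818_tokens = tokenize(f1818)
print(ms_tokens)
print(f1818_tokens)
```

Keeping the deletion whole means the aligner sees a single unit in the manuscript witness, not a run of short words that might be matched against unrelated words in the print editions.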
Nodes on the other side of collation
Real output from the project
(Embedded markup is a little more complicated than our previous example)
<app>
<rdgGrp n="['that', 'i', 'beheld']">
<rdg wit="f1818">that I beheld</rdg>
<rdg wit="f1823">that I beheld</rdg>
<rdg wit="fThomas">that I beheld</rdg>
<rdg wit="f1831">that I beheld</rdg>
<rdg wit="fMS"><lb n="c56-0045__main__3"/>that I beheld</rdg>
</rdgGrp>
</app>
<app>
<rdgGrp n="['<del> the frame on whic</del>',
'my', 'man', 'comple',
'', '<mdel>at</mdel>', 'te', 'ed',
',', '.', '<del>and</del>']">
<rdg wit="fMS"><del rend="strikethrough"
xml:id="c56-0045__main__d5e9572">
<sga-add hand="#pbs" place="superlinear"
sID="c56-0045__main__d5e9574"/>the
frame on whic <sga-add eID="c56-0045__main__d5e9574"/> </del> my man
comple <mod sID="c56-0045__main__d5e9578"/>
<mdel>at</mdel>
<sga-add place="intralinear" sID="c56-0045__main__d5e9582"/>te
<sga-add eID="c56-0045__main__d5e9582"/>
<sga-add sID="c56-0045__main__d5e9585"/>ed
<sga-add eID="c56-0045__main__d5e9585"/>
<mod eID="c56-0045__main__d5e9578"/>
<sga-add hand="#pbs" place="intralinear" sID="c56-0045__main__d5e9588"/>,
<sga-add eID="c56-0045__main__d5e9588"/>.
<del rend="strikethrough"
xml:id="c56-0045__main__d5e9591">And</del></rdg>
</rdgGrp>
<rdgGrp n="['the', 'accomplishment', 'of', 'my', 'toils.']">
<rdg wit="f1818">the accomplishment of my toils.</rdg>
<rdg wit="f1823">the accomplishment of my toils.</rdg>
<rdg wit="fThomas">the accomplishment of my toils.</rdg>
<rdg wit="f1831">the accomplishment of my toils.</rdg>
</rdgGrp>
</app>
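To show how such apparatus entries can be read back, here is a short Python sketch that lists which witnesses share each reading group and flags an entry as a point of variation when it has more than one group. The function names are invented for this example, and the sample is a simplified, well-formed version of the output above (no embedded markup inside the n attribute).

```python
import xml.etree.ElementTree as ET


def reading_groups(app_xml):
    """Return [(normalized tokens, [witness ids])] for one <app> element."""
    app = ET.fromstring(app_xml)
    return [(grp.get("n"), [rdg.get("wit") for rdg in grp.findall("rdg")])
            for grp in app.findall("rdgGrp")]


def is_variant(app_xml):
    """An <app> with more than one reading group marks a point of variation."""
    return len(reading_groups(app_xml)) > 1


agree = """<app>
  <rdgGrp n="['that', 'i', 'beheld']">
    <rdg wit="f1818">that I beheld</rdg>
    <rdg wit="fMS"><lb n="c56-0045__main__3"/>that I beheld</rdg>
  </rdgGrp>
</app>"""

differ = """<app>
  <rdgGrp n="['my', 'man', 'comple', 'te', 'ed']">
    <rdg wit="fMS">my man completeed</rdg>
  </rdgGrp>
  <rdgGrp n="['the', 'accomplishment', 'of', 'my', 'toils.']">
    <rdg wit="f1818">the accomplishment of my toils.</rdg>
    <rdg wit="f1831">the accomplishment of my toils.</rdg>
  </rdgGrp>
</app>"""

print(reading_groups(agree), is_variant(agree))
print(reading_groups(differ), is_variant(differ))
```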
Background image created by the author from a loom on Reddit and the frontispiece illustration of Frankenstein (1831)
CAUTION: Collation of heavily altered documents leads to many tangles and snags.
Completing this project was not possible without students!
Students led the way!
- Exploring the development of contextual annotations (Stephen, Jack and Avery at CMU)
- Helping track the kinds of errors we would find in the collation in our collationWorkspace (Nate and Rachel)
- Finding algorithmic ways to debug collation tangles (Mia, Jackie, Yuying, Nate, and Rachel)
- Creating “long tokens” to pull heavily revised passages and long deletions away from the collation machinery! (ask us about this) (Yuying for the win!)
- Developing and testing our shell script to run our postCollation pipeline (Yuying!)
- Finalizing the interface in React + Astro (Yuying's senior design project)
- Roll credits: People page on the Variorum website
- Collation projects take much longer to debug than you ever expected
- Correct the input machinery, not the output.
- Minimize brittle hand-correction!
- Work on the pre-processing.
- Refine post-processing to correct output errors!
- Machine-assisted processes need a lot of documentation
- for project sustainability
- for reproducibility of data

- Look for ways to involve students!
- especially undergrads unfamiliar with the tech
- forces clear communication from everyone!
- best way to simplify overly complicated processes
- major skill building for all!

Summary of the version history
- 1816 notebooks to 1818: uneven (gaps in notebooks)
- 1823 edition: Besides adding his daughter's name to the title page, William Godwin makes small edits, usually not substantive.
- Thomas "fork" divergence:
- copy with margin notes was left in Italy in 1823 when MWS moved back to England
- edits mark desired passages to delete and significant additions MWS imagined making "if there were to be a new edition"
- several interesting forks . . .
- 1831 revisions:
- nothing directly retained from Thomas marginalia
- alters character relationships in the Frankenstein family; adds a chapter and several lengthened passages
Variant Passages of Interest
- A Thomas copy edit of Letter IV at an early moment of intense revision
- Where the Creature comes to life in MS and Thomas
- A Thomas copy edit not taken up later
- Where the MS notebooks begin, just after "Everyone adored Elizabeth. . ."
- An (in)famous enormous overhaul for 1831
- A passage marking a journey from the MS to 1831: "I feared the vengeance of the disappointed fiend..." https://frankensteinvariorum.org/viewer/1831/chapter_xviii#C24_app15
Spine and data coordination
From collation data to spine
- “Spine” = data model (dynamic nerve plexus?) holding the variorum together
- standoff use of TEI critical apparatus
- coordinates data on variance, including normalized tokens and maximum edit-distance values
- points to specific locations in the variorum edition files
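One way to picture a "maximum edit-distance value" is as the largest Levenshtein distance between any two distinct readings at one point of variation. The Python sketch below works under that assumption; the slides do not specify how the project computes its values, and the function names here are invented for illustration.

```python
from itertools import combinations


def levenshtein(a, b):
    """Classic dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def max_edit_distance(readings):
    """Largest pairwise distance among the normalized readings of one
    apparatus entry; 0 when there is nothing to compare."""
    return max((levenshtein(a, b) for a, b in combinations(readings, 2)),
               default=0)


print(max_edit_distance(["abc", "abd", "xbd"]))
```

A value like this could drive a display choice, for example shading heavily divergent passages more strongly than small spelling differences.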

